🧨 Diffusers now uses 🤗 PEFT, new tuning methods, better quantization support, higher flexibility and more
Highlights
Integration with diffusers
🧨 Diffusers now leverages PEFT as a backend for LoRA inference with Stable Diffusion models (#873, #993, #961). The relevant PRs on 🧨 Diffusers are huggingface/diffusers#5058, huggingface/diffusers#5147, huggingface/diffusers#5151 and huggingface/diffusers#5359. This unlocks a large number of practically important use cases around adapter-based inference 🚀. With easy-to-use APIs that support different checkpoint formats (Diffusers format, Kohya format, ...), you can now:
- use multiple LoRAs
- switch between them instantaneously
- scale and combine them
- merge/unmerge
- enable/disable
For details, refer to the documentation at Inference with PEFT.
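As a rough end-to-end sketch (the LoRA repository names below are placeholders, and the exact behavior depends on your 🧨 Diffusers version), the new workflow looks roughly like this:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load two LoRAs (Diffusers or Kohya format) under distinct adapter names.
# The repository names here are placeholders.
pipe.load_lora_weights("your-hub-user/lora-style", adapter_name="style")
pipe.load_lora_weights("your-hub-user/lora-subject", adapter_name="subject")

# Combine and scale both adapters at once ...
pipe.set_adapters(["style", "subject"], adapter_weights=[0.8, 1.0])

# ... or switch instantaneously to a single one.
pipe.set_adapters("style")

# Enable / disable all LoRAs without unloading them.
pipe.disable_lora()
pipe.enable_lora()

# Merge the active adapters into the base weights, and undo the merge.
pipe.fuse_lora()
pipe.unfuse_lora()

image = pipe("a photo of an astronaut riding a horse").images[0]
```

`set_adapters` covers both switching and per-adapter scaling, while `fuse_lora` / `unfuse_lora` cover the merge/unmerge use case.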
New tuning methods
- Multitask Prompt Tuning: Thanks @mayank31398 for implementing this method from https://arxiv.org/abs/2303.02861 (#400)
- LoHa (low-rank Hadamard product): @kovalexal did a great job adding LoHa from https://arxiv.org/abs/2108.06098 (#956)
- LoKr (Kronecker Adapter): Not happy with just one new adapter, @kovalexal also added LoKr from https://arxiv.org/abs/2212.10650 to PEFT (#978); see the usage sketch after this list
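As a minimal sketch of how the new adapters plug into the familiar PEFT workflow (the parameter names `r`, `alpha` and `target_modules` are assumed to mirror the LoRA-style API; consult the `LoHaConfig` docstring for the exact arguments):

```python
from transformers import AutoModelForCausalLM
from peft import LoHaConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Low-rank Hadamard product (LoHa) adapter applied to the attention projections.
config = LoHaConfig(r=8, alpha=8, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, config)
model.print_trainable_parameters()
```

`LoKrConfig` and `MultitaskPromptTuningConfig` are used the same way via `get_peft_model`, each with its own configuration options.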
Other notable additions
- Allow merging of LoRA weights when using 4bit and 8bit quantization (bitsandbytes), thanks to @jiqing-feng (#851, #875)
- IA³ now supports 4bit quantization thanks to @His-Wardship (#864)
- We increased the speed of adapter layer initialization; this should be most noticeable when creating a PEFT LoRA model on top of a large base model (#887, #915, #994)
- More fine-grained control when configuring LoRA: it is now possible to use different ranks and alpha values for different layers (#873); see the sketch below
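For instance, a hedged sketch of the per-layer rank/alpha configuration (the layer names used in `rank_pattern` / `alpha_pattern` below are illustrative; these mappings override the default `r` and `lora_alpha` for matching layers):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = LoraConfig(
    r=8,              # default rank
    lora_alpha=16,    # default alpha
    target_modules=["q_proj", "v_proj"],
    # Per-layer overrides: keys are layer-name patterns, values replace
    # the defaults above for the matching layers.
    rank_pattern={"model.decoder.layers.0.self_attn.q_proj": 32},
    alpha_pattern={"model.decoder.layers.0.self_attn.q_proj": 64},
)
model = get_peft_model(base, config)
```

With a bitsandbytes 4-bit or 8-bit base model, `model.merge_and_unload()` can now be used as well; note that merging into quantized weights may introduce small rounding differences.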
Experimental features
- For some adapters like LoRA, it is now possible to activate multiple adapters at the same time (#873)
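A rough sketch, assuming the tuner-level `set_adapter` accepts a list of adapter names (the feature is experimental, so the exact entry point may change):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

model = get_peft_model(base, LoraConfig(r=8), adapter_name="adapter_a")
model.add_adapter("adapter_b", LoraConfig(r=16))

# Activate both LoRA adapters at the same time (experimental); here we
# assume the tuner-level set_adapter accepts a list of names.
model.base_model.set_adapter(["adapter_a", "adapter_b"])
```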
Breaking changes
- It is no longer allowed to create a LoRA adapter with rank 0 (`r=0`). This used to be possible, in which case the adapter was ignored.
What's Changed
As always, a number of smaller improvements, bug fixes, and documentation updates were added. We thank all the external contributors, both new and recurring. Below is the list of all changes since the last release.
- Fixed typos in custom_models.mdx by @Psancs05 in #847
- Release version 0.6.0.dev0 by @pacman100 in #849
- DOC: Add a contribution guide by @BenjaminBossan in #848
- clarify the new model size by @stas00 in #839
- DOC: Remove backlog section from README.md by @BenjaminBossan in #853
- MNT: Refactor tuner forward methods for simplicity by @BenjaminBossan in #833
- 🎉 Add Multitask Prompt Tuning by @mayank31398 in #400
- Fix typos in ia3.py by @metaprotium in #844
- Support merge lora module for 4bit and 8bit linear by @jiqing-feng in #851
- Fix seq2seq prompt tuning (#439) by @glerzing in #809
- MNT: Move tuners to subpackages by @BenjaminBossan in #807
- FIX: Error in forward of 4bit linear lora layer by @BenjaminBossan in #878
- MNT: Run tests that were skipped previously by @BenjaminBossan in #884
- FIX: PeftModel save_pretrained Doc (#881) by @houx15 in #888
- Upgrade docker actions to higher versions by @younesbelkada in #889
- Fix error using deepspeed zero2 + load_in_8bit + lora by @tmm1 in #874
- Fix doc for semantic_segmentation_lora by @raghavanone in #891
- fix_gradient_accumulation_steps_in_examples by @zspo in #898
- FIX: linting issue in example by @BenjaminBossan in #908
- ENH Remove redundant initialization layer calls by @BenjaminBossan in #887
- [docs] Remove duplicate section by @stevhliu in #911
- support prefix tuning for starcoder models by @pacman100 in #913
- Merge lora module to 8bit model by @jiqing-feng in #875
- DOC: Section on common issues encountered with PEFT by @BenjaminBossan in #909
- Enh speed up init emb conv2d by @BenjaminBossan in #915
- Make base_model.peft_config single source of truth by @BenjaminBossan in #921
- Update accelerate dependency version by @rohithkrn in #892
- fix lora layer init by @SunMarc in #928
- Fixed LoRA conversion for kohya_ss by @kovalexal in #916
- [`CI`] Pin diffusers by @younesbelkada in #936
- [`LoRA`] Add scale_layer / unscale_layer by @younesbelkada in #935
- TST: Add GH action to run unit tests with torch.compile by @BenjaminBossan in #943
- FIX: torch compile gh action installs pytest by @BenjaminBossan in #944
- Fix NotImplementedError for no bias. by @Datta0 in #946
- TST: Fix some tests that would fail with torch.compile by @BenjaminBossan in #949
- ENH Allow compile GH action to run on torch nightly by @BenjaminBossan in #952
- Install correct PyTorch nightly in GH action by @BenjaminBossan in #954
- support multiple ranks and alphas for LoRA by @pacman100 in #873
- feat: add type hints by @SauravMaheshkar in #858
- FIX: setting requires_grad on adapter layers by @BenjaminBossan in #905
- [`tests`] add transformers & diffusers integration tests by @younesbelkada in #962
- Fix integrations_tests.yml by @younesbelkada in #965
- Add 4-bit support to IA3 - Outperforms QLoRA in both speed and memory consumption by @His-Wardship in #864
- Update integrations_tests.yml by @younesbelkada in #966
- add the lora target modules for Mistral Models by @pacman100 in #974
- TST: Fix broken save_pretrained tests by @BenjaminBossan in #969
- [tests] add multiple active adapters tests by @pacman100 in #961
- Fix missing tokenizer attribute in test by @BenjaminBossan in #977
- Add implementation of LyCORIS LoHa (FedPara-like adapter) for SD&SDXL models by @kovalexal in #956
- update BibTeX by @pacman100 in #989
- FIX: issues with (un)merging multiple LoRA and IA³ adapters by @BenjaminBossan in #976
- add lora target modules for stablelm models by @kbulutozler in #982
- Correct minor errors in example notebooks for causal language modelling by @SumanthRH in #926
- Fix typo in custom_models.mdx by @Pairshoe in #964
- Add base model metadata to model card by @BenjaminBossan in #975
- MNT Make .merged a property by @BenjaminBossan in #979
- Fix lora creation by @pacman100 in #993
- TST: Comment out flaky LoHA test by @BenjaminBossan in #1002
- ENH Support Conv2d layers for IA³ by @BenjaminBossan in #972
- Fix word_embeddings match for deepspeed wrapped model by @mayank31398 in #1000
- FEAT: Add `safe_merge` option in `merge` by @younesbelkada in #1001
- [`core` / `LoRA`] Add `safe_merge` to bnb layers by @younesbelkada in #1009
- ENH: Refactor LoRA bnb layers for faster initialization by @BenjaminBossan in #994
- FIX Don't assume model_config contains the key model_type by @BenjaminBossan in #1012
- FIX stale.py uses timezone-aware datetime by @BenjaminBossan in #1016
- FEAT: Add fp16 + cpu merge support by @younesbelkada in #1017
- fix lora scaling and unscaling by @pacman100 in #1027
- [`LoRA`] Revert original behavior for scale / unscale by @younesbelkada in #1029
- [`LoRA`] Raise error when adapter name not found in `set_scale` by @younesbelkada in #1034
- Fix target_modules type in config.from_pretrained by @BenjaminBossan in #1046
- docs(README): bit misspell current path link StackLLaMa by @guspan-tanadi in #1047
- Fixed wrong construction of LoHa weights, updated adapters conversion script by @kovalexal in #1021
- Fix P-tuning for sequence classification docs by @ehcalabres in #1049
- FIX: Setting active adapter correctly by @BenjaminBossan in #1051
- Fix Conv1D merge error for IA3 by @SumanthRH in #1014
- Add implementation of LyCORIS LoKr (KronA-like adapter) for SD&SDXL models by @kovalexal in #978
- [`core`] Fix `use_reentrant` issues by @younesbelkada in #1036
- [`tests`] Update Dockerfile to use cuda 12.2 by @younesbelkada in #1050
- Add testing for regex matching and other custom kwargs by @SumanthRH in #1031
- Fix Slack bot not displaying error messages by @younesbelkada in #1068
- Fix slow tests not running by @younesbelkada in #1071
- Release version 0.6.0 by @BenjaminBossan in #1072
New Contributors
- @Psancs05 made their first contribution in #847
- @metaprotium made their first contribution in #844
- @jiqing-feng made their first contribution in #851
- @houx15 made their first contribution in #888
- @tmm1 made their first contribution in #874
- @raghavanone made their first contribution in #891
- @zspo made their first contribution in #898
- @rohithkrn made their first contribution in #892
- @Datta0 made their first contribution in #946
- @kbulutozler made their first contribution in #982
- @Pairshoe made their first contribution in #964
- @ehcalabres made their first contribution in #1049
Full Changelog: v0.5.0...v0.6.0