🧨 Diffusers now uses 🤗 PEFT, new tuning methods, better quantization support, higher flexibility and more
Highlights
Integration with diffusers
🧨 Diffusers now leverages PEFT as a backend for LoRA inference with Stable Diffusion models (#873, #993, #961). The relevant PRs on 🧨 Diffusers are huggingface/diffusers#5058, huggingface/diffusers#5147, huggingface/diffusers#5151 and huggingface/diffusers#5359. This unlocks a large number of practically important use cases around adapter-based inference 🚀. With easy-to-use APIs that support different checkpoint formats (Diffusers format, Kohya format, ...), you can now:
- use multiple LoRAs
- switch between them instantaneously
- scale and combine them
- merge/unmerge
- enable/disable
For details, refer to the documentation at Inference with PEFT.
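As a rough end-to-end sketch (the LoRA repository names below are placeholders, and the exact behavior depends on your 🧨 Diffusers version), the new workflow looks roughly like this:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Load two LoRAs (Diffusers or Kohya format) under distinct adapter names.
# The repository names here are placeholders.
pipe.load_lora_weights("your-hub-user/lora-style", adapter_name="style")
pipe.load_lora_weights("your-hub-user/lora-subject", adapter_name="subject")

# Combine and scale both adapters at once ...
pipe.set_adapters(["style", "subject"], adapter_weights=[0.8, 1.0])

# ... or switch instantaneously to a single one.
pipe.set_adapters("style")

# Enable / disable all LoRAs without unloading them.
pipe.disable_lora()
pipe.enable_lora()

# Merge the active adapters into the base weights, and undo the merge.
pipe.fuse_lora()
pipe.unfuse_lora()

image = pipe("a photo of an astronaut riding a horse").images[0]
```

`set_adapters` covers both switching and per-adapter scaling, while `fuse_lora` / `unfuse_lora` cover the merge/unmerge use case.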
New tuning methods
- Multitask Prompt Tuning: Thanks @mayank31398 for implementing this method from https://arxiv.org/abs/2303.02861 (#400)
- LoHa (low-rank Hadamard product): @kovalexal did a great job adding LoHa from https://arxiv.org/abs/2108.06098 (#956)
- LoKr (Kronecker Adapter): Not happy with just one new adapter, @kovalexal also added LoKr from https://arxiv.org/abs/2212.10650 to PEFT (#978); see the usage sketch after this list
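As a minimal sketch of how the new adapters plug into the familiar PEFT workflow (the parameter names `r`, `alpha` and `target_modules` are assumed to mirror the LoRA-style API; consult the `LoHaConfig` docstring for the exact arguments):

```python
from transformers import AutoModelForCausalLM
from peft import LoHaConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

# Low-rank Hadamard product (LoHa) adapter applied to the attention projections.
config = LoHaConfig(r=8, alpha=8, target_modules=["q_proj", "v_proj"])
model = get_peft_model(base, config)
model.print_trainable_parameters()
```

`LoKrConfig` and `MultitaskPromptTuningConfig` are used the same way via `get_peft_model`, each with its own configuration options.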
Other notable additions
- Allow merging of LoRA weights when using 4bit and 8bit quantization (bitsandbytes), thanks to @jiqing-feng (#851, #875)
- IA³ now supports 4bit quantization thanks to @His-Wardship (#864)
- We increased the speed of adapter layer initialization; this should be most noticeable when creating a PEFT LoRA model on top of a large base model (#887, #915, #994)
- More fine-grained control when configuring LoRA: it is now possible to use different ranks and alpha values for different layers (#873); see the sketch below
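For instance, a hedged sketch of the per-layer rank/alpha configuration (the layer names used in `rank_pattern` / `alpha_pattern` below are illustrative; these mappings override the default `r` and `lora_alpha` for matching layers):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

config = LoraConfig(
    r=8,              # default rank
    lora_alpha=16,    # default alpha
    target_modules=["q_proj", "v_proj"],
    # Per-layer overrides: keys are layer-name patterns, values replace
    # the defaults above for the matching layers.
    rank_pattern={"model.decoder.layers.0.self_attn.q_proj": 32},
    alpha_pattern={"model.decoder.layers.0.self_attn.q_proj": 64},
)
model = get_peft_model(base, config)
```

With a bitsandbytes 4-bit or 8-bit base model, `model.merge_and_unload()` can now be used as well; note that merging into quantized weights may introduce small rounding differences.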
Experimental features
- For some adapters like LoRA, it is now possible to activate multiple adapters at the same time (#873)
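A rough sketch, assuming the tuner-level `set_adapter` accepts a list of adapter names (the feature is experimental, so the exact entry point may change):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

model = get_peft_model(base, LoraConfig(r=8), adapter_name="adapter_a")
model.add_adapter("adapter_b", LoraConfig(r=16))

# Activate both LoRA adapters at the same time (experimental); here we
# assume the tuner-level set_adapter accepts a list of names.
model.base_model.set_adapter(["adapter_a", "adapter_b"])
```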
Breaking changes
- It is no longer allowed to create a LoRA adapter with rank 0 (`r=0`). This used to be possible, in which case the adapter was ignored.
What's Changed
As always, a number of smaller improvements, bug fixes, and documentation updates were added. We thank all the external contributors, both new and recurring. Below is the list of all changes since the last release.
- Fixed typos in custom_models.mdx by @Psancs05 in #847
- Release version 0.6.0.dev0 by @pacman100 in #849
- DOC: Add a contribution guide by @BenjaminBossan in #848
- clarify the new model size by @stas00 in #839
- DOC: Remove backlog section from README.md by @BenjaminBossan in #853
- MNT: Refactor tuner forward methods for simplicity by @BenjaminBossan in #833
- 🎉 Add Multitask Prompt Tuning by @mayank31398 in #400
- Fix typos in ia3.py by @metaprotium in #844
- Support merge lora module for 4bit and 8bit linear by @jiqing-feng in #851
- Fix seq2seq prompt tuning (#439) by @glerzing in #809
- MNT: Move tuners to subpackages by @BenjaminBossan in #807
- FIX: Error in forward of 4bit linear lora layer by @BenjaminBossan in #878
- MNT: Run tests that were skipped previously by @BenjaminBossan in #884
- FIX: PeftModel save_pretrained Doc (#881) by @houx15 in #888
- Upgrade docker actions to higher versions by @younesbelkada in #889
- Fix error using deepspeed zero2 + load_in_8bit + lora by @tmm1 in #874
- Fix doc for semantic_segmentation_lora by @raghavanone in #891
- fix_gradient_accumulation_steps_in_examples by @zspo in #898
- FIX: linting issue in example by @BenjaminBossan in #908
- ENH Remove redundant initialization layer calls by @BenjaminBossan in #887
- [docs] Remove duplicate section by @stevhliu in #911
- support prefix tuning for starcoder models by @pacman100 in #913
- Merge lora module to 8bit model by @jiqing-feng in #875
- DOC: Section on common issues encountered with PEFT by @BenjaminBossan in #909
- Enh speed up init emb conv2d by @BenjaminBossan in #915
- Make base_model.peft_config single source of truth by @BenjaminBossan in #921
- Update accelerate dependency version by @rohithkrn in #892
- fix lora layer init by @SunMarc in #928
- Fixed LoRA conversion for kohya_ss by @kovalexal in #916
- [`CI`] Pin diffusers by @younesbelkada in #936
- [`LoRA`] Add scale_layer / unscale_layer by @younesbelkada in #935
- TST: Add GH action to run unit tests with torch.compile by @BenjaminBossan in #943
- FIX: torch compile gh action installs pytest by @BenjaminBossan in #944
- Fix NotImplementedError for no bias. by @Datta0 in #946
- TST: Fix some tests that would fail with torch.compile by @BenjaminBossan in #949
- ENH Allow compile GH action to run on torch nightly by @BenjaminBossan in #952
- Install correct PyTorch nightly in GH action by @BenjaminBossan in #954
- support multiple ranks and alphas for LoRA by @pacman100 in #873
- feat: add type hints by @SauravMaheshkar in #858
- FIX: setting requires_grad on adapter layers by @BenjaminBossan in #905
- [`tests`] add transformers & diffusers integration tests by @younesbelkada in #962
- Fix integrations_tests.yml by @younesbelkada in #965
- Add 4-bit support to IA3 - Outperforms QLoRA in both speed and memory consumption by @His-Wardship in #864
- Update integrations_tests.yml by @younesbelkada in #966
- add the lora target modules for Mistral Models by @pacman100 in #974
- TST: Fix broken save_pretrained tests by @BenjaminBossan in #969
- [tests] add multiple active adapters tests by @pacman100 in #961
- Fix missing tokenizer attribute in test by @BenjaminBossan in #977
- Add implementation of LyCORIS LoHa (FedPara-like adapter) for SD&SDXL models by @kovalexal in #956
- update BibTeX by @pacman100 in #989
- FIX: issues with (un)merging multiple LoRA and IA³ adapters by @BenjaminBossan in #976
- add lora target modules for stablelm models by @kbulutozler in #982
- Correct minor errors in example notebooks for causal language modelling by @SumanthRH in #926
- Fix typo in custom_models.mdx by @Pairshoe in #964
- Add base model metadata to model card by @BenjaminBossan in #975
- MNT Make .merged a property by @BenjaminBossan in #979
- Fix lora creation by @pacman100 in #993
- TST: Comment out flaky LoHA test by @BenjaminBossan in #1002
- ENH Support Conv2d layers for IA³ by @BenjaminBossan in #972
- Fix word_embeddings match for deepspeed wrapped model by @mayank31398 in #1000
- FEAT: Add `safe_merge` option in `merge` by @younesbelkada in #1001
- [`core` / `LoRA`] Add `safe_merge` to bnb layers by @younesbelkada in #1009
- ENH: Refactor LoRA bnb layers for faster initialization by @BenjaminBossan in #994
- FIX Don't assume model_config contains the key model_type by @BenjaminBossan in #1012
- FIX stale.py uses timezone-aware datetime by @BenjaminBossan in #1016
- FEAT: Add fp16 + cpu merge support by @younesbelkada in #1017
- fix lora scaling and unscaling by @pacman100 in #1027
- [`LoRA`] Revert original behavior for scale / unscale by @younesbelkada in #1029
- [`LoRA`] Raise error when adapter name not found in `set_scale` by @younesbelkada in #1034
- Fix target_modules type in config.from_pretrained by @BenjaminBossan in #1046
- docs(README): bit misspell current path link StackLLaMa by @guspan-tanadi in #1047
- Fixed wrong construction of LoHa weights, updated adapters conversion script by @kovalexal in #1021
- Fix P-tuning for sequence classification docs by @ehcalabres in #1049
- FIX: Setting active adapter correctly by @BenjaminBossan in #1051
- Fix Conv1D merge error for IA3 by @SumanthRH in #1014
- Add implementation of LyCORIS LoKr (KronA-like adapter) for SD&SDXL models by @kovalexal in #978
- [`core`] Fix `use_reentrant` issues by @younesbelkada in #1036
- [`tests`] Update Dockerfile to use cuda 12.2 by @younesbelkada in #1050
- Add testing for regex matching and other custom kwargs by @SumanthRH in #1031
- Fix Slack bot not displaying error messages by @younesbelkada in #1068
- Fix slow tests not running by @younesbelkada in #1071
- Release version 0.6.0 by @BenjaminBossan in #1072
New Contributors
- @Psancs05 made their first contribution in #847
- @metaprotium made their first contribution in #844
- @jiqing-feng made their first contribution in #851
- @houx15 made their first contribution in #888
- @tmm1 made their first contribution in #874
- @raghavanone made their first contribution in #891
- @zspo made their first contribution in #898
- @rohithkrn made their first contribution in #892
- @Datta0 made their first contribution in #946
- @kbulutozler made their first contribution in #982
- @Pairshoe made their first contribution in #964
- @ehcalabres made their first contribution in #1049
Full Changelog: v0.5.0...v0.6.0