Highlights
New Methods
Context-aware Prompt Tuning
@tsachiblau added a new soft prompt method called Context-aware Prompt Tuning (CPT), which combines In-Context Learning and Prompt Tuning in the sense that, for each training sample, it builds a learnable context from training examples in addition to the single training sample. This allows for sample- and parameter-efficient few-shot classification and addresses recency bias.
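As a rough illustration, here is a minimal sketch of using CPT, assuming CPTConfig is the config class added in this release and that the tokenized few-shot context is passed via the fields shown below; those field names are assumptions and may differ from the actual API.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import CPTConfig, get_peft_model

model_id = "facebook/opt-350m"  # illustrative base model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Few-shot context that CPT turns into a learnable prompt.
context = "Review: A wonderful film. Sentiment: positive\n"
context_ids = tokenizer(context, add_special_tokens=False)["input_ids"]

peft_config = CPTConfig(
    cpt_token_ids=context_ids,                    # assumed field name
    cpt_mask=[1] * len(context_ids),              # assumed field name
    cpt_tokens_type_mask=[1] * len(context_ids),  # assumed field name
)
peft_model = get_peft_model(model, peft_config)
peft_model.print_trainable_parameters()
```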
Explained Variance Adaptation
@sirluk contributed a new LoRA initialization method called Explained Variance Adaptation (EVA). Instead of randomly initializing LoRA weights, this method uses SVD on minibatches of finetuning data to initialize the LoRA weights and is also able to re-allocate the ranks of the adapter based on the explained variance ratio (derived from SVD). Thus, this initialization method can yield better initial values and better rank distribution.
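A minimal sketch of how EVA initialization might be wired up, assuming that init_lora_weights="eva" together with an EvaConfig selects this method and that initialize_lora_eva_weights runs the SVD pass over a dataloader of finetuning minibatches; the model id, data, and hyperparameters are illustrative.

```python
from torch.utils.data import DataLoader
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import EvaConfig, LoraConfig, get_peft_model, initialize_lora_eva_weights

model_id = "facebook/opt-350m"  # illustrative base model
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# A tiny stand-in for the finetuning data; EVA runs SVD on minibatches of it.
texts = ["The movie was great.", "The plot made no sense at all."]
encodings = tokenizer(texts, padding=True, return_tensors="pt")
dataset = [{k: v[i] for k, v in encodings.items()} for i in range(len(texts))]
dataloader = DataLoader(dataset, batch_size=2)

peft_config = LoraConfig(
    r=16,
    target_modules=["q_proj", "v_proj"],
    init_lora_weights="eva",        # assumed flag that selects EVA initialization
    eva_config=EvaConfig(rho=2.0),  # rho > 1 allows rank redistribution across layers
)
peft_model = get_peft_model(model, peft_config)

# Perform the SVD-based initialization on the finetuning minibatches.
initialize_lora_eva_weights(peft_model, dataloader)
```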
Bone
@JL-er added an implementation for Block Affine (Bone) Adaptation which utilizes presumed sparsity in the base layer weights to divide them into multiple sub-spaces that share a single low-rank matrix for updates. Compared to LoRA, Bone has the potential to significantly reduce memory usage and achieve faster computation.
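A minimal sketch of applying Bone, assuming BoneConfig follows the usual PEFT pattern of a rank plus target_modules; the model id and rank are illustrative (the rank is assumed to need to divide the targeted weight dimensions).

```python
from transformers import AutoModelForCausalLM
from peft import BoneConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # illustrative
config = BoneConfig(r=64, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(model, config)
peft_model.print_trainable_parameters()
```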
Enhancements
PEFT now supports LoRAs for int8 torchao quantized models (check this and this notebook). In addition, VeRA can now be used with 4-bit and 8-bit bitsandbytes quantization thanks to @ZiadHelal.
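A minimal sketch of training a LoRA on top of an int8 torchao-quantized model, assuming a transformers version with TorchAoConfig and the torchao package installed; the model id and quantization type are illustrative.

```python
import torch
from transformers import AutoModelForCausalLM, TorchAoConfig
from peft import LoraConfig, get_peft_model

# Quantize the base model weights to int8 via torchao.
quant_config = TorchAoConfig("int8_weight_only")
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",
    quantization_config=quant_config,
    torch_dtype=torch.bfloat16,
)

lora_config = LoraConfig(r=8, target_modules=["q_proj", "v_proj"])
peft_model = get_peft_model(base_model, lora_config)
peft_model.print_trainable_parameters()
```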
Hot-swapping of LoRA adapters is now possible using the hotswap_adapter function. You can now load one LoRA and replace its weights in-place with the LoRA weights of another adapter, which, in general, should be faster than deleting one adapter and loading the other in its place. The feature is built so that no re-compilation of the model is necessary if torch.compile was called on the model (right now, this requires the ranks and alphas of the adapters to be the same).
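A minimal sketch of hot-swapping, assuming hotswap_adapter lives under peft.utils.hotswap and takes the model, the location of the new adapter, and the adapter name; the paths and model id are illustrative, and both checkpoints are assumed to use the same rank and alpha.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel
from peft.utils.hotswap import hotswap_adapter  # assumed import location

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")
model = PeftModel.from_pretrained(base_model, "path/to/lora-A")
model = torch.compile(model)  # optional; hot-swapping avoids triggering re-compilation

# Replace the weights of the currently loaded adapter in-place with those of another LoRA.
hotswap_adapter(model, "path/to/lora-B", adapter_name="default")
```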
LoRA and IA³ now support Conv3d layers thanks to @jsilter, and @JINO-ROHIT added a notebook showcasing PEFT model evaluation using the lm-eval-harness toolkit.
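A minimal sketch of targeting a Conv3d layer with LoRA on a toy custom module; the architecture and module names are made up for illustration.

```python
import torch.nn as nn
from peft import LoraConfig, get_peft_model

class TinyVideoNet(nn.Module):
    """Toy model with a Conv3d layer, e.g. for video inputs (batch, channels, depth, height, width)."""

    def __init__(self):
        super().__init__()
        self.conv = nn.Conv3d(3, 16, kernel_size=3, padding=1)
        self.head = nn.Linear(16, 2)

    def forward(self, x):
        x = self.conv(x).mean(dim=(2, 3, 4))  # pool over depth, height, width
        return self.head(x)

config = LoraConfig(r=8, target_modules=["conv"])
peft_model = get_peft_model(TinyVideoNet(), config)
peft_model.print_trainable_parameters()
```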
With the target_modules argument, you can specify which layers to target with the adapter (e.g. LoRA). Now you can also specify which modules not to target by using the exclude_modules parameter (thanks @JINO-ROHIT).
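A minimal sketch combining target_modules and exclude_modules, here skipping the first decoder layer; the module names are illustrative and depend on the base model's architecture.

```python
from peft import LoraConfig

config = LoraConfig(
    r=8,
    target_modules=["q_proj", "v_proj"],
    # Modules matched here are left untouched even if they match target_modules.
    exclude_modules=[
        "model.decoder.layers.0.self_attn.q_proj",
        "model.decoder.layers.0.self_attn.v_proj",
    ],
)
```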
Changes
- Several fixes have been made to the OFT implementation, among other things to fix merging, which makes adapter weights trained with PEFT versions prior to this release incompatible (see #1996 for details).
- Adapter configs are now forward-compatible by accepting unknown keys.
- Prefix tuning was adapted to the DynamicCache caching infrastructure of transformers (see #2096). If you are using this PEFT version and a recent version of transformers with an old prefix tuning checkpoint, you should double-check that it still works correctly and retrain it if it doesn't.
- Added the lora_bias parameter to LoRA layers to enable a bias on the LoRA B matrix. This is useful when extracting LoRA weights from fully fine-tuned parameters with bias vectors so that these can be taken into account (see the sketch after this list).
- #2180 provided a couple of bug fixes to LoKr (thanks @yaswanth19). If you're using LoKr, your old checkpoints should still work, but it's recommended to retrain your adapter.
- from_pretrained now warns the user if PEFT keys are missing.
- Attribute access to modules in modules_to_save is now properly and transparently handled.
- PEFT supports the changes to bitsandbytes 8-bit quantization from the recent v0.45.0 release. To benefit from these improvements, we recommend upgrading bitsandbytes if you're using QLoRA. Expect slight numerical differences in model outputs if you're using QLoRA with 8-bit bitsandbytes quantization.
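A minimal sketch of the new lora_bias flag mentioned above; the target module names are illustrative.

```python
from peft import LoraConfig

# Enable a trainable bias on the LoRA B matrix, e.g. when the LoRA weights were extracted
# from a fully fine-tuned model whose bias vectors should be carried over.
config = LoraConfig(r=16, target_modules=["q_proj", "v_proj"], lora_bias=True)
```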
What's Changed
- Bump version to 0.13.1.dev0 by @BenjaminBossan in #2094
- Support Conv3d layer in LoRA and IA3 by @jsilter in #2082
- Fix Inconsistent Missing Keys Warning for Adapter Weights in PEFT by @yaswanth19 in #2084
- FIX: Change check if past_key_values is empty by @BenjaminBossan in #2106
- Update install.md by @Salehbigdeli in #2110
- Update OFT to fix merge bugs by @Zeju1997 in #1996
- ENH: Improved attribute access for modules_to_save by @BenjaminBossan in #2117
- FIX low_cpu_mem_usage consolidates devices by @BenjaminBossan in #2113
- TST Mark flaky X-LoRA test as xfail by @BenjaminBossan in #2114
- ENH: Warn when from_pretrained misses PEFT keys by @BenjaminBossan in #2118
- FEAT: Adding exclude modules param(#2044) by @JINO-ROHIT in #2102
- fix merging bug / update boft conv2d scaling variable by @Zeju1997 in #2127
- FEAT: Support quantization for VeRA using bitsandbytes (#2070) by @ZiadHelal in #2076
- Bump version to 0.13.2.dev0 by @BenjaminBossan in #2137
- FEAT: Support torchao by @BenjaminBossan in #2062
- FIX: Transpose weight matrix based on fan_in_fan_out condition in PiSSA initialization (#2103) by @suyang160 in #2104
- FIX Type annoations in vera/bnb.py by @BenjaminBossan in #2139
- ENH Make PEFT configs forward compatible by @BenjaminBossan in #2038
- FIX Raise an error when performing mixed adapter inference and passing non-existing adapter names by @BenjaminBossan in #2090
- FIX Prompt learning with latest transformers error by @BenjaminBossan in #2140
- adding peft lora example notebook for ner by @JINO-ROHIT in #2126
- FIX TST: NaN issue with HQQ GPU test by @BenjaminBossan in #2143
- FIX: Bug in target module optimization if child module name is suffix of parent module name by @BenjaminBossan in #2144
- Bump version to 0.13.2.dev0 by @BenjaminBossan in #2145
- FIX Don't assume past_key_valus for encoder models by @BenjaminBossan in #2149
- Use SFTConfig instead of SFTTrainer keyword args by @qgallouedec in #2150
- FIX: Sft train script FSDP QLoRA embedding mean resizing error by @BenjaminBossan in #2151
- Optimize DoRA in eval and no dropout by @ariG23498 in #2122
- FIX Missing low_cpu_mem_usage argument by @BenjaminBossan in #2156
- MNT: Remove version pin of diffusers by @BenjaminBossan in #2162
- DOC: Improve docs for layers_pattern argument by @BenjaminBossan in #2157
- Update HRA by @DaShenZi721 in #2160
- fix fsdp_auto_wrap_policy by @eljandoubi in #2167
- MNT Remove Python 3.8 since it's end of life by @BenjaminBossan in #2135
- Improving error message when users pass layers_to_transform and layers_pattern by @JINO-ROHIT in #2169
- FEAT Add hotswapping functionality by @BenjaminBossan in #2120
- Fix to prefix tuning to fit transformers by @BenjaminBossan in #2096
- MNT: Enable Python 3.12 on CI by @BenjaminBossan in #2173
- MNT: Update docker nvidia base image to 12.4.1 by @BenjaminBossan in #2176
- DOC: Extend modules_to_save doc with pooler example by @BenjaminBossan in #2175
- FIX VeRA failure on multiple GPUs by @BenjaminBossan in #2163
- FIX: Import location of HF hub errors by @BenjaminBossan in #2178
- DOC: fix broken link in the README of loftq by @dennis2030 in #2183
- added checks for layers to transforms and layer pattern in lora by @JINO-ROHIT in #2159
- ENH: Warn when loading PiSSA/OLoRA together with other adapters by @BenjaminBossan in #2186
- TST: Skip AQLM test that is incompatible with torch 2.5 by @BenjaminBossan in #2187
- FIX: Prefix tuning with model on multiple devices by @BenjaminBossan in #2189
- FIX: Check for prefix tuning + gradient checkpointing fails by @BenjaminBossan in #2191
- Dora_datacollector_updated by @shirinyamani in #2197
- [BUG] Issue with using rank_pattern and alpha_pattern together in LoraConfig by @sirluk in #2195
- evaluation of peft model using lm-eval-harness toolkit by @JINO-ROHIT in #2190
- Support Bone by @JL-er in #2172
- BUG🐛: Fixed scale related bugs in LoKr | Added rank_dropout_scale parameter by @yaswanth19 in #2180
- update load_dataset for examples/feature_extraction by @sinchir0 in #2207
- [FEAT] New LoRA Initialization Method: Explained Variance Adaptation by @sirluk in #2142
- [FIX] EVA meta device check bug + add multi-gpu functionality by @sirluk in #2218
- CPT Tuner by @tsachiblau in #2168
- [FIX] Invalid None check for loftq_config attribute in LoraConfig by @sirluk in #2215
- TST: Move slow compile tests to nightly CI by @BenjaminBossan in #2223
- CI Update AutoAWQ version to fix CI by @BenjaminBossan in #2222
- FIX Correctly set device of input data in bnb test by @BenjaminBossan in #2227
- CI: Skip EETQ tests while broken by @BenjaminBossan in #2226
- Add Validation for Invalid task_type in PEFT Configurations by @d-kleine in #2210
- [FEAT] EVA: ensure deterministic behavior of SVD on multi gpu setups by @sirluk in #2225
- TST: Eva: Speed up consistency tests by @BenjaminBossan in #2224
- CI: Fix failing torchao test by @BenjaminBossan in #2232
- TST: Update Llava model id in test by @BenjaminBossan in #2236
- TST: Skip test on multi-GPU as DataParallel fails by @BenjaminBossan in #2234
- Bump version of MacOS runners from 12 to 13 by @githubnemo in #2235
- new version Bone by @JL-er in #2233
- ENH Argument to enable bias for LoRA B by @BenjaminBossan in #2237
- FIX: Small regression in BNB LoRA output by @BenjaminBossan in #2238
- Update CPT documentation by @tsachiblau in #2229
- FIX: Correctly pass low_cpu_mem_usage argument when initializing a PEFT model with task_type by @BenjaminBossan in #2253
- FIX Correctly determine word embeddings on Deberta by @BenjaminBossan in #2257
- FIX: Prevent CUDA context initialization due to AWQ by @BenjaminBossan in #2230
- ENH: Updates for upcoming BNB Int8 release by @matthewdouglas in #2245
- Prepare for PEFT release of v0.14.0 by @BenjaminBossan in #2258
New Contributors
- @jsilter made their first contribution in #2082
- @yaswanth19 made their first contribution in #2084
- @Salehbigdeli made their first contribution in #2110
- @JINO-ROHIT made their first contribution in #2102
- @ZiadHelal made their first contribution in #2076
- @suyang160 made their first contribution in #2104
- @qgallouedec made their first contribution in #2150
- @eljandoubi made their first contribution in #2167
- @dennis2030 made their first contribution in #2183
- @sirluk made their first contribution in #2195
- @JL-er made their first contribution in #2172
- @sinchir0 made their first contribution in #2207
- @tsachiblau made their first contribution in #2168
- @d-kleine made their first contribution in #2210
- @githubnemo made their first contribution in #2235
- @matthewdouglas made their first contribution in #2245
Full Changelog: v0.13.2...v0.14.0