diff --git a/docs/source/reducing_memory_usage.md b/docs/source/reducing_memory_usage.md
index f258c0a20f..9b90fbacfb 100644
--- a/docs/source/reducing_memory_usage.md
+++ b/docs/source/reducing_memory_usage.md
@@ -90,6 +90,33 @@ from trl import SFTConfig
 training_args = SFTConfig(..., packing=True, max_length=512)
 ```
 
+## PEFT for parameter-efficient fine-tuning
+
+Parameter-Efficient Fine-Tuning (PEFT) methods such as LoRA are among the most effective techniques for reducing memory usage during training. Instead of updating all model parameters, PEFT methods train only a small number of adapter parameters, which significantly reduces memory requirements and enables fine-tuning of larger models on limited hardware.
+
+For comprehensive details on using PEFT with TRL, including the available adapter methods, quantization options, and advanced configurations, see [PEFT Integration](peft_integration).
+
+To use PEFT to reduce memory usage:
+
+```python
+from datasets import load_dataset
+from peft import LoraConfig
+from trl import SFTTrainer
+
+dataset = load_dataset("trl-lib/Capybara", split="train")
+
+peft_config = LoraConfig()
+
+trainer = SFTTrainer(
+    model="Qwen/Qwen2.5-0.5B",
+    train_dataset=dataset,
+    peft_config=peft_config,
+)
+trainer.train()
+```
+
+PEFT can be combined with other memory-reduction techniques such as quantization (4-bit or 8-bit) for even greater savings. See [PEFT Integration](peft_integration) for quantization examples.
+
 ## Liger for reducing peak memory usage
 
 > [Liger Kernel](https://github.com/linkedin/Liger-Kernel) is a collection of Triton kernels designed specifically for LLM training. It can effectively increase multi-GPU training throughput by 20% and reduce memory usage by 60%.