
4-bit QLoRA via bitsandbytes (4-bit base model + LoRA) #476

Merged
15 commits merged into huggingface:main on May 20, 2023

Conversation

TimDettmers
Contributor

This adds QLoRA support to PEFT. More information about QLoRA from our abstract:

We develop QLoRA tuning, a method that finetunes by backpropagating gradients through a frozen 4-bit base model into low-rank adapters (LoRA). With QLoRA tuning we can finetune 30B/65B parameter models on 24/48GB GPUs while preserving regular 16-bit full finetuning runtime and task performance. We achieve the memory efficiency and quantization precision through a combination of new methods: nested quantization to reduce the average memory footprint from 4.5 to 4.1 bits per parameter, paged optimizers to manage gradient checkpointing memory spikes, and a new data type, 4-bit NormalFloat (NF4), which is information-theoretically and empirically optimal for normally distributed weights. To demonstrate the effectiveness and ease of use of QLoRA tuning we finetune more than 1,000 models to create a detailed dissection of instruction following performance across datasets (FLAN, Alpaca, Chip2, SuperNatural Instructions, AnthropicHH), model types (LLaMA, T5), and model scales (125M to 65B). A discussion of the results is forthcoming in our paper.
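To illustrate the workflow this enables, below is a minimal sketch that loads a base model in 4-bit NF4 with nested quantization and attaches LoRA adapters on top. The model id and LoRA hyperparameters are placeholders, and the config and helper names assume a transformers and PEFT stack with the 4-bit bitsandbytes integration, so treat it as a sketch rather than the exact API added in this PR.

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Placeholder model id; any causal LM supported by bitsandbytes should work.
model_id = "facebook/opt-350m"

# 4-bit NF4 quantization with nested (double) quantization; compute runs in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# Prepare the frozen quantized base model for training (casts norms, enables
# input gradients so gradient checkpointing works); helper name assumed.
model = prepare_model_for_kbit_training(model)

# Attach LoRA adapters; only these low-rank matrices receive gradients.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()

The paged optimizer mentioned above can then be selected at training time, for example via optim="paged_adamw_32bit" in TrainingArguments on recent transformers releases, to absorb the memory spikes from gradient checkpointing.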

@sgugger @younesbelkada

Contributor

@younesbelkada younesbelkada left a comment


Thank you for your inspiring work, as always! I just left 3 comments - also curious to see what @sgugger will say!

setup.py Outdated
@@ -41,7 +41,6 @@
     "packaging>=20.0",
     "psutil",
     "pyyaml",
-    "torch>=1.13.0",
Contributor


I think this change is not needed

@@ -18,7 +18,6 @@
 import numpy as np
 import torch
 import transformers
-import wandb
Contributor


Probably this change is not needed either

Comment on lines +737 to +740
if hasattr(self.base_model, "model"):
    self.base_model.model.generation_config = self.generation_config
else:
    self.base_model.generation_config = self.generation_config
Contributor


I think these changes are fine for now, I will investigate later why this change is needed

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 19, 2023

The documentation is not available anymore as the PR was closed or merged.

Contributor

@sgugger sgugger left a comment


Thanks for adding support for 4-bit quantization! Apart from Younes' comments on potentially unrelated changes, this looks great to me!

@TimDettmers
Contributor Author

Thank you, Younes & Sylvain! Thanks for the changes, Younes. This looks all good to me.

Contributor

@younesbelkada younesbelkada left a comment


Again thank you for your great work, Tim!

@younesbelkada younesbelkada merged commit d6015bc into huggingface:main May 20, 2023
@ewof

ewof commented May 20, 2023

holy shit it's happening

@ewof

ewof commented May 20, 2023

dettmers is a hero

6 participants