RuntimeError: only Tensors of floating point dtype can require gradients for QLoRA since transformers 4.40 #1720
Comments
Yes, I can reproduce the error. The reason is that since transformers==4.40, the pre_classifier module of this model is converted to a bitsandbytes Linear4bit instead of being a normal PyTorch nn.Linear. As this module is added to the modules_to_save, PEFT tries to enable gradients on it, resulting in the error you see. We'll discuss this internally and think of an appropriate fix. In the meantime, if possible, downgrade to an earlier transformers version.
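For context, a minimal sketch of the failure mechanism (a plain uint8 tensor stands in for the quantized Linear4bit weight; this is not PEFT's actual code path):
import torch

# Integer-dtype parameters are allowed only while requires_grad is False.
int_weight = torch.zeros(4, 4, dtype=torch.uint8)
param = torch.nn.Parameter(int_weight, requires_grad=False)
# Flipping gradients on raises the RuntimeError from the issue title.
param.requires_grad_(True)
|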
Thanks for checking on this, appreciate it. Sure, for now I'll use the earlier version for my projects and demos. |
Hi @dipanjanS !
Thanks for the issue, I had a deeper look. Previously there was a silent bug in transformers that left the pre_classifier layer un-quantized, which shouldn't happen, as only the last layer should be left un-quantized. huggingface/transformers#29958 introduced a fix for that, and with it the behavior you are seeing, which isn't really a bug, since the pre_classifier should be quantized in the first place; only the last layer shouldn't be quantized.
To temporarily fix your issue, can you load the 4-bit model with llm_int8_skip_modules=["classifier", "pre_classifier"]?
model_checkpoint = "distilbert/distilbert-base-uncased"
id2label = {0: "NEGATIVE", 1: "POSITIVE"}
label2id = {"NEGATIVE": 0, "POSITIVE": 1}
import torch
from transformers import AutoModelForSequenceClassification, TrainingArguments, Trainer, BitsAndBytesConfig
config = BitsAndBytesConfig(
load_in_4bit=True, # quantize the model to 4-bits when you load it
bnb_4bit_quant_type="nf4", # use a special 4-bit data type for weights initialized from a normal distribution
bnb_4bit_use_double_quant=True, # nested quantization scheme to quantize the already quantized weights
bnb_4bit_compute_dtype=torch.bfloat16, # use bfloat16 for faster computation
llm_int8_skip_modules=["classifier", "pre_classifier"] # keep the classification head out of quantization
)
model = AutoModelForSequenceClassification.from_pretrained(model_checkpoint,
id2label=id2label,
label2id=label2id,
num_labels=2,
quantization_config=config)
from peft import prepare_model_for_kbit_training
model = prepare_model_for_kbit_training(model)
from peft import LoraConfig, get_peft_model, TaskType, replace_lora_weights_loftq
config = LoraConfig(
r=8,
lora_alpha=32,
target_modules=["q_lin", "k_lin", "v_lin", "out_lin"],
lora_dropout=0.05,
bias="none",
task_type=TaskType.SEQ_CLS)
peft_model = get_peft_model(model, config)
replace_lora_weights_loftq(peft_model)
print_trainable_parameters(peft_model) # helper from the notebook; equivalently: peft_model.print_trainable_parameters()
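To confirm the workaround took effect, a quick sanity check (a sketch; run it right after from_pretrained, before any PEFT wrapping — module paths are those of DistilBertForSequenceClassification):
print(type(model.pre_classifier)) # expected: torch.nn.Linear, kept in full precision
print(type(model.classifier)) # expected: torch.nn.Linear
print(type(model.distilbert.transformer.layer[0].attention.q_lin)) # expected: bitsandbytes Linear4bit
|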
Awesome, can confirm this definitely works! Just wanted to check: going forward, should I explicitly mention those classifier layers to be skipped in the BnB configuration, or would this be handled automatically in a future release? Based on your recommendation I will use that going forward. |
Hi @dipanjanS |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. |
Thank you for the detailed answer. I would like to ask: how can we know which layers are supposed to be quantized and which aren't, so we can fix this issue in other models? In my case I encountered this error while trying to load Llama3-8B using the following config: |
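(As a rough sketch of how one might answer this for another architecture: build the model skeleton without loading weights and list its Linear layers; the task head at the end is what to pass to llm_int8_skip_modules. The repo id below is an assumption for the Llama3-8B mentioned above, and access to it is gated; for LlamaForSequenceClassification the head is named score, though that is worth verifying on your checkpoint:)
import torch
from accelerate import init_empty_weights
from transformers import AutoConfig, AutoModelForSequenceClassification

config = AutoConfig.from_pretrained("meta-llama/Meta-Llama-3-8B", num_labels=2)
with init_empty_weights(): # build the module tree on the meta device, no real weights allocated
    model = AutoModelForSequenceClassification.from_config(config)
for name, module in model.named_modules():
    if isinstance(module, torch.nn.Linear):
        print(name) # head entries such as "score" should stay un-quantized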
System Info
Who can help?
@pacman100
Information
Tasks
examples folder
Reproduction
This is the Colab notebook for a simple fine-tuning of a DistilBERT model using QLoRA.
The main code snippet of interest, which is erroring out:
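(The snippet itself did not survive this scrape; reconstructed from the fixed version above by omitting llm_int8_skip_modules, the failing setup would have looked roughly like this:)
import torch
from transformers import AutoModelForSequenceClassification, BitsAndBytesConfig
from peft import LoraConfig, TaskType, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    # no llm_int8_skip_modules here, so pre_classifier also gets quantized
)
model = AutoModelForSequenceClassification.from_pretrained(
    "distilbert/distilbert-base-uncased", num_labels=2, quantization_config=bnb_config
)
model = prepare_model_for_kbit_training(model)
lora_config = LoraConfig(
    r=8, lora_alpha=32, lora_dropout=0.05, bias="none",
    target_modules=["q_lin", "k_lin", "v_lin", "out_lin"],
    task_type=TaskType.SEQ_CLS,
)
peft_model = get_peft_model(model, lora_config) # RuntimeError under transformers==4.40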
The error happens in the line
peft_model = get_peft_model(model, config)
above, when the PEFT model is being created. The error trace is as follows.
Expected behavior
Ideally the model should get created and then fine-tuned. The same notebook used to work fine with transformers==4.38, but something might have changed, as it is no longer working with transformers==4.40; I have validated that the code still works when I downgrade. I want some help in figuring out whether something is wrong in my code that I need to change, whether I am doing something fundamentally wrong, or whether there is a deeper issue.