Finetuned model does not load trained weights properly, but instead uses random initialization #577
Comments
Hi @ilektram, see line 197 in e48dfc3.
Thank you for the reply. In the meantime I have tried using the following method:

I have observed that the weight tensors are populated, however the inference issue persists: I am still obtaining accuracy around 50%. The odd thing is that each time I load the model and run inference on the validation set, the result is slightly different (but always between 45-55%, as opposed to the validation run after training, where accuracy was always 93%). I am therefore wondering if there is a parameter that introduces randomness in the reloaded model? I have also tried replacing the relevant line with an alternative, but the same behaviour was observed. Evaluation metrics:

I should add as a note that the trained model was saved using the respective callback method.
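One way to narrow this down (a minimal sketch, not from the original thread; the adapter directory "my-adapter" and the roberta-large base checkpoint are placeholders) is to load the adapter twice with dropout disabled and compare both the active weights and the logits on a fixed input:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

def load_adapter(adapter_dir, base="roberta-large", num_labels=2):
    base_model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=num_labels)
    return PeftModel.from_pretrained(base_model, adapter_dir).eval()  # eval() turns off dropout

m1, m2 = load_adapter("my-adapter"), load_adapter("my-adapter")

# Any key that differs between the two loads was not restored from disk.
# "original_module" keys hold the untouched, randomly initialised head copy, so exclude them.
sd1, sd2 = m1.state_dict(), m2.state_dict()
diff = [k for k in sd1 if "original_module" not in k and not torch.equal(sd1[k], sd2[k])]
print(diff or "all active weights identical across the two loads")

tok = AutoTokenizer.from_pretrained("roberta-large")
batch = tok("a fixed probe sentence", return_tensors="pt")
with torch.no_grad():
    print(m1(**batch).logits, m2(**batch).logits)  # should match exactly
```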
Hello @ilektram, I'm running https://github.com/huggingface/peft/blob/main/examples/sequence_classification/LoRA.ipynb and it works as expected, without any performance issues post loading. Here is the model: https://huggingface.co/smangrul/roberta-large-peft-lora-latest
Output:
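For comparison, the loading side of that notebook boils down to something like the following (a sketch, not the notebook verbatim; the hub id is taken from the link above, and the sentence pair is just an MRPC-style example):

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftConfig, PeftModel

peft_model_id = "smangrul/roberta-large-peft-lora-latest"
config = PeftConfig.from_pretrained(peft_model_id)
tokenizer = AutoTokenizer.from_pretrained(config.base_model_name_or_path)
base = AutoModelForSequenceClassification.from_pretrained(config.base_model_name_or_path)
model = PeftModel.from_pretrained(base, peft_model_id).eval()

inputs = tokenizer("The company is doing well.", "Business is going great.", return_tensors="pt")
with torch.no_grad():
    print(model(**inputs).logits.softmax(dim=-1))
```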
Can I ask, is there a reason why you load the model with AutoModelForSequenceClassification and AutoTokenizer rather than the RoBERTa-specific classes?
I experienced a similar issue; it turned out that dataset shuffling was non-deterministic. I am still observing some differences between the evaluation during training and the loaded adapter, but they are below 1%.
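If the variance does come from data-side randomness, pinning seeds before building the evaluation split is usually enough to make runs comparable (a sketch; the dataset is just an example):

```python
from transformers import set_seed
from datasets import load_dataset

set_seed(42)  # seeds Python, NumPy and torch in one call

raw = load_dataset("glue", "mrpc")
eval_ds = raw["validation"].shuffle(seed=42)  # shuffle deterministically (or skip shuffling for eval)
```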
Hello @ilektram, I am wondering if you found a solution to your problem? Dear @younesbelkada @pacman100, I am running into a similar issue despite updating Transformers to the latest version (4.30.2; PS thanks for the many fixes related to saving PEFT in that version!). In particular, it looks like I can't load the actual fine-tuned model; instead the baseline model appears to be loaded. Here's my code:
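```python
import torch
from transformers import LlamaForSequenceClassification, LlamaTokenizer
from peft import PeftConfig, PeftModel

# loraweight is the directory containing the saved adapter (adapter_config.json / adapter_model.bin)
config = PeftConfig.from_pretrained(loraweight)
tokenizer = LlamaTokenizer.from_pretrained(config.base_model_name_or_path, model_max_length=512)
model = LlamaForSequenceClassification.from_pretrained(config.base_model_name_or_path, num_labels=738,
                                                       torch_dtype=torch.float16, device_map="auto")
model = PeftModel.from_pretrained(model, loraweight)
```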
My evaluation metrics at the end of training are: {'eval_train_loss': 5.794477462768555, 'eval_train_microF1': 0.08219178082191782, 'eval_train_macroF1': 0.017915627236747927, 'eval_train_microAUC': 0.828394538670285, 'eval_train_macroAUC': 0.5873078248178315, 'eval_train_labels': 128, 'eval_train_count': 200, 'eval_train_acc10': 0.21, 'eval_train_acc5': 0.185, 'eval_train_acc': 0.075}

However, with the loaded model the evaluation metrics are: {'eval_loss': 7.802577972412109, 'eval_microF1': 0.0, 'eval_macroF1': 0.0, 'eval_microAUC': 0.5221097862957937, 'eval_macroAUC': 0.532642330878301, 'eval_labels': 128, 'eval_count': 200, 'eval_acc10': 0.015, 'eval_acc5': 0.0, 'eval_acc': 0.0}.

I have confirmed that the evaluation dataset is identical, with 200 samples. I ran the evaluation of the loaded model using both the trainer.evaluate() function and a naive torch evaluation function and got the same result. On a separate note, and related to another post, here is my LoRA config during the training phase:
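```python
config = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,
    lora_dropout=lora_dropout,
    bias="none",
    task_type=TaskType.SEQ_CLS,
    modules_to_save=lora_target_modules.append("score"),
)
```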
Not sure if I need to change anything in modules_to_save as suggested in the other post. My previous understanding was that when you set TaskType to SEQ_CLS, modules_to_save would be set to the last classifier layer ('score' in my case). Not sure if this is the problem. Deeply appreciate any guidance!

Update on 07.07 @ilektram: it seems the problem is indeed with modules_to_save in LoraConfig, as suggested in the aforementioned post. The loaded model worked as expected after I adjusted my code as below:
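A sketch of what the adjustment amounts to, based on the follow-up comments below (the classification head is named explicitly in modules_to_save and left out of target_modules; variable names as in the config above):

```python
config = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,   # e.g. ["q_proj", "v_proj"], without "score"
    lora_dropout=lora_dropout,
    bias="none",
    task_type=TaskType.SEQ_CLS,
    modules_to_save=["score"],            # pass a list, not the return value of .append()
)
```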
Of interest: while previously the trainable parameters at the score layer looked like this:
Now they look like this:
I am quite puzzled, as per the official tutorial I shouldn't need to specify modules_to_save again. There appears to be a bug. Deeply appreciate some clarification.
Hi, no, unfortunately I was unable to get it to work correctly, so I eventually abandoned the approach.
I am encountering the same issue, where loading the model multiple times gives different results for the same input and accuracy hovers around 50%. I described my situation in more detail here:
@hanyin88 I think this line is not doing what you think it does:

```python
config = LoraConfig(
    r=lora_r,
    lora_alpha=lora_alpha,
    target_modules=lora_target_modules,
    lora_dropout=lora_dropout,
    bias="none",
    task_type=TaskType.SEQ_CLS,
    modules_to_save=lora_target_modules.append("score"),  # <==
)
```
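The problem is that Python's list.append mutates the list in place and returns None, so in the snippet above modules_to_save ends up as None while "score" silently becomes a LoRA target module (the example module names are placeholders):

```python
>>> lora_target_modules = ["q_proj", "v_proj"]
>>> returned = lora_target_modules.append("score")
>>> print(returned)              # .append() returns None, so modules_to_save=None
None
>>> print(lora_target_modules)   # the same list object, also passed as target_modules
['q_proj', 'v_proj', 'score']
```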
Also, in general, the layers targeted for LoRA should not be added to modules_to_save.
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Hi, I found the same problem training a BLOOM model for sequence classification with peft. After some effort, I found that all the adapter weights were loaded correctly when building the PEFT model, but the classification head was not. As far as I understand, it seems there is a mismatch between the name of the classification layer of the model and the name that the PEFT model expects for the classification layer, which causes the PEFT model to keep that layer's weights randomly initialised. In this way, I was able to load the classification head by loading the weights manually (you may have other layer names depending on the model that you're using):

```python
import os
import torch
from torch.nn import Parameter
from transformers import AutoModelForSequenceClassification
from peft import PeftConfig, PeftModelForSequenceClassification

peft_config = PeftConfig.from_pretrained(adapters_path)  # adapters_path: directory with the saved adapter
adapters_weights = torch.load(os.path.join(adapters_path, 'adapter_model.bin'))

model = AutoModelForSequenceClassification.from_pretrained(
    peft_config.base_model_name_or_path,
    config=model_conf,  # model_conf: the model config prepared earlier
)

# Load the weights of the trained classification head
# (you may need to modify your tensor if you're using half precision)
model.score.weight = Parameter(adapters_weights['base_model.model.score.weight'])

model = PeftModelForSequenceClassification.from_pretrained(model, adapters_path)
```

If PEFT hadn't saved the weights of the classification head in the adapter_model.bin, this workaround wouldn't have been possible. Nevertheless, it looks like this has been fixed in a more recent release of peft.
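To double-check that the manual copy took effect, one can compare the layer against the tensor stored in the checkpoint right after the assignment (before wrapping with PeftModelForSequenceClassification); a small sketch using the variable names above:

```python
import torch

saved = adapters_weights['base_model.model.score.weight']
match = torch.allclose(model.score.weight.detach().float().cpu(), saved.float().cpu())
print("classification head matches checkpoint:", match)
```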
Thanks for kindly following up on the issue! @BenjaminBossan, thanks for the kind insight! You are totally right: what I accidentally (or coincidentally) did was to set modules_to_save to None, since list.append returns None. @TuronLab, I think you likely found the fundamental problem here, and your solution seems even better. :) As far as I know this bug only applies to the sequence classification setup.
To expand on this, when task_type=TaskType.SEQ_CLS is set, PEFT will automatically add the classification head (e.g. classifier or score) to modules_to_save.
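A quick way to see that in action (a sketch; assumes a reasonably recent peft version, and roberta-large is used only as an example base model):

```python
from transformers import AutoModelForSequenceClassification
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)
cfg = LoraConfig(task_type=TaskType.SEQ_CLS, target_modules=["query", "value"])
model = get_peft_model(base, cfg)

print(model.modules_to_save)        # the head names PEFT added automatically, e.g. {'classifier', 'score'}
model.print_trainable_parameters()  # the classification head is counted among the trainable parameters
```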
Hi, I'm glad you found this useful. Yes, as @BenjaminBossan says, when you specify that you're going to solve a sequence classification problem, it will automatically try to save the classification head. Here is the lora_config I used:

```python
lora_config = LoraConfig(
    r=8,
    lora_alpha=32,
    target_modules=["query", "value"],
    lora_dropout=0.5,
    bias="none",
    inference_mode=False,
    task_type=TaskType.SEQ_CLS,
)
```
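A round-trip sketch for the failure mode discussed in this issue: save the adapter, reload it onto a fresh base model, and check that the saved copies of the classification head match (the output path and the roberta-large base checkpoint are placeholders, and training is omitted):

```python
import torch
from transformers import AutoModelForSequenceClassification
from peft import PeftModel, get_peft_model

base = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)
model = get_peft_model(base, lora_config)
# ... training would happen here ...
model.save_pretrained("adapter-out")

fresh = AutoModelForSequenceClassification.from_pretrained("roberta-large", num_labels=2)
reloaded = PeftModel.from_pretrained(fresh, "adapter-out")

# Compare only the modules_to_save copies, i.e. the head weights that were written to the adapter.
sd_a, sd_b = model.state_dict(), reloaded.state_dict()
head_keys = [k for k in sd_a if "modules_to_save" in k]
print(all(torch.equal(sd_a[k], sd_b[k].to(sd_a[k].dtype)) for k in head_keys))  # expect True
```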
System Info
torch==2.0.0
peft==0.3.0
accelerate==0.20.3
transformers==4.30.1
Who can help?
@pacman100 @younesbelkada
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)
Reproduction
Finetuned PEFT model based on LlamaForSequenceClassification does not load the trained weights of the final layer from local disk and instead initialises them randomly each time the model is loaded.
Happy to share the fine-tuned model if there is a way to upload a compressed version.
I have observed similar behaviour when trying to load via both the torch state_dict() method and from_pretrained() with the respective model classes. Neither seems to properly load the fine-tuned weights.
Expected behavior
When loading a fine-tuned model into peft from local disk, the model weights should always be the same, and the validation set accuracy should match the one logged for the latest epoch in the relevant trainer_state.json file produced during training and evaluation.
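One way to verify that expectation programmatically (a sketch; it assumes an already-constructed trainer wrapping the reloaded PEFT model, that output_dir is the training output directory, and that the metric key is eval_accuracy, which depends on the compute_metrics used):

```python
import json

with open("output_dir/trainer_state.json") as f:
    state = json.load(f)
logged = [e for e in state["log_history"] if "eval_accuracy" in e][-1]["eval_accuracy"]

reloaded_metrics = trainer.evaluate()
print("logged during training:", logged)
print("after reloading:", reloaded_metrics.get("eval_accuracy"))
```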