Continue fine-tuning from previously trained model: error on inference #104
If you are using LoRA fine-tuning and the path string where you save the model does not include 'lora', you can try renaming your model save path to add '_lora'.
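A minimal sketch of such a rename (the paths here are placeholders, not the actual directories from this thread):

```python
# Rename the save directory so its name contains "lora"; the loader keys off
# this substring when deciding how to load the weights. Paths are placeholders.
import os

old_path = "/path/to/finetune/output"   # placeholder: your current save directory
new_path = old_path + "_lora"           # append "_lora" so the substring check matches

os.rename(old_path, new_path)
```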
Dear @jiajunlong, My model path is now:
and contains the following files:
Do you know what I am doing wrong? Thanks a lot for your help.
Just wanted to add that the
When you were evaluating, was the model_path in load_pretrained_model set to "/scratch/riggi/Analysis/MLProjects/TinyLLaVA/fine-tuning/radioimg-dataset/TinyLLaVA-Phi-2-SigLIP-3.1B/vision_freeze/nepochs2/_lora"?
Could you please check the function that loads model parameters in the codebase? When loading model weights, it checks whether the model path contains the string "lora" and loads the weights accordingly. The error you encountered above suggests that a non-LoRA code path was used to load the weights.
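For illustration, here is a rough sketch of that kind of path-based branching. This is an assumption about the shape of the loader, not the actual TinyLLaVA source; the transformers/peft calls themselves are standard:

```python
# Illustrative only: the branch taken depends on whether "lora" appears in the path.
from transformers import AutoModelForCausalLM
from peft import PeftModel

def load_pretrained_model(model_path, model_base=None):
    """Sketch of a loader that branches on the checkpoint path string."""
    if "lora" in model_path.lower():
        # LoRA checkpoint: load the base model first, then the adapter on top.
        model = AutoModelForCausalLM.from_pretrained(model_base, trust_remote_code=True)
        model = PeftModel.from_pretrained(model, model_path)
        model = model.merge_and_unload()  # fold the adapter weights into the base
    else:
        # Full checkpoint: all weights live in model_path itself.
        model = AutoModelForCausalLM.from_pretrained(model_path, trust_remote_code=True)
    return model
```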
Dear @jiajunlong,
In the meantime, I also tried a different approach (not sure if it is correct, though).
The above code produces these files:
Then, I trained from this saved model for 1 epoch using the same custom_finetune script. I only changed

Now, to be honest, I am not sure at all whether this is the correct approach to continuing fine-tuning. It seems that every LoRA fine-tuning run adds an adapter layer to the saved model, which is why I first merged the LoRA weights into the base model before continuing the training (see the sketch below). Another (second-order) problem is that I also need to somehow adjust the learning rate of the warmup+cosine strategy when continuing training; otherwise the schedule starts from scratch rather than from where it ended after the first epoch. Please let me know if I am doing this the wrong way. Thanks a lot for your time.
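A minimal sketch of the merge step described above, using the standard peft API (the paths are placeholders and the Hub id is the one published by the authors; the actual script may differ):

```python
# Merge the epoch-1 LoRA adapter into the base model, then save the merged
# weights as the starting point for the next fine-tuning run.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "tinyllava/TinyLLaVA-Phi-2-SigLIP-3.1B",  # Hub id as released by the authors
    trust_remote_code=True,
)
merged = PeftModel.from_pretrained(base, "/path/to/epoch1_lora").merge_and_unload()
merged.save_pretrained("/path/to/epoch1_merged")  # train epoch 2 from this directory
```

On the learning-rate question: if the training loop goes through the Hugging Face Trainer, passing resume_from_checkpoint to trainer.train() restores the optimizer and scheduler state, which may be a cleaner way to continue the warmup+cosine schedule than letting it restart.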
Hello, I would like to ask how you loaded this model, TinyLLaVA-Phi-2-SigLIP-3.1B.
@eva10084 Initially, I loaded the model from Hugging Face, along the lines of the sketch below.
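A minimal sketch of that initial load (the Hub id is the one published by the TinyLLaVA authors; trust_remote_code=True is assumed to be required because the checkpoint ships custom modeling code):

```python
# Load the released checkpoint from the Hugging Face Hub.
from transformers import AutoModelForCausalLM, AutoTokenizer

hf_path = "tinyllava/TinyLLaVA-Phi-2-SigLIP-3.1B"
model = AutoModelForCausalLM.from_pretrained(hf_path, trust_remote_code=True)
tokenizer = AutoTokenizer.from_pretrained(hf_path, use_fast=False)
```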
$SAVED_MODEL_PATH is the path where I want the trained model to be saved.
Hope that helps.
Dear all,
I have fine-tuned TinyLLaVA-Phi-2-SigLIP-3.1B for 1 epoch and then continued the fine-tuning for another epoch, starting from the trained model saved after the first epoch. Both training runs were successful. For those runs I used the custom_finetune.sh script with the provided default parameters. The evaluation runs fine for the first model (epoch 1) but fails for the final model (epoch 2) with this error:
It seems that the second model is saved without some components or with different layer names.
Any hint to solve this error?
Thanks a lot for your help.
PS: For evaluation I am using the sample code shown in this previous issue: #79
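For context, a rough sketch of the evaluation-side load (the import path, full signature, and return values are assumptions modeled on LLaVA-style loaders; only the load_pretrained_model name and its model_path parameter are taken from this thread, and this is not the exact code from #79):

```python
# Rough sketch of loading the fine-tuned checkpoint for evaluation.
from tinyllava.model.builder import load_pretrained_model  # import path assumed

model_path = "/path/to/nepochs2/_lora"  # placeholder; note it contains "lora"
tokenizer, model, image_processor, context_len = load_pretrained_model(
    model_path=model_path,                      # parameter name from this thread
    model_base=None,                            # assumed parameter
    model_name="TinyLLaVA-Phi-2-SigLIP-3.1B",   # assumed parameter
)
```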