Freeze manually #2
Comments
Hi, thanks!
Probably yes, but you need to make sure you don't accidentally freeze the lora parameters.
Probably not. After merging, lora_A and lora_B will no longer exist.
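For reference, here is a minimal sketch of freezing by parameter name so that the LoRA weights stay trainable. It uses plain PyTorch; `LoRALinear` and `freeze_all_but_lora` are hypothetical stand-ins written for this illustration, not minLoRA's implementation, and the only assumption carried over from the thread is that the LoRA parameters are named `lora_A` and `lora_B`.

```python
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    """Hypothetical LoRA-style layer for illustration (not minLoRA's code)."""

    def __init__(self, in_features, out_features, rank=4):
        super().__init__()
        self.base = nn.Linear(in_features, out_features)
        # Low-rank factors; the names mirror the lora_A / lora_B used in this thread.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))

    def forward(self, x):
        # Base output plus the low-rank update x @ A^T @ B^T.
        return self.base(x) + x @ self.lora_A.T @ self.lora_B.T


def freeze_all_but_lora(model):
    """Set requires_grad=True only for parameters whose name contains 'lora_'."""
    for name, param in model.named_parameters():
        param.requires_grad = "lora_" in name


model = nn.Sequential(LoRALinear(8, 16), nn.ReLU(), LoRALinear(16, 2))
freeze_all_but_lora(model)

trainable = [n for n, p in model.named_parameters() if p.requires_grad]
frozen = [n for n, p in model.named_parameters() if not p.requires_grad]
print("trainable:", trainable)
print("frozen:", frozen)
```

Selecting by name rather than calling `requires_grad_(False)` on the whole model is what avoids accidentally freezing the LoRA parameters.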
Thank you for your kind reply. However, the example at https://github.com/cccntu/LoRAnanoGPT/blob/master/train.py (line 236) uses DDP without the 'find_unused_parameters=True' argument. Thank you!
Honestly, I don't know. Can you solve it by simply adding 'find_unused_parameters=True'? I've only used it on one GPU. Or does using get_lora_parameter solve this issue?
It looks like this method is correct in the sense that it only updates the parameters you pass to the optimizer, but Torch will still compute gradients for all weights, as
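The behaviour described in the comment above can be checked with a small plain-PyTorch sketch. It does not use minLoRA; the two-layer model and the variable names are made up for the illustration. It compares handing only some parameters to the optimizer against also setting `requires_grad = False` on the rest.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 1))
first, second = model[0], model[1]

# Case 1: only `second` is passed to the optimizer; nothing is frozen.
opt = torch.optim.SGD(second.parameters(), lr=0.1)
before = first.weight.detach().clone()
model(torch.randn(3, 4)).sum().backward()
opt.step()

# Autograd still filled in a gradient for the first layer...
grad_computed_without_freeze = first.weight.grad is not None
# ...but the optimizer never touched its weights.
first_layer_unchanged = torch.equal(first.weight.detach(), before)

# Case 2: additionally freeze the first layer, then run backward again.
for p in model.parameters():
    p.grad = None  # clear gradients left over from case 1
for p in first.parameters():
    p.requires_grad = False
model(torch.randn(3, 4)).sum().backward()

# Now autograd skips the frozen layer entirely.
grad_computed_with_freeze = first.weight.grad is not None

print(grad_computed_without_freeze, first_layer_unchanged, grad_computed_with_freeze)
```

In other words, restricting the optimizer controls which weights are updated, while `requires_grad` controls which gradients are computed and stored.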
Hi, thank you for your great work.
I want to use yours for my experiment.
I understand that get_lora_params() passes only the LoRA parameters to the optimizer, but if the model's other parameters still require gradients, wouldn't the model still compute gradients for them?
Would freezing the model be enough to use minlora without get_lora_params?
Also, when merging LoRA into the model in order to add another LoRA module, do I have to set requires_grad=False on lora_A and lora_B before merging?
Thank you.