Potential Gradient Error when Reloading Frozen Weights in qmodule.py `_load_from_state_dict` #293
This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.
This issue was closed because it has been stalled for 5 days with no activity.
There is a potential issue in the `_load_from_state_dict` method: reloading frozen weights into a frozen module might cause a gradient-related error. The FIXME comment in the code points out this problem.

Proposed solution:
- Explicitly set the `requires_grad` state of the reloaded parameters to `False` to prevent errors; or
- Use the `.data` attribute to copy the values directly without altering the `requires_grad` state.
This fix would prevent potential gradient errors when reloading frozen weights into a frozen module, ensuring the operation is safe and consistent.
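A minimal sketch of the second option, using plain PyTorch (the function name `reload_frozen_param` is illustrative and not taken from the actual qmodule.py code): copying through `.data` bypasses autograd, so the parameter's frozen state is preserved across the reload.

```python
import torch
import torch.nn as nn


def reload_frozen_param(param: nn.Parameter, new_value: torch.Tensor) -> None:
    """Copy new_value into param in place via .data.

    This sidesteps autograd entirely, so it neither records an in-place
    operation nor changes the parameter's requires_grad flag.
    """
    param.data.copy_(new_value)


# Build a module and freeze its weight, mimicking a frozen quantized module.
module = nn.Linear(4, 4, bias=False)
module.weight.requires_grad_(False)

# "Reload" new values, as _load_from_state_dict would when restoring a checkpoint.
new_weights = torch.ones(4, 4)
reload_frozen_param(module.weight, new_weights)

# The frozen state survived the reload and the values were copied.
assert module.weight.requires_grad is False
assert torch.equal(module.weight.detach(), new_weights)
```

The first option would instead perform the copy (e.g. under `torch.no_grad()`) and then call `param.requires_grad_(False)` explicitly; either way the parameter ends up frozen with the reloaded values.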