Potential Gradient Error when Reloading Frozen Weights in `qmodule.py` `_load_from_state_dict` #293

cjfghk5697 · 2024-08-24T08:09:55Z

The `_load_from_state_dict` method in `qmodule.py` may raise a gradient-related error when frozen weights are reloaded into a frozen module. A FIXME comment in the code already flags the problem:

if type(self.weight.data) is not type(deserialized_weight):
    # Reloading frozen weights into unfrozen module: move to the correct device and force assignment
    self.weight = torch.nn.Parameter(deserialized_weight.to(self.weight.device))
else:
    # FIXME: here we should copy frozen weights into frozen module, but this leads to grad error
    self.weight = torch.nn.Parameter(deserialized_weight.to(self.weight.device))

Proposed solution:

Explicitly set `requires_grad` to `False` on the reloaded parameter to prevent the error.
Alternatively, copy the values in place through the `.data` attribute, which leaves the existing `requires_grad` state unchanged.

self.weight = torch.nn.Parameter(deserialized_weight.to(self.weight.device))
self.weight.requires_grad = False

or

self.weight.data.copy_(deserialized_weight.to(self.weight.device).data)

Either change should prevent the gradient error when frozen weights are reloaded into a frozen module.
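For illustration, here is a minimal sketch of the two options in plain PyTorch. It uses a `torch.nn.Linear` layer with `requires_grad` disabled as a hypothetical stand-in for a frozen module, not the actual quantized module from `qmodule.py`, and the tensors are made-up values. Whether the quantized tensor subclass supports in-place `copy_` is not confirmed here.

```python
import torch

# Hypothetical stand-in for a frozen module: a plain Linear layer whose
# weight has requires_grad disabled. This is NOT the module from qmodule.py.
module = torch.nn.Linear(4, 2)
module.weight.requires_grad_(False)

# Stand-in for the weight deserialized from a state dict (made-up values).
deserialized_weight = torch.ones(2, 4)

# Option 1: rebuild the Parameter and carry over the previous flag.
# torch.nn.Parameter defaults to requires_grad=True, so pass it explicitly.
module.weight = torch.nn.Parameter(
    deserialized_weight.to(module.weight.device),
    requires_grad=module.weight.requires_grad,
)
flag_after_rebuild = module.weight.requires_grad

# Option 2: copy in place under no_grad, which keeps the same Parameter
# object and therefore leaves its requires_grad flag untouched.
new_values = torch.full((2, 4), 3.0)
with torch.no_grad():
    module.weight.copy_(new_values.to(module.weight.device))
flag_after_copy = module.weight.requires_grad
values_match = torch.equal(module.weight, new_values)
```

The second option uses `torch.no_grad()` rather than `.data`, since it gives the same in-place update without bypassing autograd's bookkeeping. It also keeps the same `Parameter` object, which matters if something else already holds a reference to it.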


github-actions · 2024-09-24T02:02:58Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions · 2024-09-29T02:06:47Z

This issue was closed because it has been stalled for 5 days with no activity.

github-actions · 2024-11-01T02:08:04Z

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions · 2024-11-07T02:01:28Z

This issue was closed because it has been stalled for 5 days with no activity.

github-actions bot added the Stale label Sep 24, 2024

github-actions bot closed this as not planned Sep 29, 2024

dacorvo reopened this Sep 30, 2024

github-actions bot removed the Stale label Oct 1, 2024

github-actions bot added the Stale label Nov 1, 2024

github-actions bot closed this as not planned Nov 7, 2024

dacorvo reopened this Nov 10, 2024

github-actions bot removed the Stale label Nov 11, 2024
