Biases are excluded from BYOL moving average update #440
Comments
Hi! Thanks for your contribution! Great first issue!
@annikabrundyn Would you mind having a look?
Yeah, the paper explicitly mentions that bias parameters are excluded from LARS adaptation and weight decay, but it doesn't mention excluding them from the moving average update.
I think this issue should be looked at rather than closed (and ideally fixed if it indeed wasn't an intentional modification), not that it seems to affect performance much. The PR would be super simple, and I'm happy to do it if the original contributor of this implementation is too busy.
Hey @wjn0, I've just checked the paper again and I think you're right. Thanks for spotting this! If you'd like, you can submit a PR and ping me for review. I'm also happy to make the change myself this weekend.
No longer exclude biases from the moving average update in BYOL. Fixes Lightning-Universe#440
Based on the code of BYOLMAWeightUpdate, bias terms are excluded from the moving average update. I don't think this matches the original paper. I did not see any documentation on this point, but it seems intentional. Is there a source or reference for this difference (assuming I haven't misunderstood the original paper)?

The relevant code is here, excerpted:
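For illustration only, here is a minimal, framework-free sketch of the behaviour I'm describing. It is not the library's actual code: the function name `ema_update`, the `weights_only` flag, and the use of plain floats in place of tensors are all assumptions made for the example. It applies the update target = tau * target + (1 - tau) * online, once with a name-based filter that skips anything without "weight" in its name, and once without the filter.

```python
def ema_update(online, target, tau, weights_only):
    """Return the target parameters after one moving-average step.

    online, target: dicts mapping parameter name -> float (stand-ins for tensors).
    weights_only: if True, skip every parameter whose name lacks "weight"
    (a hypothetical name-based filter, assumed for illustration).
    """
    updated = {}
    for name, target_value in target.items():
        if weights_only and "weight" not in name:
            # Excluded from the update: the target keeps its old value.
            updated[name] = target_value
        else:
            # Standard moving-average step.
            updated[name] = tau * target_value + (1.0 - tau) * online[name]
    return updated


online = {"fc.weight": 1.0, "fc.bias": 1.0}
target = {"fc.weight": 0.0, "fc.bias": 0.0}

filtered = ema_update(online, target, tau=0.5, weights_only=True)
unfiltered = ema_update(online, target, tau=0.5, weights_only=False)
print(filtered)    # the bias in the target never moves
print(unfiltered)  # weight and bias both move toward the online values
```

With the filter on, the target's bias stays at its initial value no matter how the online bias changes, which is the discrepancy with the paper that I'm asking about.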
To see some of the parameters which are excluded from the MA update with this condition in place (when using a ResNet-50 as the backbone, for example), you can run the following snippet:
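As a rough stand-in that runs without any dependencies, the sketch below applies the same kind of name check to a short, hand-written list of parameter names. The names follow the pattern that `named_parameters()` yields for a torchvision ResNet, but the list is only a small subset typed out for illustration, and the helper `excluded_from_update` is hypothetical.

```python
# A hand-picked subset of ResNet-style parameter names (illustrative only;
# a real ResNet-50 has many more entries).
param_names = [
    "conv1.weight",
    "bn1.weight",
    "bn1.bias",
    "layer1.0.conv1.weight",
    "layer1.0.bn1.bias",
    "fc.weight",
    "fc.bias",
]


def excluded_from_update(names):
    """Return the names that a 'weight'-in-name filter would skip."""
    return [name for name in names if "weight" not in name]


excluded = excluded_from_update(param_names)
for name in excluded:
    print(name)
```

On a real model, the same check could be run over the names from `model.named_parameters()` instead of the hard-coded list.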