
Config option to enable/disable batchnorm running values #13

Closed
josedvq opened this issue Apr 18, 2024 · 2 comments

Comments


josedvq commented Apr 18, 2024

Thanks for sharing this great tool!

🚀 Feature

Add a constructor argument batch_norm_track_running_stats to control track_running_stats on frozen batch norm layers, or alternatively avoid changing track_running_stats on batchnorm layers at all.

Motivation

Currently, FinetuningScheduler calls self.freeze(modules=pl_module, train_bn=False). That in turn executes this code for each model module:

# (from Lightning's BaseFinetuning; `module` is each module of the model being frozen)
if isinstance(module, _BatchNorm):
    module.track_running_stats = False
# recursion could yield duplicate parameters for parent modules w/ parameters so disabling it
for param in module.parameters(recurse=False):
    param.requires_grad = False

This can be confusing because setting track_running_stats=False is not normally done when fine-tuning. It also silently changes the values set when the model was created. The default value of track_running_stats is True, so when one fine-tunes a model following the standard procedure of only setting requires_grad, track_running_stats will normally remain True unless set otherwise. I'm not sure why it is implemented this way in BaseFinetuning.
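For contrast, here is a minimal sketch of that standard requires_grad-only recipe (the torchvision resnet18 backbone is only an illustrative assumption):

from torchvision.models import resnet18

model = resnet18()
for param in model.parameters():
    param.requires_grad = False   # freeze the whole backbone
for param in model.fc.parameters():
    param.requires_grad = True    # unfreeze only the classification head

# No BatchNorm attribute is touched here, so each layer keeps track_running_stats=True
# and running mean/var continue to be updated whenever the model is in train() mode.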

Pitch

Add a constructor argument to provide control over the batchnorm behavior when using the module. I think the default should be to keep track_running_stats=True to reproduce the standard "recipe" for fine-tuning.
Alternatively, do not alter track_running_stats at all.
At the very least, I think it would be important to document the current behavior.
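For concreteness, a hypothetical sketch of the proposed option (the parameter name batch_norm_track_running_stats is just the suggestion above, not an existing FTS argument):

from finetuning_scheduler import FinetuningScheduler

# hypothetical: leave each BatchNorm layer's track_running_stats as the model set it
fts = FinetuningScheduler(batch_norm_track_running_stats=True)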


@josedvq josedvq changed the title Config option to enable/disable batchnorm running gradients Config option to enable/disable batchnorm running values Apr 18, 2024
@speediedan
Owner

Thanks for your valuable suggestion/feedback @josedvq! Apologies for the delayed response, I've had limited bandwidth to dedicate to Finetuning Scheduler between minor releases, so I am just now preparing FTS for the upcoming 2.3.0 Lightning release.

Root Cause

About 19 months ago, FTS set a default of train_bn=False during layer freezing to address issue #5, handing control of BatchNorm freezing to the FTS schedule while leaving track_running_stats True for the frozen BatchNorm layers by default. About a month later, BaseFinetuning changed this default behavior and began setting track_running_stats to False upon freezing BatchNorm layers. Given that there are use cases for both track_running_stats settings, I think a constructor parameter like the one you suggest is a great idea.

In light of your suggestion, I've just pushed a commit that I think improves FTS usability accordingly...

New State w/ Usability Improvements - (FTS >= v2.3.0)

The commit will be released with the next Lightning/FTS minor release (2.3.0). It introduces the frozen_bn_track_running_stats option to the FTS callback constructor, allowing the user to override the default Lightning behavior that disables track_running_stats when freezing BatchNorm layers. To avoid changing the current behavior without warning, I've made the default for this parameter False, but am warning users that the default will become True starting with FTS release 2.4.0.
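A minimal usage sketch (only the frozen_bn_track_running_stats argument comes from the commit above; the Trainer wiring is illustrative):

from lightning.pytorch import Trainer
from finetuning_scheduler import FinetuningScheduler

# opt in now to keep track_running_stats enabled on the BatchNorm layers FTS freezes
# (per the warning above, this will become the default in FTS 2.4.0)
fts = FinetuningScheduler(frozen_bn_track_running_stats=True)
trainer = Trainer(callbacks=[fts])
# trainer.fit(lightning_module, datamodule=datamodule)  # illustrative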

Hope you've found FTS of some utility after working through the issue you encountered.

Thanks again for your contribution, you've helped improve FTS for everyone! Feel free to reach out anytime if you have other issues or want to share more about your use case. Best of luck with your work!

@speediedan
Owner

Fixed with commit 9d7014b.
