
BatchNorm guide #2536

Merged 4 commits into google:main on Nov 28, 2022

Conversation

@cgarciae (Collaborator) commented Oct 15, 2022

What does this PR do?

Add a "Using BatchNorm" guide with the following content:

  • How to define a model that uses BatchNorm
  • How to handle the batch_stats collection
  • How to configure apply
  • How to modify TrainState, train_step, and eval_step to handle a model with BatchNorm.

Our codediff Sphinx directive is used throughout to highlight the changes needed to add BatchNorm support to existing code.

Live preview: https://flax--2536.org.readthedocs.build/en/2536/guides/batch_norm.html
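Since the guide itself lives behind the preview link, here is a minimal sketch of the pattern it covers, assuming a hypothetical `MLP` model and `image`/`label` batch fields (the guide's exact code may differ; see the live preview above):

```python
from typing import Any

import jax
import jax.numpy as jnp
import optax
import flax.linen as nn
from flax.training import train_state


class MLP(nn.Module):  # hypothetical model, for illustration only
  @nn.compact
  def __call__(self, x, train: bool):
    x = nn.Dense(features=128)(x)
    # BatchNorm normalizes with batch statistics during training and
    # with stored running averages (the `batch_stats` collection) at eval.
    x = nn.BatchNorm(use_running_average=not train)(x)
    x = nn.relu(x)
    return nn.Dense(features=10)(x)


model = MLP()
# init now returns two collections: 'params' and 'batch_stats'.
variables = model.init(jax.random.PRNGKey(0), jnp.ones((1, 64)), train=False)


class TrainState(train_state.TrainState):
  batch_stats: Any  # carry the running statistics alongside the params


state = TrainState.create(
    apply_fn=model.apply,
    params=variables['params'],
    batch_stats=variables['batch_stats'],
    tx=optax.adam(1e-3),
)


@jax.jit
def train_step(state, batch):
  def loss_fn(params):
    # mutable=['batch_stats'] makes apply also return the updated statistics.
    logits, updates = state.apply_fn(
        {'params': params, 'batch_stats': state.batch_stats},
        batch['image'], train=True, mutable=['batch_stats'])
    loss = optax.softmax_cross_entropy_with_integer_labels(
        logits=logits, labels=batch['label']).mean()
    return loss, updates

  (loss, updates), grads = jax.value_and_grad(loss_fn, has_aux=True)(state.params)
  state = state.apply_gradients(grads=grads)
  state = state.replace(batch_stats=updates['batch_stats'])
  return state, loss


@jax.jit
def eval_step(state, batch):
  # No mutable collections at eval time: read the stored running averages.
  return state.apply_fn(
      {'params': state.params, 'batch_stats': state.batch_stats},
      batch['image'], train=False)
```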

@review-notebook-app (bot)

Check out this pull request on ReviewNB to see visual diffs & provide feedback on Jupyter Notebooks.

@zaxtax (Collaborator) commented Oct 19, 2022

I think this is good, but it would help if the code could be fully copy-pasted, so maybe include all relevant import statements.

I wouldn't use the word "just": even though it's a small change, the introduction of the train flag is confusing and surprising. As you say, BatchNorm is often the first layer most users encounter that behaves very differently at train vs. test time. I think that's worth dwelling on (see the sketch below).

Minor:
"BatchNorm add an" -> "BatchNorm adds an"

@codecov-commenter commented Nov 3, 2022

Codecov Report

Merging #2536 (de58b2a) into main (478cffe) will increase coverage by 1.33%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##             main    #2536      +/-   ##
==========================================
+ Coverage   79.44%   80.78%   +1.33%     
==========================================
  Files          49       50       +1     
  Lines        5211     5365     +154     
==========================================
+ Hits         4140     4334     +194     
+ Misses       1071     1031      -40     
Impacted Files Coverage Δ
flax/training/checkpoints.py 65.86% <0.00%> (-0.51%) ⬇️
flax/io.py 85.26% <0.00%> (ø)
flax/linen/module.py 92.64% <0.00%> (+0.01%) ⬆️
flax/core/scope.py 89.84% <0.00%> (+0.04%) ⬆️
flax/metrics/tensorboard.py 92.85% <0.00%> (+0.08%) ⬆️
flax/errors.py 85.14% <0.00%> (+0.45%) ⬆️
flax/config.py 92.85% <0.00%> (+15.93%) ⬆️
flax/struct.py 96.92% <0.00%> (+18.46%) ⬆️
flax/serialization.py 94.44% <0.00%> (+25.35%) ⬆️
flax/training/train_state.py 72.22% <0.00%> (+72.22%) ⬆️


@cgarciae cgarciae marked this pull request as ready for review November 4, 2022 17:04
@cgarciae cgarciae requested review from 8bitmp3 and levskaya and removed request for 8bitmp3 November 4, 2022 17:05
@cgarciae (Collaborator, Author) commented Nov 4, 2022

Hey @zaxtax, thanks for the feedback! What did you mean by:

I think this is good, but it could help if the code could be fully copy-pasted.

Like a "copy to clipboard" widget?

@8bitmp3 8bitmp3 self-requested a review November 4, 2022 21:38
@8bitmp3 8bitmp3 self-assigned this Nov 4, 2022
@zaxtax (Collaborator) commented Nov 4, 2022 via email

@cgarciae (Collaborator, Author) commented Nov 5, 2022

Makes sense! Once the content for this guide is finalized, the idea is to copy it into a notebook and add an "Open in Colab" link.

@levskaya (Collaborator) left a comment:

This is really awesome, I don't have any useful critiques at the moment. lgtm!!!!

@marcvanzee (Collaborator) left a comment:

Just some small nits, but overall it looks really great! This is a super useful tutorial: BatchNorm is quite tricky, and this guide explains exactly what needs to be changed.

LGTM, assuming you incorporate my changes.

(10 resolved review threads on docs/guides/batch_norm.rst, now outdated)
@cgarciae (Collaborator, Author) commented:

Hey @marcvanzee! Is this ready to go? 🚀

@marcvanzee (Collaborator) commented:

Sure. I thought @8bitmp3 might want to have a look, but I think he has plenty of other tasks, and given that I already looked at it in quite some detail, I think we can merge it.

@marcvanzee (Collaborator) commented:

@8bitmp3 if you still have comments, feel free to create a new PR with suggested modifications.

@8bitmp3 (Collaborator) left a comment:

Reviewed and updated. Thank you 👍

@marcvanzee (Collaborator) commented:

Thanks @8bitmp3, great improvements!

@copybara-service copybara-service bot merged commit d291af9 into google:main Nov 28, 2022