Add Flax Dropout guide #2675
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
Adding @cgarciae as a reviewer too since he's been thinking about this guide as well.
Thanks @marcvanzee 👍 @cgarciae Your feedback would be super welcome 👍
Great work! I think this will be an awesome tutorial, just needs a bit of work to get it really perfect.
@marcvanzee @cgarciae Thanks. PTAL 👍
PTAL
Thanks for the help @cgarciae 👍 @marcvanzee some of @cgarciae's review comments got stuck in "pending"; they are out now (see above). We have gone over them and the new commits are on the way. Some of them will require changes to your code suggestions.
@marcvanzee FYI: working on adding the training step.
Reviewed the code and text with @marcvanzee. Committed and pushed the changes to the branch. cc @cgarciae. Amendments to the Sharp Bits will be in a separate PR.
LGTM -- thanks a lot for this @8bitmp3, looks really awesome!! 🥳 (Please fix the doc build failures)
Codecov Report
@@           Coverage Diff           @@
##             main    #2675   +/-   ##
=======================================
  Coverage   81.21%   81.21%
=======================================
  Files          53       53
  Lines        5605     5605
=======================================
  Hits         4552     4552
  Misses       1053     1053
All checks ✔️ @cgarciae PTAL. Thank you. Live preview: https://flax--2675.org.readthedocs.build/en/2675/guides/dropout.html (build successful @marcvanzee).
@cgarciae PTAL.
Thanks @8bitmp3! Looks great, let's merge this!
`nn.Module` and `nn.Dropout`), initialization, the forward pass, and the training step (with `TrainStep`).
`flax/examples` like a Transformer-based model trained on the WMT dataset (with dropout and attention dropout).
Shorten the Flax - The Sharp Bits by migrating some of the dropout instructions to the new guide. (To be done in a separate PR)
Live preview: https://flax--2675.org.readthedocs.build/en/2675/guides/dropout.html