Device and dtype properties #462
Conversation
Codecov Report
@@            Coverage Diff            @@
##           master    #462    +/-   ##
========================================
- Coverage      96%      96%    -0%
========================================
  Files         130      130
  Lines        4301     4341    +40
========================================
+ Hits         4126     4159    +33
- Misses        175      182     +7
Does this mean metrics used in a LightningModule with mixed-precision training would be converted to fp16 precision as well? Is that always desirable? Do people want to compute metrics in fp32 while doing the rest of model training in fp16?
@ananthsub the PR does not actually introduce that kind of change; it has always been the case in TM that if you cast your metric to fp16, the metric states are cast as well (the casting hooks are overridden so that states follow the module). Whether people want fp32 metrics when doing mixed-precision training, I am not sure; I doubt the extra precision matters much for most users during training. However, when it comes to testing, it is very clear to me that users should be using fp32.
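A minimal sketch of what "metric states follow the module" means in practice (toy module, not the actual TorchMetrics implementation): a metric that stores its accumulators as registered buffers will have those buffers cast when the metric itself is cast to half precision.

```python
import torch
from torch import nn

class ToySumMetric(nn.Module):
    """Toy metric: keeps a running sum as a registered buffer."""
    def __init__(self):
        super().__init__()
        self.register_buffer("total", torch.tensor(0.0))

    def update(self, x: torch.Tensor) -> None:
        self.total += x.sum()

metric = ToySumMetric()
metric.half()                # casts the registered buffer along with the module
print(metric.total.dtype)    # torch.float16
```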
On the mixed-precision case - I agree with Ananth that this is potentially a concern (especially for metrics with accumulations: fp16 overflows at ~64k, so having a 100k-sample dataset and doing fp16 training will break even simple metrics like Accuracy), but it is not a new thing introduced by this diff. Let's file an issue and track it? I'm assuming people will run into it. It would be worth having an example in the docs of how to do mixed-precision metric calculation once this question comes up, though; see the sketch below.
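A hedged illustration of the overflow concern (generic sketch, not the fix tracked in any issue): fp16 cannot represent values above ~65504, so a running sum of per-batch counts in fp16 eventually becomes inf, whereas keeping the accumulator in fp32 and upcasting the incoming fp16 values stays exact.

```python
import torch

# fp16 accumulator: adding per-batch counts of 1024 samples overflows past ~65504
acc_fp16 = torch.tensor(0.0, dtype=torch.float16)
for _ in range(100):
    acc_fp16 += torch.tensor(1024.0, dtype=torch.float16)
print(acc_fp16)   # tensor(inf, dtype=torch.float16)

# fp32 accumulator fed the same fp16 values stays finite and exact
acc_fp32 = torch.tensor(0.0, dtype=torch.float32)
for _ in range(100):
    acc_fp32 += torch.tensor(1024.0, dtype=torch.float16).float()
print(acc_fp32)   # tensor(102400.)
```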
Great!
Yes, please do so 🐰
@SkafteNicki mind checking/resolving the last comments?
@SkafteNicki we are having a very related discussion about device & dtype properties here: https://docs.google.com/document/d/1xHU7-iQSpp9KJTjI3As2EM0mfNHHr37WZYpDpwLkivA/edit#heading=h.cvihcwdhwas5
Given metrics are nn.Modules, what happens if metrics have parameters which live on different devices or have different dtypes? Then we're at odds with this: pytorch/pytorch#7460 (comment). This makes metrics a restricted set of modules, which could potentially limit use cases in the future.
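A small sketch of the concern (hypothetical module, not from this PR): a plain nn.Module may hold parameters with different dtypes (or on different devices), in which case a single module-level dtype or device property has no well-defined answer.

```python
import torch
from torch import nn

class MixedDtypeModule(nn.Module):
    """Hypothetical module whose parameters have different dtypes."""
    def __init__(self):
        super().__init__()
        self.w32 = nn.Parameter(torch.zeros(4, dtype=torch.float32))
        self.w16 = nn.Parameter(torch.zeros(4, dtype=torch.float16))

m = MixedDtypeModule()
print({p.dtype for p in m.parameters()})  # {torch.float32, torch.float16}
# With two distinct dtypes there is no single meaningful `m.dtype`,
# and the same ambiguity arises for `m.device` when parameters span devices.
```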
It's not desirable, but this behavior was already introduced by accident in cda5dbd. I opened #484 for tracking. |
* add gpu testing
* change super
* move to metric + simplify
* fix bert
* update docs
* add typing
* changelog
* Apply suggestions from code review

Co-authored-by: Jirka Borovec <Borda@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
(cherry picked from commit b10dba4)
@ananthsub, @maximsch2 please see PR #493, which should fix the problems with autocast.
Before submitting
What does this PR do?
Fixes #455
The issue describes how the Bootstrapper metric currently does not work on GPU. Trying to fix this made me realize that we do not have an easy way of getting the device and dtype of a metric. This PR implements the logic from the DeviceDtypeMixin class, taken from Lightning, into the core Metric class: https://github.com/PyTorchLightning/pytorch-lightning/blob/38ceb8943ef9b858abead1fbba43ea9a9b4cd93b/pytorch_lightning/core/mixins/device_dtype_mixin.py
Additionally, the issue is solved using the new properties :]
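A hedged sketch of the general idea behind such a mixin (names and details are illustrative, not the exact Lightning or TorchMetrics code): the module tracks its own device and dtype and updates them whenever .to() is called. A full implementation would also have to cover .cuda(), .cpu(), .half(), .float(), etc.

```python
import torch
from torch import nn

class DeviceDtypeModule(nn.Module):
    """Illustrative module exposing .device and .dtype properties."""

    def __init__(self):
        super().__init__()
        self._device = torch.device("cpu")
        self._dtype = torch.float32

    @property
    def device(self) -> torch.device:
        return self._device

    @property
    def dtype(self) -> torch.dtype:
        return self._dtype

    def to(self, *args, **kwargs):
        # parse the target device/dtype with the same private helper nn.Module.to uses
        device, dtype, *_ = torch._C._nn._parse_to(*args, **kwargs)
        if device is not None:
            self._device = device
        if dtype is not None:
            self._dtype = dtype
        return super().to(*args, **kwargs)

m = DeviceDtypeModule()
m.to(torch.float16)
print(m.device, m.dtype)  # cpu torch.float16
```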
PR review
Anyone in the community is free to review the PR once the tests have passed.
If we didn't discuss your PR in GitHub issues, there's a high chance it will not be merged.
Did you have fun?
Make sure you had fun coding 🙃