Default to precision=bf16 on CPU when precision=16 is passed #10033
Conversation
Nice!
Will you add a changelog entry? This would be a backward-incompatible change since the default type changes.
Codecov Report
@@           Coverage Diff           @@
##           master   #10033   +/- ##
======================================
  Coverage      93%      93%
======================================
  Files         180      180
  Lines       15870    15890    +20
======================================
+ Hits        14689    14720    +31
+ Misses       1181     1170    -11
LGTM !
if self.precision in (16, "bf16"):
    ...

# maybe convert the precision value
if self.precision == 16 and self.use_cpu:
I think we support Trainer(precision="16"), do we?
I don't think so, because we don't convert the value to a PrecisionType (which works for int and str).
IMO we should, but that should be done in a follow-up. The previous code here had the same check.
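For illustration, a minimal sketch of what such a follow-up normalization could look like. The enum below is a stand-in, not Lightning's actual PrecisionType, and the helper name _normalize_precision is hypothetical:

```python
from enum import Enum


class PrecisionType(str, Enum):
    # Illustrative stand-in for a precision enum; values are the
    # string forms of the accepted precision settings.
    HALF = "16"
    FLOAT = "32"
    BFLOAT = "bf16"


def _normalize_precision(precision):
    # Coerce user input (16, "16", "bf16", ...) into one canonical type so
    # that later checks do not have to handle int and str separately.
    try:
        return PrecisionType(str(precision))
    except ValueError:
        raise ValueError(f"Unsupported precision value: {precision!r}")


assert _normalize_precision(16) is PrecisionType.HALF
assert _normalize_precision("16") is PrecisionType.HALF
assert _normalize_precision("bf16") is PrecisionType.BFLOAT
```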
What does this PR do?
With this change, Trainer(precision=16, accelerator='cpu') now runs with precision='bf16' automatically. This is desirable for no-code-change transitions between environments with different accelerators, for example, moving from local (CPU only) to Colab (GPU or TPU).
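Roughly, the resulting behavior can be pictured as below. This is only a sketch: the helper name _maybe_convert_precision and the warning wording are assumptions, not the exact code added by this PR:

```python
import warnings


def _maybe_convert_precision(precision, use_cpu):
    # fp16 mixed precision is not supported on CPU, so when precision=16 is
    # requested together with the CPU accelerator, fall back to bf16 and
    # tell the user about the substitution.
    if precision == 16 and use_cpu:
        warnings.warn(
            "You passed `precision=16` with `accelerator='cpu'`; "
            "using `precision='bf16'` instead."
        )
        return "bf16"
    return precision


assert _maybe_convert_precision(16, use_cpu=True) == "bf16"
assert _maybe_convert_precision(16, use_cpu=False) == 16
```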
Trainer(amp_backend="apex", precision=16, accelerator='cpu') will still raise an error, as Apex does not support bf16.
Part of #10027
Does your PR introduce any breaking changes? If yes, please list them.
None
Before submitting
PR review