
[AMP] Turn off accumulation data types for mixed precision pass #8341

Merged 9 commits into apache:main on Jun 29, 2021

Conversation

@AndrewZhaoLuo (Contributor) commented on Jun 25, 2021

CUDA codegen does not seem to handle half types very well, and mixing half and full-precision floating point exposes additional issues. Furthermore, some schedules that are supposed to support heterogeneous output dtypes do not.

This seems to be a problem with codegen rather than with the mixed precision pass, so for now I am turning off accumulating into FP32 in the mixed precision pass. With this change we can tune BERT and YOLOv2; results are here: https://docs.google.com/spreadsheets/d/12lgyfuHaRS-X4uG-1iQOV8oAuPpuVAbspcmkOSPRFHQ/edit#gid=0

I will leave the codegen issues for #8294, and the issues with schedules not supporting output dtypes for #8340.
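
For context, here is a minimal sketch of what "turning off accumulation data types" means in practice, assuming TVM's AMP API at the time (`relay.transform.ToMixedPrecision` and `tvm.relay.op.register_mixed_precision_conversion`). The `nn.dense` override and the `to_fp16` helper below are illustrative, not part of this PR's diff: a conversion function returns `[category, accumulation_dtype, output_dtype]`, and making the accumulation dtype equal the output dtype is the behavior this PR switches to by default.

```python
# Hedged sketch: not this PR's diff. Assumes TVM's mixed-precision API
# (ToMixedPrecision pass + register_mixed_precision_conversion) as of mid-2021.
import tvm
from tvm import relay
from tvm.relay.op import register_mixed_precision_conversion
from tvm.relay.transform.mixed_precision import MIXED_PRECISION_ALWAYS


# Illustrative override for nn.dense: accumulate in the mixed-precision dtype
# (float16) rather than float32. A higher `level` than the built-in rule
# lets this registration take precedence.
@register_mixed_precision_conversion("nn.dense", level=11)
def dense_fp16_accumulation(call_node, mixed_precision_type):
    # Returns [conversion category, accumulation dtype, output dtype].
    return [MIXED_PRECISION_ALWAYS, mixed_precision_type, mixed_precision_type]


def to_fp16(mod):
    """Run the AMP pass, converting eligible float32 ops to float16."""
    seq = tvm.transform.Sequential(
        [
            relay.transform.InferType(),
            relay.transform.ToMixedPrecision(mixed_precision_type="float16"),
        ]
    )
    with tvm.transform.PassContext(opt_level=3):
        return seq(mod)


# Tiny example graph: a single dense layer, originally in float32.
x = relay.var("x", shape=(8, 16), dtype="float32")
w = relay.var("w", shape=(32, 16), dtype="float32")
mod = tvm.IRModule.from_expr(relay.nn.dense(x, w))
print(to_fp16(mod))  # dense now computes, accumulates, and outputs in float16
```

Registering at a higher `level` than the default rules is the usual way to override a built-in conversion without patching TVM itself.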

@comaniac (Contributor) left a comment

LGTM. btw I added a tag to the PR title so that it can be easily found in the future.

@comaniac changed the title from "Turn off accumulation data types for mixed precision pass" to "[AMP] Turn off accumulation data types for mixed precision pass" on Jun 25, 2021
@comaniac (Contributor) commented:

CI failed due to #8344

@masahi merged commit 282c532 into apache:main on Jun 29, 2021
ylc pushed a commit to ylc/tvm that referenced this pull request Sep 29, 2021

* don't use mixed precision accumulators

* turn off fp32 accumulators for now, adjust passing test cases

* Add TODO on cuda codegen for failures. Make test case pass on cuda for now

test to mixed precision

more tests

add internal func call

broadcast failures

moreee

add comment and change lstm unit test to pass on cuda

* remove debug statements

* to mixed precision

* rebase main

* rtol and atol adjustments

* bump up tolerance again

* jostle CI
zxy844288792 pushed a commit to zxy844288792/tvm that referenced this pull request Mar 4, 2022