XLNet Bug when training with apex 16-bit precision #6567
Conversation
Codecov Report
```diff
@@            Coverage Diff             @@
##           master    #6567      +/-   ##
==========================================
- Coverage   79.18%   78.41%   -0.78%
==========================================
  Files         156      156
  Lines       28129    28129
==========================================
- Hits        22275    22056     -219
- Misses       5854     6073     +219
```
Continue to review full report at Codecov.
Thanks for your contribution!
Could you post a screenshot of the thrown error before this change?
Of course, @JetRunner, here it is
Great! Would you mind adding a one-line comment explaining the cast?
@JetRunner, done
Merging since the CI error looks unrelated.
* xlnet fp16 bug fix
* comment cast added
* Update modeling_xlnet.py

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
XLNet training fails when using 16-bit precision (apex) because a tensor is created with an explicit dtype=torch.float in the relative_positional_encodings function.
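Below is a minimal sketch of the failure mode and the fix. The function name, signature, and body are illustrative simplifications, not the exact transformers source: the point is that the position sequence is built with a hard-coded dtype=torch.float, so the resulting encodings stay float32 even when the model's parameters are half precision, and the fix is to cast the result to the parameter dtype.

```python
import torch

def relative_positional_encodings_sketch(klen, d_model, param_dtype):
    # Hypothetical, simplified stand-in for XLNet's relative_positional_encodings.
    freq_seq = torch.arange(0, d_model, 2.0, dtype=torch.float)
    inv_freq = 1.0 / torch.pow(10000, freq_seq / d_model)
    # Hard-coded float32 here is the fp16 pitfall described in this PR.
    pos_seq = torch.arange(klen, -klen, -1.0, dtype=torch.float)
    sinusoid = torch.einsum("i,d->id", pos_seq, inv_freq)
    pos_emb = torch.cat([torch.sin(sinusoid), torch.cos(sinusoid)], dim=-1)
    # The fix: cast to the model's parameter dtype so apex fp16 training
    # does not hit a float/half mismatch downstream.
    return pos_emb.to(param_dtype)

# Usage sketch: under apex fp16, parameters are torch.float16, so the
# returned encodings now match the rest of the forward pass.
pos_emb = relative_positional_encodings_sketch(512, 768, torch.float16)
```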