While testing PR #671, we noticed that the GPT-2 model now exhausts all available memory on 8 GB GPUs (e.g., the GTX 1080) with both the eager-mode and X10 runtimes. It did not do this previously, so at some point the memory usage of this model increased to the point where it can no longer train on these GPUs.
We should investigate why this happened and see whether memory usage for this model can be brought back down.
Testing on a 16 GB GPU VM, I see that over roughly the last two-thirds of the epochs (10 epochs in total), a peak memory usage of 9187 MB occurs once per epoch, around the last training batch.
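For reference, here is a minimal sketch of how I track the peak while a training run is in progress, by polling `nvidia-smi` from a separate process (the polling interval is an arbitrary choice, not part of the repro, and this only captures peaks that last longer than one polling period):

```python
import subprocess
import time


def poll_peak_gpu_memory(interval_s: float = 1.0) -> None:
    """Poll nvidia-smi and report whenever GPU memory usage reaches a new peak."""
    peak_mib = 0
    while True:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=memory.used",
             "--format=csv,noheader,nounits"],
            capture_output=True, text=True, check=True,
        ).stdout
        # One line per GPU; take the largest value across devices.
        used_mib = max(int(line) for line in out.splitlines() if line.strip())
        if used_mib > peak_mib:
            peak_mib = used_mib
            print(f"new peak: {peak_mib} MiB")
        time.sleep(interval_s)


if __name__ == "__main__":
    poll_peak_gpu_memory()
```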