It would be quite helpful to know the approximate memory requirements and training times for the different configs. I could not find this information in the Mogrifier paper or anywhere in this repo.
I've started training the Mogrifier with one of the provided Wikitext-2 configs. Each "turn" took 4-5 minutes on a V100 GPU, so the 1000 turns specified in the config work out to roughly 75 hours, i.e. about three days. Training filled almost all of the V100's memory, so is 16GB required for the provided configs, or would 12GB suffice? (8GB did not; I tried.)
Are the figures above typical for all the datasets and provided configs, or do they differ significantly?
(or perhaps my setup was wrong and performance should be higher?)
Finally, is there any chance of the trained models being published?
On PTB, training took about 15 hours for the base LSTM. Three days for the Mogrifier on Wikitext-2 sounds about right.
Whether 12GB is enough depends on the model size. All I can say without trying is that 16GB is enough. If you have a GPU with 8GB, you can probably still train the model by adding a line like this to train.sh:
accum_batch_size=32
where 32 must be a divisor of the real batch size (which is 64, if I recall correctly).
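In case it helps, here is a minimal, framework-free sketch of what a setting like accum_batch_size does (this is not the repo's actual code; the toy model, array shapes, and learning rate are made up for illustration): the effective batch is processed in smaller chunks, their gradients are summed, and a single parameter update is applied, so peak memory scales with the chunk size rather than the full batch size.

```python
# Illustrative sketch of gradient accumulation (not the repo's implementation).
# An effective batch of `batch_size` examples is split into chunks of
# `accum_batch_size`; gradients are summed over the chunks and one update
# is applied. The 64/32 values mirror the numbers discussed above.
import numpy as np

batch_size = 64          # effective batch size (assumed, per the reply above)
accum_batch_size = 32    # must divide batch_size evenly
assert batch_size % accum_batch_size == 0

rng = np.random.default_rng(0)
x = rng.normal(size=(batch_size, 8))   # toy inputs
y = rng.normal(size=(batch_size, 1))   # toy targets
w = np.zeros((8, 1))                   # toy parameters
lr = 0.1

grad = np.zeros_like(w)
for start in range(0, batch_size, accum_batch_size):
    xb = x[start:start + accum_batch_size]   # only this chunk is "in memory"
    yb = y[start:start + accum_batch_size]
    err = xb @ w - yb                         # residual for a least-squares loss
    grad += xb.T @ err / batch_size           # accumulate, scaled by the full batch

w -= lr * grad                                # one update for the whole batch
```

Because the chunk gradients are simply summed (and scaled by the full batch size), the resulting update matches what a single pass over the whole batch would produce; the trade-off is extra time per update in exchange for a smaller memory footprint.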