Closed
Description
The ZeRO 3 example does not run. The main problem appears to be that pretrain_gpt2.py calls an InitContext function that does not actually exist. I tried several changes to get it running (changing the batch size, swapping the initialization function, and adjusting some of the inputs to the initialization function) but gave up after it raised the error "variable beta1 is referenced before assignment". I suspect that error comes from the optimizer setup.
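For context on the beta1 error, here is a minimal sketch of how this class of error usually arises in Python: a variable is assigned in only one branch and then read unconditionally. The function read_betas and its arguments are hypothetical and are not taken from the example scripts or the optimizer code; I have not confirmed that this is what actually happens there.

```python
# Hypothetical helper for illustration only; not code from the example or the library.
def read_betas(optimizer_name, params):
    if optimizer_name == "adam":
        beta1, beta2 = params.get("betas", (0.9, 0.999))
    # Any other optimizer name skips the assignment above,
    # so beta1 is unbound when the return statement runs.
    return beta1, beta2


def try_read_betas(optimizer_name, params):
    """Return the betas, or the name of the exception that was raised."""
    try:
        return read_betas(optimizer_name, params)
    except UnboundLocalError as exc:
        return type(exc).__name__
```

If the real cause is similar, the optimizer type or parameters in the config may not match what the code path expects, which would be worth checking before changing anything else.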