[Deepspeed] add many more models to the model zoo test #12695
Conversation
The documentation is not available anymore as the PR was closed or merged.
Very nice work, thanks a lot @stas00!
Nice work @stas00, have you tested Perceiver with DeepSpeed?
Would be glad to do that, @sameeravithana. In order to do that, what I need is a Trainer-based example script that I can test with. As you can see from this map: transformers/tests/deepspeed/test_model_zoo.py Lines 231 to 270 in 4a419d4
I have each model tested by one of the HF Trainer examples. Is there one that can be used with Perceiver?
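For readers unfamiliar with how that map is used, here is a minimal, hypothetical sketch of the idea (not the actual contents of tests/deepspeed/test_model_zoo.py): each tiny model is paired with one HF Trainer example script, and a parametrized test launches that script under the `deepspeed` launcher with a ZeRO config. The model names, script paths, and flags below are illustrative; real invocations need task-specific arguments as well.

```python
# Illustrative sketch only -- not the real test_model_zoo.py.
import itertools
import subprocess

import pytest

# Hypothetical model -> example-script mapping (names for illustration only).
model_to_script = {
    "hf-internal-testing/tiny-random-bert": "examples/pytorch/text-classification/run_glue.py",
    "hf-internal-testing/tiny-random-t5": "examples/pytorch/translation/run_translation.py",
}

zero_stages = ["zero2", "zero3"]


@pytest.mark.parametrize(
    "model_name, script, stage",
    [(m, s, z) for (m, s), z in itertools.product(model_to_script.items(), zero_stages)],
)
def test_model_with_deepspeed(model_name, script, stage, tmp_path):
    # Launch the example script under the deepspeed launcher with the matching ZeRO config.
    # (A real invocation also needs task-specific args such as --task_name or data files.)
    cmd = [
        "deepspeed", "--num_gpus", "1", script,
        "--model_name_or_path", model_name,
        "--output_dir", str(tmp_path),
        "--do_train", "--max_steps", "2",
        "--deepspeed", f"tests/deepspeed/ds_config_{stage}.json",
    ]
    # The real harness captures and inspects the output; here we only require exit code 0.
    subprocess.run(cmd, check=True)
```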
…2695)
* model zoo take 2
* add deberta
* new param for zero2
* doc update
* doc update
* add layoutlm
* bump deepspeed
* add deberta-v2, funnel, longformer
* new models
* style
* add t5_v1
* update TAPAS status
* reorg problematic models
* move doc to another PR
* style
* fix checkpoint check test
* making progress on more models running
* cleanup
* new version
* cleanup
This PR continues the work of figuring out how to make various models work with Deepspeed (a lot of the fixes happen on the Deepspeed side). Most models just work out of the box, so there are no fixes to add here; the main purpose of this PR is to test as many models as possible.
Thanks to @LysandreJik for creating the tiny test models for many of the HF models!
Some models I couldn't cover for a variety of reasons unrelated to Deepspeed (missing tokenizers, missing tiny models, or missing example scripts to exercise them), but their status is documented in the script. Over time more will be tested.
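For context, here is a minimal sketch of what "testing a model with Deepspeed via an HF Trainer example" boils down to, passing a ZeRO stage 2 config to the Trainer. The config values, model name, and toy dataset are placeholders, not the ones used in the test suite, and in practice this is launched via the `deepspeed` launcher.

```python
# Illustrative sketch: a tiny model trained through the HF Trainer with a ZeRO-2
# Deepspeed config passed as a dict (placeholder values, not the test-suite config).
# Typically run under the deepspeed launcher, e.g.: deepspeed this_script.py
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

ds_config = {
    "train_micro_batch_size_per_gpu": "auto",  # "auto" values are filled in by the Trainer
    "gradient_accumulation_steps": "auto",
    "zero_optimization": {"stage": 2},
}

model_name = "hf-internal-testing/tiny-random-bert"  # one of the tiny test models
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# A toy dataset, just enough to exercise a couple of training steps.
texts = ["deepspeed test sentence"] * 8
train_ds = Dataset.from_dict(dict(tokenizer(texts, truncation=True))).add_column("labels", [0] * 8)

args = TrainingArguments(
    output_dir="out",
    per_device_train_batch_size=2,
    max_steps=2,
    deepspeed=ds_config,  # the Trainer accepts a config dict or a path to a json file
)

Trainer(model=model, args=args, train_dataset=train_ds).train()
```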
Blocking events (all resolved):
- #13665 (fixes positional embeddings: m2m_100 and others)