Can't load t5-11b from pre-trained #6517

Closed
saareliad opened this issue Aug 16, 2020 · 3 comments

Comments

@saareliad

Environment info

  • transformers version: 3.0.2
  • Platform:
  • Python version: 3.8.2
  • PyTorch version: 1.6

Who can help

T5: @patrickvonplaten

Information

The model I am using: T5

To reproduce

Steps to reproduce the behavior:

import transformers
transformers.T5ForConditionalGeneration.from_pretrained("t5-11b")


OSError: Can't load weights for 't5-11b'. Make sure that:

- 't5-11b' is a correct model identifier listed on 'https://huggingface.co/models'

- or 't5-11b' is the correct path to a directory containing a file named one of pytorch_model.bin, tf_model.h5, model.ckpt.

Expected behavior

The model should load without raising an error.

@patrickvonplaten
Contributor

patrickvonplaten commented Aug 17, 2020

Hey @saareliad,
can you try:

t5 = transformers.T5ForConditionalGeneration.from_pretrained('t5-11b', use_cdn=False)

Also, see: #5423

But the model cannot really be run before we take a closer look at: #3578.
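
For anyone hitting the same OSError, the suggestion above can be wrapped in a small retry helper. This is only a sketch: `load_with_cdn_fallback` is a hypothetical name, not part of transformers, and it assumes the installed version (3.0.2 in this report) accepts the `use_cdn` keyword as shown in the reply. The loader is passed in as an argument so the helper itself does not import transformers.

```python
def load_with_cdn_fallback(loader, model_id, **kwargs):
    """Call loader(model_id, **kwargs); on OSError, retry once with use_cdn=False.

    `loader` is any callable with a from_pretrained-style signature.
    This helper is hypothetical and not part of the transformers library.
    """
    try:
        return loader(model_id, **kwargs)
    except OSError:
        # Assumption: the loader accepts `use_cdn`, as suggested in this thread.
        retry_kwargs = dict(kwargs, use_cdn=False)
        return loader(model_id, **retry_kwargs)


# Intended use (not executed here, since it downloads a very large checkpoint):
#   import transformers
#   model = load_with_cdn_fallback(
#       transformers.T5ForConditionalGeneration.from_pretrained, "t5-11b"
#   )
```

The retry only addresses the download failure. It does not change the point made above that the model cannot really be run until #3578 is looked at.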

@julien-c
Member

julien-c commented Aug 17, 2020

@patrickvonplaten mind adding a big disclaimer to the model card for this particular checkpoint? About what you just said (CDN limitation + model parallelism)
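
As a rough illustration of why model parallelism comes up for this checkpoint, here is a back-of-the-envelope estimate of the memory needed for the weights alone. The parameter count is an assumption (about 11 billion, taken from the checkpoint name), and the estimate ignores activations, gradients, and optimizer state.

```python
def weight_memory_gb(num_params, bytes_per_param):
    """Memory for the raw weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


# Assumption: roughly 11 billion parameters, inferred from the name "t5-11b".
T5_11B_PARAMS = 11e9

fp32_gb = weight_memory_gb(T5_11B_PARAMS, 4)  # 4 bytes per float32 weight
fp16_gb = weight_memory_gb(T5_11B_PARAMS, 2)  # 2 bytes per float16 weight

print(f"fp32 weights: ~{fp32_gb:.0f} GB, fp16 weights: ~{fp16_gb:.0f} GB")
```

Under these assumptions the fp32 weights alone come to roughly 44 GB, more than a single typical GPU holds, which is why the weights would need to be split across devices.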

@saareliad
Author

Thanks @patrickvonplaten,
Our work successfully adds several types of model parallelism, trains T5 and several other large transformers, and has been integrated with HF for quite a while.

We will open-source it soon :)
