
Loading mBART Large 50 MMT (many-to-many) is slow #10364

Closed
2 of 4 tasks
xhluca opened this issue Feb 24, 2021 · 5 comments

xhluca (Contributor) commented Feb 24, 2021

Environment info

I'm installing the library directly from master and running it in a Kaggle notebook.

  • transformers version: 4.4.0.dev0
  • Platform: Linux-5.4.89+-x86_64-with-debian-buster-sid
  • Python version: 3.7.9
  • PyTorch version (GPU?): 1.7.0 (False)
  • Tensorflow version (GPU?): 2.4.1 (False)
  • Using GPU in script?: No
  • Using distributed or parallel set-up in script?: No

Who can help

Information

Model I am using (Bert, XLNet ...): mBART-Large 50 MMT (many-to-many)

The problem arises when using:

  • the official example scripts: (give details below)
  • my own modified scripts: (give details below)

After caching the model weights, loading the model with from_pretrained is significantly slower than loading it with torch.load.

The task I am working on is:

  • an official GLUE/SQUaD task: (give the name)
  • my own task or dataset: (give details below)

Machine Translation

To reproduce

Here's the Kaggle notebook reproducing the issue. Here's a Colab notebook showing essentially the same thing.

Steps to reproduce the behavior:

  1. Load the model with model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
  2. Save the model with model.save_pretrained('./my-model')
  3. Save the model with torch.save(model, 'model.pt')
  4. Reload and time with MBartForConditionalGeneration.from_pretrained('./my-model')
  5. Load with torch.load('model.pt')

The steps above can be reproduced inside a Kaggle notebook:

import torch
from transformers import MBartForConditionalGeneration

model = MBartForConditionalGeneration.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")
model.save_pretrained('./my-model/')
torch.save(model, 'model.pt')
%time model = MBartForConditionalGeneration.from_pretrained("./my-model/")
%time torch_model = torch.load('model.pt')

Note that loading with from_pretrained (step 4) is significantly slower than torch.load (step 5): the former takes over a minute, while the latter takes just a few seconds (or around 20 seconds if the file hasn't previously been loaded into memory; see notebook).
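For timing outside IPython, the same comparison can be scripted with time.perf_counter. A minimal sketch, assuming the ./my-model/ directory and model.pt file produced by the steps above already exist:

import time

import torch
from transformers import MBartForConditionalGeneration

# Time the from_pretrained path (reads config + weights from ./my-model/).
t0 = time.perf_counter()
model = MBartForConditionalGeneration.from_pretrained("./my-model/")
print(f"from_pretrained: {time.perf_counter() - t0:.1f}s")

# Time the pickled-module path (deserializes the whole module from model.pt).
t0 = time.perf_counter()
torch_model = torch.load("model.pt")
print(f"torch.load: {time.perf_counter() - t0:.1f}s")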

Expected behavior

The model should take less than a minute to load when its weights have already been cached (see step 1).
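As a side note, if torch.load is used as a stopgap, it is worth confirming that both load paths yield identical weights. A quick sanity check, reusing the model and torch_model objects (and the torch import) from the snippet above:

# Compare the parameter tensors from the two load paths key by key.
sd_slow = model.state_dict()
sd_fast = torch_model.state_dict()
assert sd_slow.keys() == sd_fast.keys()
assert all(torch.equal(sd_slow[k], sd_fast[k]) for k in sd_slow)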

xhluca changed the title from "Loading mBART Large 50 MMT (many-to-many) is extremely slow" to "Loading mBART Large 50 MMT (many-to-many) is slow" on Feb 24, 2021
LysandreJik (Member) commented:

Related: #9205

xhluca (Contributor, Author) commented Feb 24, 2021

Thanks. I'll rerun the benchmarks once Patrick makes the changes.

github-actions bot commented:

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

xhluca (Contributor, Author) commented Apr 14, 2021

Has there been an update to the #9205 timeline?

github-actions bot commented May 9, 2021

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
