
Inference for TFMarianMTModel (en to Romance language translation) is slow and inaccurate #18149

Closed
danielenricocahall opened this issue Jul 15, 2022 · 11 comments · Fixed by #21368
Labels: bug, WIP

Comments

danielenricocahall commented Jul 15, 2022

System Info

macOS Monterey 12.2.1

transformers==4.20.1
tensorflow==2.9.1

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

from transformers import TFMarianMTModel, MarianTokenizer
model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = TFMarianMTModel.from_pretrained(model_name)
text_in = ['>>fr<< hello']
batch = tokenizer(text_in, return_tensors='tf', padding=True)
translated = model.generate(**batch)
# decode the generated tokens (this step produces the output shown below)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))

Output:

- Qu'est-ce qu'il y a, là-bas, là-bas, là---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Expected behavior

I would expect similar performance to the PyTorch model.

Inference requires about 120s on my machine and outputs an incorrect translation. In contrast, the PyTorch model (replacing TFMarianMTModel with MarianMTModel and changing return_tensors to 'pt' in the code snippet) returns the correct translation ("Bonjour") and inference requires about 6s on my machine.
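
For comparison, a minimal PyTorch version of the same snippet (differing only in the model class and return_tensors, as described above):

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)
batch = tokenizer(['>>fr<< hello'], return_tensors='pt', padding=True)
translated = model.generate(**batch)
# per the report, this decodes to the correct translation ("Bonjour")
print(tokenizer.batch_decode(translated, skip_special_tokens=True))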

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@LysandreJik (Member)

Sorry for missing this! Could you take a look at this, @gante, @Rocketknight1, @ydshieh?

LysandreJik reopened this Aug 24, 2022

ydshieh commented Aug 24, 2022

Let me take a look at the quality issue, and possibly @gante or @Rocketknight1 can take the speed issue; let's discuss it :-)

ydshieh self-assigned this Aug 24, 2022

ydshieh commented Aug 24, 2022

Actually, the performance issue comes from the quality issue: the TF version doesn't stop generating until it reaches 512 tokens.

[[65000    25  2092     7   179    15   276   185     7   227    32     9
      2  2538    15  5716     2  2538    15  5716     2  2538    15    15
     15    15    15    15    15    15    15    15    15    15    15    15 ............
     0]], shape=(1, 512), dtype=int32)
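
A quick way to see where the first EOS appears in that sequence (a small sketch, assuming translated and tokenizer from the reproduction snippet above, and eos_token_id = 0 for this checkpoint):

# Sketch: locate the first EOS token in the TF generate() output.
ids = translated[0].numpy().tolist()
eos_id = tokenizer.eos_token_id  # 0 for Marian checkpoints, if I read the config correctly
first_eos = ids.index(eos_id) if eos_id in ids else -1
print(f"sequence length: {len(ids)}, first EOS at index: {first_eos}")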


ydshieh commented Aug 24, 2022

I believe the current PT and TF checkpoints for "Helsinki-NLP/opus-mt-en-ROMANCE" don't contain the same weights: if I change

model = TFMarianMTModel.from_pretrained(model_name)

to

model = TFMarianMTModel.from_pretrained(model_name, from_pt=True)

I could get

[[65000 21959     3     0 65000 65000 65 ....] (still `512` tokens)

while the PyTorch version gives

tensor([[65000, 21959,     3,     0]])

So:

  • we probably need to check which checkpoint is the correct one and upload a corrected checkpoint (a rough way to compare the two is sketched below)
  • investigate why TFMarianMTModel doesn't stop the generation earlier.
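
One rough way to locate diverging weights (a sketch, under the assumption that two instances of the same architecture enumerate their weights in the same order):

import numpy as np
from transformers import TFMarianMTModel

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tf_native = TFMarianMTModel.from_pretrained(model_name)                  # native TF checkpoint
tf_from_pt = TFMarianMTModel.from_pretrained(model_name, from_pt=True)   # cross-loaded PT weights

for w_a, w_b in zip(tf_native.weights, tf_from_pt.weights):
    diff = np.abs(w_a.numpy() - w_b.numpy()).max()
    if diff > 1e-5:  # arbitrary tolerance for this sketch
        print(f"{w_a.name}: max abs diff {diff:.3e}")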


ydshieh commented Aug 24, 2022

After double-checking (see the code below, where I use from_pt=True), I believe the current PT checkpoint is the correct one, but the TF checkpoint is not.

@gante Would you like to have a look too, upload a new TF checkpoint, and see why TFMarianMTModel doesn't stop generation early the way MarianMTModel does?

from transformers import MarianMTModel, MarianTokenizer, TFMarianMTModel

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)

pt_model = MarianMTModel.from_pretrained(model_name)
# Cross-load the PT weights rather than using the native TF checkpoint.
tf_model = TFMarianMTModel.from_pretrained(model_name, from_pt=True)

for text_in in (['>>fr<< Hello, I am a student.'], ['>>it<< I love dogs and cats.']):
    for model, return_tensors in ((pt_model, 'pt'), (tf_model, 'tf')):
        batch = tokenizer(text_in, return_tensors=return_tensors, padding=True)
        translated = model.generate(**batch)
        print(translated)
        print(tokenizer.batch_decode(translated, skip_special_tokens=True))


gante commented Aug 24, 2022

Hi there @ydshieh @danielenricocahall 👋

None of the Marian models can be successfully converted to TF -- they all fail when validating the hidden layers and outputs of the models. This is a shame since there are a ton of Marian models for translation :(

This means there is something wrong with either the model architecture or the weight cross-loading. I haven't looked into it beyond noticing the issue when attempting to convert the weights from Helsinki-NLP.
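
Roughly, that validation compares the cross-loaded TF model's outputs against the PT model's on the same input and rejects the conversion if they diverge. A minimal sketch of the idea (the input and tolerance here are assumptions, not the actual tool's):

import numpy as np
import tensorflow as tf
import torch
from transformers import MarianMTModel, MarianTokenizer, TFMarianMTModel

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)
pt_model = MarianMTModel.from_pretrained(model_name)
tf_model = TFMarianMTModel.from_pretrained(model_name, from_pt=True)

batch = tokenizer(['>>fr<< hello'], return_tensors='np', padding=True)
dec = [[pt_model.config.decoder_start_token_id]]

pt_logits = pt_model(input_ids=torch.tensor(batch['input_ids']),
                     decoder_input_ids=torch.tensor(dec)).logits.detach().numpy()
tf_logits = tf_model(input_ids=tf.constant(batch['input_ids']),
                     decoder_input_ids=tf.constant(dec)).logits.numpy()

# the conversion is rejected when this difference exceeds the tool's threshold
print(np.abs(pt_logits - tf_logits).max())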

@danielenricocahall (Author)

Thank you for looking into it @ydshieh and @gante !!! This is great information.


gante commented Sep 5, 2022

@danielenricocahall a fix was merged and new weights were pushed -- if you run from main, the translations should be much better now 🙌

huggingface deleted a comment from github-actions bot Sep 29, 2022

ydshieh commented Sep 29, 2022

cc @gante

We still have the generation issue:

from transformers import MarianMTModel, MarianTokenizer, TFMarianMTModel

model_name = "Helsinki-NLP/opus-mt-en-ROMANCE"
tokenizer = MarianTokenizer.from_pretrained(model_name)
text_in = ['>>fr<< hello']

# PT generates a few tokens then stops early -> very fast 
model = MarianMTModel.from_pretrained(model_name)
batch = tokenizer(text_in, return_tensors='pt', padding=True)
translated = model.generate(**batch)
o = tokenizer.batch_decode(translated, skip_special_tokens=True)

print(translated)
print(o)

# TF generates 512 tokens, although the decoded version gives the same result as PT -> very slow
model = TFMarianMTModel.from_pretrained(model_name, from_pt=False)
batch = tokenizer(text_in, return_tensors='tf', padding=True)
translated = model.generate(**batch)
o = tokenizer.batch_decode(translated, skip_special_tokens=True)

print(translated)
print(o)
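
Until that is fixed, one possible workaround for the slowness (a suggestion added here for illustration, not the fix that eventually landed in #21368) is to cap the generation length explicitly:

# Workaround sketch: bound TF generation so it cannot run to 512 tokens.
translated = model.generate(**batch, max_length=64)
print(tokenizer.batch_decode(translated, skip_special_tokens=True))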

huggingface deleted a comment from github-actions bot Oct 25, 2022
ydshieh added the WIP label Oct 25, 2022
@jamie0725

@ydshieh Hi, I am experiencing the same issue. I expected the TF version to be faster than the PT version.
