
DeBERTa gives completely wrong and random output, tested on multiple machines and multiple versions of transformers #16456

Closed
Oxi84 opened this issue Mar 28, 2022 · 3 comments · Fixed by #22105

Comments

@Oxi84

Oxi84 commented Mar 28, 2022

I do not understand why you would add this model here and write lengthy documentation, yet it does not work at all.

Its prediction is: The capital of France is plunge

I have tried on multiple machines and multiple versions of transformers, and the results are just random.

```python
from transformers import pipeline
unmasker = pipeline('fill-mask', model='deberta-base')
the_out = unmasker("The capital of France is [MASK].")
print("the_out", the_out)
```

As you can see, the DeBERTa result is completely wrong; there must be some big error in porting it to transformers.

```
the_out [{'score': 0.001861382625065744, 'token': 18929, 'token_str': 'ABC', 'sequence': 'The capital of France isABC.'},
 {'score': 0.0012871784856542945, 'token': 15804, 'token_str': ' plunge', 'sequence': 'The capital of France is plunge.'},
 {'score': 0.001228992477990687, 'token': 47366, 'token_str': 'amaru', 'sequence': 'The capital of France isamaru.'},
 {'score': 0.0010126306442543864, 'token': 46703, 'token_str': 'bians', 'sequence': 'The capital of France isbians.'},
 {'score': 0.0008897537481971085, 'token': 43107, 'token_str': 'insured', 'sequence': 'The capital of France isinsured.'}]
```

```python
from transformers import pipeline
unmasker = pipeline('fill-mask', model='bert-base-uncased')
the_out = unmasker("The capital of France is [MASK].")
print("the_out", the_out)
```

The BERT result is good:

```
the_out [{'score': 0.41678911447525024, 'token': 3000, 'token_str': 'paris', 'sequence': 'the capital of france is paris.'},
 {'score': 0.07141649723052979, 'token': 22479, 'token_str': 'lille', 'sequence': 'the capital of france is lille.'},
 {'score': 0.06339272856712341, 'token': 10241, 'token_str': 'lyon', 'sequence': 'the capital of france is lyon.'},
 {'score': 0.04444753751158714, 'token': 16766, 'token_str': 'marseille', 'sequence': 'the capital of france is marseille.'},
 {'score': 0.030297178775072098, 'token': 7562, 'token_str': 'tours', 'sequence': 'the capital of france is tours.'}]
```

@LysandreJik

@nbroad1881
Contributor

I don't think the pre-trained masked-LM head weights are available for deberta-base, so the head is randomly initialized and the output looks completely random. See here for more info: #15216
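One way to see this without relying on prediction quality is to check whether the checkpoint's state dict contains any masked-LM head weights at all. The sketch below is an illustration, not the actual fix from #22105: `has_mlm_head` is a hypothetical helper, the key prefixes follow the usual transformers naming convention for BERT/DeBERTa-style MLM heads, and the key lists are dummies standing in for a real checkpoint (with a real one you would load `AutoModelForMaskedLM.from_pretrained(...)` and watch the "newly initialized" weights warning).

```python
# Sketch: detect whether a checkpoint's weight names include a
# masked-LM head. Prefixes are assumptions based on the common
# transformers naming ("cls.predictions..."), not taken from this issue.

def has_mlm_head(state_dict_keys, prefixes=("cls.predictions", "lm_predictions")):
    """Return True if any weight name looks like part of an MLM head."""
    return any(k.startswith(p) for k in state_dict_keys for p in prefixes)

# Dummy key lists standing in for real checkpoints:
encoder_only = [
    "deberta.embeddings.word_embeddings.weight",
    "deberta.encoder.layer.0.attention.self.query_proj.weight",
]
with_head = encoder_only + [
    "cls.predictions.transform.dense.weight",
    "cls.predictions.decoder.weight",
]

print(has_mlm_head(encoder_only))  # False -> head would be randomly initialized
print(has_mlm_head(with_head))     # True
```

If the encoder-only case applies, the fill-mask pipeline runs without error but scores come from an untrained head, which matches the near-uniform scores (all around 0.001) in the output above.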

@Oxi84
Author

Oxi84 commented Mar 31, 2022

I see, but it does not work for base-v3 either. I will check other models and see.

@github-actions

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.
