I do not understand why you would add this model here and write lengthy documentation for it when it does not work at all.
Its prediction is: The capital of France is plunge
I have tried this on multiple machines and multiple versions of transformers, and the results are just random.
from transformers import pipeline
unmasker = pipeline('fill-mask', model='deberta-base')
the_out = unmasker("The capital of France is [MASK].")
print("the_out",the_out)
As you can see, the deberta result is completely wrong; there must be some big error in porting it to transformers.
the_out [{'score': 0.001861382625065744, 'token': 18929, 'token_str': 'ABC', 'sequence': 'The capital of France isABC.'}, {'score': 0.0012871784856542945, 'token': 15804, 'token_str': ' plunge', 'sequence': 'The capital of France is plunge.'}, {'score': 0.001228992477990687, 'token': 47366, 'token_str': 'amaru', 'sequence': 'The capital of France isamaru.'}, {'score': 0.0010126306442543864, 'token': 46703, 'token_str': 'bians', 'sequence': 'The capital of France isbians.'}, {'score': 0.0008897537481971085, 'token': 43107, 'token_str': 'insured', 'sequence': 'The capital of France isinsured.'}]
from transformers import pipeline
unmasker = pipeline('fill-mask', model='bert-base-uncased')
the_out = unmasker("The capital of France is [MASK].")
print("the_out",the_out)
The bert result is good:
the_out [{'score': 0.41678911447525024, 'token': 3000, 'token_str': 'paris', 'sequence': 'the capital of france is paris.'}, {'score': 0.07141649723052979, 'token': 22479, 'token_str': 'lille', 'sequence': 'the capital of france is lille.'}, {'score': 0.06339272856712341, 'token': 10241, 'token_str': 'lyon', 'sequence': 'the capital of france is lyon.'}, {'score': 0.04444753751158714, 'token': 16766, 'token_str': 'marseille', 'sequence': 'the capital of france is marseille.'}, {'score': 0.030297178775072098, 'token': 7562, 'token_str': 'tours', 'sequence': 'the capital of france is tours.'}]
I don't think the pre-trained language model weights are available for deberta-base, so the output looks completely random. See here for more info: #15216
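You can see this directly when loading the checkpoint into a masked-LM head. A minimal sketch, assuming the user's 'deberta-base' refers to the microsoft/deberta-base checkpoint on the Hub: if the LM head weights are missing from the checkpoint, from_pretrained logs a warning that those weights are newly initialized (i.e. random), which would explain the nonsensical fill-mask predictions above.

from transformers import AutoModelForMaskedLM
# Assumes the microsoft/deberta-base Hub checkpoint. Watch the log output:
# a warning listing weights that were "newly initialized" means the
# masked-LM head is random rather than pre-trained.
model = AutoModelForMaskedLM.from_pretrained("microsoft/deberta-base")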
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@LysandreJik