TFLongformerForMaskedMLM example throws ValueError "shapes are incompatible" #11488
Comments
Hi! The model is working fine here, but the problem is that "[MASK]" and "Paris" are being tokenized as different numbers of tokens, which is where your shape error is coming from. Can you link me to the exact script you got this example from?
It's under this headline, here's the permalink: https://huggingface.co/transformers/model_doc/longformer.html#tflongformerformaskedlm
ah so it's probably just updating
I checked and you're absolutely right, the example as written does not work. I did some digging and the problem is that the mask token for this model is actually '<mask>' and not '[MASK]'. Therefore, 'Paris' actually does get correctly tokenized as one token, but '[MASK]' does not get recognized as a special token and is 'spelled out' as three separate tokens instead. (You can see what splits the tokenizer chose by inspecting the tokens it produces.) The example should work if you replace '[MASK]' with '<mask>'. Can you try that and let me know? If it works, we can make a PR to fix this example!
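To make the length mismatch concrete, here is a minimal sketch in plain Python. `toy_tokenize` and `lengths_match` are hypothetical helpers written for this illustration; the toy tokenizer is only a stand-in that treats '<mask>' as a single special token and splits everything else into words and punctuation. It is not the real Longformer tokenizer, but it reproduces the behaviour described above.

```python
import re


def toy_tokenize(text, special_tokens=("<mask>",)):
    """Toy stand-in for a tokenizer (NOT the Longformer tokenizer).

    Known special tokens are kept whole; everything else is split
    into runs of word characters and single punctuation marks.
    """
    # Special tokens come first in the alternation so they win over
    # the generic punctuation rule.
    specials = "|".join(re.escape(tok) for tok in special_tokens)
    return re.findall(rf"{specials}|\w+|[^\w\s]", text)


def lengths_match(tokenize, masked_text, target_text):
    """Return True if both texts produce the same number of tokens."""
    return len(tokenize(masked_text)) == len(tokenize(target_text))


target = "The capital of France is Paris."
wrong = "The capital of France is [MASK]."
right = "The capital of France is <mask>."

print(toy_tokenize(wrong))   # '[MASK]' is spelled out as '[', 'MASK', ']'
print(toy_tokenize(right))   # '<mask>' survives as one token
print(lengths_match(toy_tokenize, wrong, target))
print(lengths_match(toy_tokenize, right, target))
```

With transformers installed, the same check should work on the real tokenizer by passing its `tokenize` method as the first argument of `lengths_match`, assuming the tokenizer is loaded from the `allenai/longformer-base-4096` checkpoint used in the docs example.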
So now the following example:
yields:
So at least we're doing something right, but there's still this weird Ġ character
Ah, yes! The Ġ character is used to indicate word breaks. If you want to see the pure string output without it, try decoding the tokens back to a string with the tokenizer. Other than that, though, your example looks good! I talked with people on the team and we can't use it directly, annoyingly: the examples are all built from the same template, so we can't easily change just one. Still, we can pass some arguments to make sure our example works for Longformer in future. The relevant bit is here. If you'd like to try it yourself, you can submit a PR to add the argument
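For anyone curious what the Ġ marker amounts to, here is a small sketch of turning such tokens back into plain text. `tokens_to_text` is a hypothetical helper and the token list is made up for illustration. It only handles the space marker, not full byte-level decoding, so in practice the tokenizer's own `decode` or `convert_tokens_to_string` is the safer route.

```python
# In byte-level BPE vocabularies (GPT-2 / RoBERTa style), a leading space
# is stored as the visible character 'Ġ' (U+0120).
WORD_BREAK = "\u0120"


def tokens_to_text(tokens):
    """Join tokens and turn each word-break marker back into a space.

    Simplified: real decoding maps every byte back, not just the space.
    """
    return "".join(tokens).replace(WORD_BREAK, " ")


# Illustrative tokens, not actual model output.
tokens = ["The", "Ġcapital", "Ġof", "ĠFrance", "Ġis", "ĠParis", "."]
print(tokens_to_text(tokens))  # The capital of France is Paris.
```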
@Rocketknight1 I added a PR (#11559)
Closing this because we have the PR now!
An official example of the TFLongFormerX page does not work.
Environment info
transformers version: 2.4.1
Who can help
@patrickvonplaten (Longformer)
@Rocketknight1 (tensorflow)
@sgugger (maintained examples)
Information
Model I am using: Longformer
The problem arises when using:
To reproduce
Steps to reproduce the behavior:
docker run -it --rm python:3.8 bash (no gpus attached)
python3 -m pip install pip --upgrade
python3 -m pip install transformers tensorflow
python3 -> launch interactive shell
This throws the following error: