Embedding index getting out of range while running camemebert model #4153
Comments
I am running into the same error on my own script. Interestingly, it only appears on CPU... Did you find a solution?
No. I want to build a French Q&A pipeline. Surprisingly, with the Hugging Face pipeline everything works great: I can plug the code into a local server and make requests against it. But when I try to use the same code in a Docker environment to ship it, it fails with this error (only in French with CamemBERT; classic BERT works fine). I get the error locally as well if I write my own inference instead of using the Hugging Face pipeline (as described above).
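For reference, a minimal sketch of what such a pipeline setup looks like; the checkpoint name and the question/context strings are placeholder assumptions, not taken from the thread (and with the base checkpoint the QA head is untrained, so the answers themselves would be meaningless):

from transformers import pipeline

# Hypothetical French QA pipeline; "camembert-base" is only the base
# checkpoint from this issue, used here to show the call shape.
nlp = pipeline("question-answering",
               model="camembert-base",
               tokenizer="camembert-base")

result = nlp(question="Où vis-tu ?", context="Je vis à Paris.")
print(result["answer"], result["score"])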
I can confirm it's working locally on GPU (even in Docker) but still failing on CPU.
I actually figured out my error. I was adding special tokens to the tokenizer (like begin-of-sequence) but did not resize the model's token embeddings via:
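The snippet itself is missing from the transcript; presumably it was the standard resize call below, where model and tokenizer stand for the commenter's own objects and the bos_token choice is an assumption:

# After adding special tokens, the embedding matrix must be resized to
# match the enlarged vocabulary, otherwise lookups go out of range.
tokenizer.add_special_tokens({"bos_token": "<s>"})
model.resize_token_embeddings(len(tokenizer))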
It's patched now, please install from source and there should be no error anymore! |
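(For reference, installing from source is typically done with pip install git+https://github.com/huggingface/transformers; the thread does not give the exact command.)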
Hi @LysandreJik, I'm conscious that I used it with a non-QA model, but it was to try the base model supported by Hugging Face. I will install the latest version and try it. Thanks a lot for the fast support!
I tried your fix but it led to key errors:
Could you provide a reproducible script? I can't reproduce. |
My problem here was surely linked with #4674. Everything seems to work now, thanks a lot!
🐛 Bug
Information
Model I am using (Bert, XLNet ...):
Camembert
Language I am using the model on (English, Chinese ...):
French
The problem arises when using:
The tasks I am working on are:
To reproduce
Steps to reproduce the behavior:
Initialisation:
from transformers import CamembertModel, CamembertTokenizer

bert = CamembertModel.from_pretrained("camembert-base")
bert_tok = CamembertTokenizer.from_pretrained("camembert-base")
Inference: as in https://huggingface.co/transformers/usage.html#question-answering (a sketch of the failing call is given below)
It works if I remove the context argument (the text_pair argument), but I need it for question answering; with other models it leads to the same error with pipelines.
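A minimal sketch of the failing manual inference, adapted from the linked usage page; the question/context strings are placeholders:

import torch
from transformers import CamembertModel, CamembertTokenizer

bert = CamembertModel.from_pretrained("camembert-base")
bert_tok = CamembertTokenizer.from_pretrained("camembert-base")

question = "Où vis-tu ?"
context = "Je vis à Paris."

# Encoding the question together with the context (passed as text_pair)
# is what triggers the embedding "index out of range" error here.
inputs = bert_tok.encode_plus(question, context, return_tensors="pt")
with torch.no_grad():
    outputs = bert(**inputs)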
Expected behavior
Run inference without any error
Environment info
transformers version: 2.8.0