
"Wrong shape for input_ids" error when running basic example on Windows #452

Open
danielplatt opened this issue Sep 25, 2020 · 10 comments

@danielplatt

danielplatt commented Sep 25, 2020

When running the following code:

from sentence_transformers import SentenceTransformer

model = SentenceTransformer('bert-base-nli-mean-tokens')
sentences = ['This framework generates embeddings for each input sentence',
             'Sentences are passed as a list of string.',
             'The quick brown fox jumps over the lazy dog.']
sentence_embeddings = model.encode(sentences)

I get the following error:

ValueError: Wrong shape for input_ids (shape torch.Size([39])) or attention_mask (shape torch.Size([39]))

Traceback (most recent call last):
  File "X", line 5 in <module>
    sentence_embeddings = model.encode(sentences)
  File "X\SentenceTransformer.py", line 158, in encode
    out_features = self.forward(features)
  File "X\lib\site-packages\torch\nn\modules\container.py", line 117, in forward
    input = module(input)
  File "X\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "X\sentence_transformers\models\BERT.py", line 33, in forward
    output_states = self.bert(**features)
  File "X\lib\site-packages\torch\nn\modules\module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "X\lib\site-packages\transformers\modeling_bert.py", line 804, in forward
    extended_attention_mask: torch.Tensor = self.get_extended_attention_mask(attention_mask, input_shape, device)
  File "X\lib\site-packages\transformers\modeling_utils.py", line 262, in get_extended_attention_mask
    input_shape, attention_mask.shape
ValueError: Wrong shape for input_ids (shape torch.Size([39])) or attention_mask (shape torch.Size([39]))

(Note: I replaced the file paths with X, so these are not the real paths.)

I am using Windows 10, Anaconda, PyTorch 1.6, transformers 3.1.0, sentence-transformers 0.3.6. What should I do to run the above example?
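
In case it helps, here is a minimal sketch to double-check the installed versions (it assumes all three packages expose __version__, which they should at these versions):

import torch
import transformers
import sentence_transformers

# Print the installed versions to confirm they match the ones reported above
print('torch:', torch.__version__)
print('transformers:', transformers.__version__)
print('sentence-transformers:', sentence_transformers.__version__)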

@nreimers
Member

Sadly, I don't know why this happens. Everything looks correct.

@danielplatt
Author

If I use the model 'distilbert-base-nli-mean-tokens' instead, I get the error IndexError: Dimension out of range (expected to be in range of [-1, 0], but got 1). The same happens on both Windows and Linux. Is it possible that it's a hardware issue?

@nreimers
Member

I have never experienced that issue.

Maybe you can try some other release versions of sentence-transformers / transformers / pytorch and see if that solves the issue?

@PhilipMay
Contributor

I think I had that error yesterday. The reason was that my transformers version was too new. Try reinstalling sentence_transformers so that transformers gets downgraded again: pip install sentence_transformers -U

@danielplatt
Author

> I think I had that error yesterday. The reason was that my transformers version was too new. Try reinstalling sentence_transformers so that transformers gets downgraded again: pip install sentence_transformers -U

Thanks for the message. I tried it, but it seems sbert.net is not available right now, so I cannot download the models from there. But if I clone the repo and take the models from there, the error stays the same.

@orenpapers

@PhilipMay @danielplatt @nreimers Is there any solution to this error? I am also getting it.

@abmitra84

abmitra84 commented Oct 19, 2020

I have the same problem on Linux. It is an existing problem, discussed in this thread on the transformers side (cisnlp/simalign#10 (comment)). @nreimers, kindly have a look.

PyTorch 1.6
sentence_transformers and transformers were both at the latest versions

@D3nn3

D3nn3 commented Dec 15, 2020

I have the same problem. Downgrading the transformers package to 3.0.2, as mentioned in this issue (huggingface/transformers#5907), might help, but not when training models with PyTorch/XLA, since that transformers version had a bug when doing inference on CPU (huggingface/transformers#5636).

So it seems that there is currently no way to train models with PyTorch/XLA and use sentence-transformers afterwards. @nreimers did you already have time to look into the problem?

@NikolaiGulatz

Is there any solution for this?

@D3nn3

D3nn3 commented Dec 16, 2020

> I have the same problem on Linux. It is an existing problem, discussed in this thread on the transformers side (cisnlp/simalign#10 (comment)). @nreimers, kindly have a look.
>
> PyTorch 1.6
> sentence_transformers and transformers were both at the latest versions

I checked out the link you provided and used their fix, which also works here. It seems a simple reshaping of the tensors is all that's needed. I don't know what changed in the newer transformers version that makes this necessary, though. You just need two lines at the beginning of the forward method in the Transformer.py module:

for _, feature in features.items():
    feature.resize_(1, len(feature))

The method looks like this after my fix:

def forward(self, features):
    """Returns token_embeddings, cls_token"""
    # Added this piece of code =>
    for _, feature in features.items():
        feature.resize_(1, len(feature))
    # <= Added this piece of code

    output_states = self.auto_model(**features)
    output_tokens = output_states[0]

    cls_tokens = output_tokens[:, 0, :]  # CLS token is first token
    features.update(
        {
            "token_embeddings": output_tokens,
            "cls_token_embeddings": cls_tokens,
            "attention_mask": features["attention_mask"],
        }
    )

    if self.auto_model.config.output_hidden_states:
        all_layer_idx = 2
        if (
            len(output_states) < 3
        ):  # Some models only output last_hidden_states and all_hidden_states
            all_layer_idx = 1

        hidden_states = output_states[all_layer_idx]
        features.update({"all_layer_embeddings": hidden_states})

    return features

Using a model trained from scratch with the latest transformers version works, but I can't guarantee that this doesn't break something else. I'd also guess that it will break functionality with older versions of the transformers package (not tested). It could be a good starting point for finding a permanent solution, though. Hope this helps.
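
A possible variation on the same idea (untested here, just a sketch): only add the missing batch dimension when a tensor comes in 1-D, instead of resizing every tensor in place, so already-batched inputs are left untouched:

# Sketch of an alternative fix at the start of forward():
# add a batch dimension only to 1-D tensors instead of calling resize_ on everything.
for name, value in features.items():
    if value.dim() == 1:  # missing batch dimension, e.g. torch.Size([39])
        features[name] = value.unsqueeze(0)  # -> torch.Size([1, 39])

unsqueeze returns a new tensor instead of mutating the existing one, so inputs that already have the expected [batch_size, seq_len] shape are left exactly as they are.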
