
TFLongformerForMaskedMLM example throws ValueError "shapes are incompatible" #11488

Closed
fredo838 opened this issue Apr 28, 2021 · 8 comments · Fixed by #11559

Comments

@fredo838
Contributor

fredo838 commented Apr 28, 2021

An official example on the TFLongformerForMaskedLM documentation page does not work.

Environment info

  • transformers version: 2.4.1
  • Platform: ubuntu 20.04
  • Python version: python3.8
  • PyTorch version (GPU?): N/A
  • Tensorflow version (GPU?): 2.4.1
  • Using GPU in script?: no
  • Using distributed or parallel set-up in script?: no

Who can help

@patrickvonplaten (Longformer)
@Rocketknight1 (tensorflow)
@sgugger (maintained examples )

Information

Model I am using: Longformer

The problem arises when using:

  • [x] the official example scripts: (give details below)
  • [ ] my own modified scripts: (give details below)

To reproduce

Steps to reproduce the behavior:

  1. docker run -it --rm python:3.8 bash (no gpus attached)
  2. python3 -m pip install pip --upgrade
  3. python3 -m pip install transformers tensorflow
  4. python3 -> launch interactive shell
  5. run following lines:
from transformers import LongformerTokenizer, TFLongformerForMaskedLM
import tensorflow as tf
tokenizer = LongformerTokenizer.from_pretrained('allenai/longformer-base-4096')
model = TFLongformerForMaskedLM.from_pretrained('allenai/longformer-base-4096')
inputs = tokenizer("The capital of France is [MASK].", return_tensors="tf")
inputs["labels"] = tokenizer("The capital of France is Paris.", return_tensors="tf")["input_ids"]
outputs = model(inputs)
# loss = outputs.loss
# logits = outputs.logits

This throws following error:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/keras/engine/base_layer.py", line 1012, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/transformers/models/longformer/modeling_tf_longformer.py", line 2140, in call
    loss = None if inputs["labels"] is None else self.compute_loss(inputs["labels"], prediction_scores)
  File "/usr/local/lib/python3.8/site-packages/transformers/modeling_tf_utils.py", line 158, in compute_loss
    reduced_logits = tf.boolean_mask(tf.reshape(logits, (-1, shape_list(logits)[2])), active_loss)
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 1831, in boolean_mask_v2
    return boolean_mask(tensor, mask, name, axis)
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/util/dispatch.py", line 201, in wrapper
    return target(*args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/ops/array_ops.py", line 1751, in boolean_mask
    shape_tensor[axis:axis + ndims_mask].assert_is_compatible_with(shape_mask)
  File "/usr/local/lib/python3.8/site-packages/tensorflow/python/framework/tensor_shape.py", line 1134, in assert_is_compatible_with
    raise ValueError("Shapes %s and %s are incompatible" % (self, other))
ValueError: Shapes (11,) and (9,) are incompatible
@fredo838 changed the title from “simple TFLongformerForMaskedMLM throws ValueError "shapes are incompatible"” to “TFLongformerForMaskedMLM example throws ValueError "shapes are incompatible"” on Apr 28, 2021
@Rocketknight1
Member

Hi! The model is working fine here, but the problem is that "[MASK]" and "Paris" are being tokenized as different numbers of tokens, which is where your shape error is coming from. Can you link me to the exact script you got this example from?

@fredo838
Contributor Author

It's under this headline, here's the permalink: https://huggingface.co/transformers/model_doc/longformer.html#tflongformerformaskedlm

@fredo838
Contributor Author

Ah, so it's probably just a matter of updating inputs["labels"] = tokenizer("The capital of France is Paris.", return_tensors="tf")["input_ids"] to inputs["labels"] = tokenizer("The capital of [MASK] is Paris.", return_tensors="tf")["input_ids"], no?

@Rocketknight1
Member

Rocketknight1 commented Apr 29, 2021

I checked and you're absolutely right: the example as written does not work. I did some digging, and the problem is that the mask token for this model is actually '<mask>', not '[MASK]'. 'Paris' does get correctly tokenized as one token, but '[MASK]' is not recognized as a special token and instead gets 'spelled out' with three sub-word tokens. (You can see which splits the tokenizer chose by calling tokenizer.convert_ids_to_tokens() on the tokenized inputs.)

The example should work if you replace '[MASK]' with '<mask>'. Can you try that and let me know? If it works, we can make a PR to fix this example!

@fredo838
Contributor Author

fredo838 commented Apr 30, 2021

So now the following example:

from transformers import LongformerTokenizer, TFLongformerForMaskedLM
import tensorflow as tf
tokenizer = LongformerTokenizer.from_pretrained('allenai/longformer-base-4096')
model = TFLongformerForMaskedLM.from_pretrained('allenai/longformer-base-4096')
inputs = tokenizer("The capital of France is <mask>.", return_tensors="tf")
inputs["labels"] = tokenizer("The capital of France is Paris.", return_tensors="tf")["input_ids"]
outputs = model(inputs)
loss = outputs.loss
logits = outputs.logits
preds = tf.argmax(logits, axis=2)
predicted_tokens = tokenizer.convert_ids_to_tokens(tf.squeeze(preds))
print("predicted_tokens: ", predicted_tokens)

yields:

['<s>', 'The', 'Ġcapital', 'Ġof', 'ĠFrance', 'Ġis', 'ĠParis', '.', '</s>']

So at least we're doing something right, but there's still this weird Ġ character on every non-first token.

@Rocketknight1
Member

Ah, yes! The Ġ character is used to indicate word breaks. If you want to see the pure string output without it, try using the decode() method instead of convert_ids_to_tokens().

Other than that, though, your example looks good! I talked with the team and, annoyingly, we can't use it directly: the examples are all built from the same template, so we can't easily change just one. Still, we can pass some arguments to make sure our example works for Longformer in the future.

The relevant bit is here. If you'd like to try it yourself, you can submit a PR to add the argument mask='<mask>' to the add_code_sample_docstrings decorator. If that sounds like a lot of work, just let me know and I'll make the PR and credit you for spotting it!

@fredo838
Contributor Author

fredo838 commented May 3, 2021

@Rocketknight1 I added a PR (#11559)

@Rocketknight1
Member

Closing this because we have the PR now!
