forward() got an unexpected keyword argument 'cross_attn_head_mask' #18

Closed

mrm8488 opened this issue Jun 22, 2021 · 20 comments

Comments

@mrm8488

mrm8488 commented Jun 22, 2021

----> 1 paraphrase_t5("Kyle Lowry scored 33 points and Norman Powell added 23 to lift the Toronto Raptors to a 122-125 victory over the Boston Celtics on Wednesday night.")

4 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
   1049         if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1050                 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051             return forward_call(*input, **kwargs)
   1052         # Do not call functions when jit is used
   1053         full_backward_hooks, non_full_backward_hooks = [], []

TypeError: forward() got an unexpected keyword argument 'cross_attn_head_mask'
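
For context on why this blows up: fastT5 wraps the exported ONNX decoder in a module whose forward() only accepts the arguments the graph was exported with, so any new keyword that transformers' generate() starts passing fails immediately. A minimal sketch of the mismatch (OnnxDecoderWrapper is a hypothetical stand-in, not fastT5's actual class):

```python
import torch

class OnnxDecoderWrapper(torch.nn.Module):
    # hypothetical stand-in for fastT5's decoder wrapper, for illustration only
    def forward(self, input_ids, attention_mask, encoder_hidden_states):
        return None

decoder = OnnxDecoderWrapper()
# newer transformers versions forward cross_attn_head_mask into the decoder call:
decoder(input_ids=None, attention_mask=None, encoder_hidden_states=None,
        cross_attn_head_mask=None)
# -> TypeError: forward() got an unexpected keyword argument 'cross_attn_head_mask'
```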
@Anku5hk

Anku5hk commented Jun 24, 2021

I'm having the same issue on the latest version of transformers (4.8.0). A workaround is to downgrade transformers to 4.4.2, which works, but this still needs a proper fix.

@kagrze
Contributor

kagrze commented Jun 24, 2021

It's enough to downgrade to 4.6.1. Version 4.4.2 doesn't have this patch: huggingface/transformers#10651
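
(For anyone following along: that pin is just pip install transformers==4.6.1, or an equivalent transformers==4.6.1 line in requirements.txt, assuming a pip-based environment.)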

@MysteryVaibhav

I still hit this error. If I downgrade transformers to 4.6.1, then I get ImportError: cannot import name 'AutoModelForSeq2SeqLM'.
Has anyone figured this out?

@sam-writer
Contributor

This is actually not a small thing. Here's what I think is going on:

First, check out this transformers PR: huggingface/transformers#11621

Before that PR,

"T5 does not use head masks properly (there’s an old glitch that the decoder uses encoder’s head_mask instead of cross_attn_head_mask)"

and I think fastT5's current implementation relies on that old behavior, in that the decoder reuses the encoder's head_mask. So AFAICT, a non-negligible rewrite is needed to support newer versions of transformers.

@kagrze @Ki6an please correct me if I am wrong
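
To make the change concrete, here is a minimal before/after sketch of what huggingface/transformers#11621 did (names simplified, not the real modeling_t5.py code):

```python
def decoder_before(input_ids, head_mask=None):
    # pre-PR behavior: the decoder received the *encoder's* head_mask and
    # applied it to cross-attention as well
    self_attn_mask, cross_attn_mask = head_mask, head_mask
    return self_attn_mask, cross_attn_mask

def decoder_after(input_ids, head_mask=None, cross_attn_head_mask=None):
    # post-PR behavior: cross-attention gets its own mask argument, which
    # generate() now forwards to the decoder -- hence the new keyword that
    # fastT5's exported decoder chokes on
    self_attn_mask, cross_attn_mask = head_mask, cross_attn_head_mask
    return self_attn_mask, cross_attn_mask
```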

@MysteryVaibhav

@sam-writer Thanks, I got it working by installing fastt5 in a clean conda environment


@Oxi84

Oxi84 commented Nov 28, 2021

I get the same error with the newest version of transformers, and 4.6.1 does not work at all.

@Oxi84

Oxi84 commented Nov 28, 2021

It doesn't work with 4.4.2 either.

So there's currently no way to get it working.

@aseifert
Contributor

aseifert commented Nov 28, 2021

Yeah, I'm having the same problem; I can't get it to work at all. I set up a Hugging Face Space to demonstrate the issue: https://huggingface.co/spaces/aseifert/fastt5. The demo app shows the output of pip freeze before it throws ImportError: cannot import name 'AutoModelForSeq2SeqLM' from 'transformers' (unknown location) on running from fastT5 import export_and_get_onnx_model, get_onnx_model.

Underlying code: https://huggingface.co/spaces/aseifert/fastt5/blob/main/app.py
requirements.txt: https://huggingface.co/spaces/aseifert/fastt5/blob/main/requirements.txt
logs from building the container: https://huggingface.co/spaces/aseifert/fastt5/logs/build

@Ki6an
Owner

Ki6an commented Nov 28, 2021

@aseifert you forgot to import AutoTokenizer in your code. Import it first and try again:

from transformers import AutoTokenizer

I tried it on my machine and it gives the following output without any error:

You are using a model of type mt5 to instantiate a model of type t5. This is not supported for all configurations of models and can yield errors.
Exporting to onnx... |################################| 3/3
Quantizing... |################################| 3/3
Setting up onnx model...
You are using a model of type mt5 to instantiate a model of type t5. This is not supported for all configurations of models and can yield errors.
Done!
Dieго vienでಗೆ about08 су су су су су су су су су су су су

The output is similar to the output of the original model.
I also tried it on Colab and got the same result.
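
For reference, a minimal end-to-end example in the spirit of the fastT5 README looks like this (t5-small is just an example checkpoint; any T5-family model name works the same way):

```python
from transformers import AutoTokenizer
from fastT5 import export_and_get_onnx_model

model_name = "t5-small"
model = export_and_get_onnx_model(model_name)  # exports, quantizes, loads ONNX
tokenizer = AutoTokenizer.from_pretrained(model_name)

inputs = tokenizer("translate English to German: The house is wonderful.",
                   return_tensors="pt")
tokens = model.generate(input_ids=inputs["input_ids"],
                        attention_mask=inputs["attention_mask"],
                        num_beams=2)
print(tokenizer.decode(tokens.squeeze(), skip_special_tokens=True))
```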

@aseifert
Contributor

@Ki6an thank you so much for looking into this! Indeed, I forgot to include the AutoTokenizer. However, adding it doesn't resolve the problem in the Hugging Face Space environment (cf. links above). It's weird …

@sam-writer
Contributor

@aseifert I assume the Hugging Face Space is using a version of transformers > 4.6.1, which fastt5 currently can't work with.

I think there are 2 issues here, @Ki6an:

  1. people trying to use fastt5 with a version of transformers it does not support (anything > 4.6.1)
  2. what I am trying to point out, which is the reason fastt5 is stuck on v4.6.1 of transformers

FWIW, I am currently working on a fix so that fastt5 will be able to support transformers v4.7.0 and above.
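
Until that lands, a defensive guard at import time might save users some confusion. A sketch (the packaging dependency and the exact version cutoff are assumptions, not fastT5's actual code):

```python
from packaging import version
import transformers

# fail loudly on versions fastt5 is known not to support
if version.parse(transformers.__version__) > version.parse("4.6.1"):
    raise RuntimeError(
        "fastT5 currently supports transformers<=4.6.1; found "
        f"{transformers.__version__}. Pin the older version or wait for the fix."
    )
```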

@aseifert
Contributor

aseifert commented Dec 2, 2021

@sam-writer I pinned transformers to 4.6.1 in the requirements, and I verify this by writing the output of pip freeze to the Streamlit app, which indeed shows transformers==4.6.1.

Very good to hear that you are working on fixing fastt5 for higher versions of transformers, thanks for that! In my case, however, something is off even with 4.6.1.
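
For anyone who wants to replicate that check, surfacing the environment inside the app is a few lines. A sketch, not necessarily the Space's exact code:

```python
import subprocess
import streamlit as st

# show the resolved environment so version mismatches are visible in the app
frozen = subprocess.run(["pip", "freeze"], capture_output=True, text=True).stdout
st.code(frozen)
```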

@sam-writer
Contributor

@aseifert FWIW, I get a 503 when I try to look at the Spaces log.

When I look at @Ki6an's Colab, it looks like it is doing the same thing as your Space, right? If that is the case, then it has to do with the environment (HF Spaces), not this library... does that sound right?

@aseifert
Contributor

aseifert commented Dec 2, 2021

@sam-writer You are right, it does look like something specific to the env. I opened an issue here: huggingface/transformers#14604

Thanks!

@Ki6an
Owner

Ki6an commented Dec 3, 2021

@sam-writer @aseifert I was able to fix this issue. The changes to transformers were made right after I created the PR, and that was what caused the issue.

It's a simple one-line fix: 97d5505

I haven't tested it extensively, so let me know if you face any issues.
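
For readers who can't chase the commit: one plausible shape for such a fix (hypothetical; see 97d5505 for the actual change) is to accept and ignore keywords the ONNX graph was not exported with:

```python
import torch

class DecoderWrapper(torch.nn.Module):
    # hypothetical stand-in, not fastT5's real class or its actual fix
    def forward(self, input_ids, attention_mask, encoder_hidden_states,
                cross_attn_head_mask=None, **kwargs):
        # accept (and ignore) arguments the exported graph doesn't take, so
        # newer transformers versions can call generate() without a TypeError
        ...
```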

@Ki6an Ki6an closed this as completed Dec 3, 2021
@sam-writer
Contributor

@Ki6an I will check it out. I was expecting there to also be changes to generate_onnx_representation, so I will play around with that

@Ki6an
Owner

Ki6an commented Dec 4, 2021

@sam-writer cross_attn_head_mask is an optional parameter, like head_mask and decoder_head_mask; currently we export the decoder to ONNX with only one optional parameter, decoder_attention_mask.

https://github.com/huggingface/transformers/blob/73ec4340ec651ca1fe4f8ead9206297a4d4ed79c/src/transformers/models/t5/modeling_t5.py#L1150

We could add these additional params (while exporting) if they improve speed or accuracy.
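
A sketch of what adding such an input at export time could look like (illustrative only; DummyDecoder stands in for fastT5's decoder wrapper, and the shapes are arbitrary):

```python
import torch

class DummyDecoder(torch.nn.Module):
    # stand-in for fastT5's decoder wrapper; forward() matches the inputs below
    def forward(self, input_ids, attention_mask, encoder_hidden_states,
                cross_attn_head_mask):
        batch, seq = input_ids.shape
        return torch.zeros(batch, seq, 32128)  # fake logits over the T5 vocab

batch, seq, hidden, layers, heads = 1, 8, 512, 6, 8
dummy_inputs = (
    torch.ones(batch, seq, dtype=torch.long),   # input_ids
    torch.ones(batch, seq, dtype=torch.long),   # encoder attention_mask
    torch.zeros(batch, seq, hidden),            # encoder_hidden_states
    torch.ones(layers, heads),                  # cross_attn_head_mask
)

torch.onnx.export(
    DummyDecoder(),
    dummy_inputs,
    "t5-decoder.onnx",
    input_names=["input_ids", "attention_mask",
                 "encoder_hidden_states", "cross_attn_head_mask"],
    output_names=["logits"],
    dynamic_axes={"input_ids": {0: "batch", 1: "sequence"},
                  "attention_mask": {0: "batch", 1: "sequence"}},
    opset_version=12,
)
```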

@sam-writer
Contributor

It would probably be good to have a script for testing the accuracy of the ONNX version... I think I saw something like this in NVIDIA's TensorRT demo.
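
A rough sketch of such a parity check (a hypothetical harness, assuming both models expose .generate() as in the usage example above):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
from fastT5 import export_and_get_onnx_model

name = "t5-small"
tok = AutoTokenizer.from_pretrained(name)
pt_model = AutoModelForSeq2SeqLM.from_pretrained(name)
onnx_model = export_and_get_onnx_model(name)

for text in ["translate English to German: The house is wonderful."]:
    enc = tok(text, return_tensors="pt")
    pt_out = pt_model.generate(**enc, num_beams=2, max_length=64)
    onnx_out = onnx_model.generate(input_ids=enc["input_ids"],
                                   attention_mask=enc["attention_mask"],
                                   num_beams=2, max_length=64)
    # compare decoded strings; exact logit comparison would be stricter
    same = tok.decode(pt_out[0], skip_special_tokens=True) == \
           tok.decode(onnx_out[0], skip_special_tokens=True)
    print("match" if same else "MISMATCH", "->", text)
```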

@smyja

smyja commented Aug 15, 2022

I am still experiencing this problem
