T5 conversion issue #7

Closed
chessgecko opened this issue Aug 11, 2022 · 3 comments

@chessgecko
Contributor

Not sure where this issue belongs, but I figured I'd put it here in case anyone else runs into the same problem.

When running `generate` on a converted T5 model I got the following error:
`AttributeError: 'Parameter' object has no attribute 'CB'`

It turned out that `T5ForConditionalGeneration.named_parameters()` didn't iterate over `lm_head`, so I was able to fix it by changing

https://github.com/huggingface/transformers/blob/c8b6ae858d61e5bc10e388d095aa74f7690d1021/src/transformers/utils/bitsandbytes.py#L139-L142

    # otherwise they have an attached head
    list_modules = list(model.named_parameters())
    last_name = list_modules[-1][0]
    return last_name.split(".")[0]

to

return "lm_head"

Not sure if it's just my version of torch or where this issue really belongs, but here's my environment:
torch: 1.13.0a0+340c412
cuda: 11.7
bnb: 0.31.8
transformers: 4.22.0.dev0
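
For completeness, a minimal sketch of the kind of call that hits this, assuming the model was converted with `load_in_8bit=True` (the checkpoint name and prompt below are just placeholders, not my exact setup):

    from transformers import AutoTokenizer, T5ForConditionalGeneration

    # Assumed conversion path: 8-bit loading via load_in_8bit=True
    # (transformers + bitsandbytes). "t5-small" and the prompt are placeholders.
    tokenizer = AutoTokenizer.from_pretrained("t5-small")
    model = T5ForConditionalGeneration.from_pretrained(
        "t5-small", device_map="auto", load_in_8bit=True
    )

    inputs = tokenizer("translate English to German: Hello, world!", return_tensors="pt").to(0)

    # With the heuristic above picking the wrong module, lm_head is not kept in
    # fp16/fp32 and this call raises:
    #   AttributeError: 'Parameter' object has no attribute 'CB'
    outputs = model.generate(**inputs)
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))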

@younesbelkada
Collaborator

younesbelkada commented Aug 11, 2022

Hi @chessgecko ,

Great catch! This should be fixed once this PR gets merged: huggingface/transformers#18579
If you want to try it right away, I made a Colab demo that we plan to make public next week, but I'll share it here: https://colab.research.google.com/drive/1YORPWx4okIHXnjW7MSAidXN29mPVNT7F?usp=sharing

Hope this helps and thanks a lot!

@younesbelkada
Collaborator

Closing the issue as huggingface/transformers#18579 has been merged

@Oxi84

Oxi84 commented Aug 19, 2022

Should this work for all T5 models or just t5-3b-sharded?
