Do not warn about unexpected decoder weights when loading T5EncoderModel and LongT5EncoderModel #26211

Merged

fleonce
Contributor

@fleonce fleonce commented Sep 18, 2023

What does this PR do?

Adds r"decoder" to _keys_to_ignore_on_load_unexpected for both T5EncoderModel and LongT5EncoderModel. Neither model has any decoder layers, so loading a pretrained checkpoint such as t5-small produces warnings about keys that are found in the checkpoint but not in the model itself. With this pattern in place, the decoder keys are ignored and the warning is no longer emitted.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline, Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

@ArthurZucker @younesbelkada

Neither T5EncoderModel nor LongT5EncoderModel has any decoder layers, so
loading a pretrained model checkpoint such as t5-small produces warnings about
keys found in the model checkpoint that are not in the model itself.

To prevent this log warning, r"decoder" has been added to _keys_to_ignore_on_load_unexpected for
both T5EncoderModel and LongT5EncoderModel.
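
For readers unfamiliar with the mechanism, here is a minimal sketch of how a list of regex patterns such as [r"decoder"] can filter unexpected checkpoint keys. It is an illustration only, not the library's code: the helper name filter_unexpected_keys and the key lists are assumptions made up for this example, and the real loading logic in transformers may differ in detail.

```python
import re


def filter_unexpected_keys(unexpected_keys, ignore_patterns):
    """Drop every key that matches at least one ignore pattern.

    Hypothetical helper for illustration; not part of transformers.
    """
    return [
        key
        for key in unexpected_keys
        if not any(re.search(pattern, key) for pattern in ignore_patterns)
    ]


# Illustrative key names (assumed, modeled on a T5-style checkpoint).
checkpoint_keys = [
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.block.0.layer.0.SelfAttention.q.weight",
    "decoder.final_layer_norm.weight",
]
# An encoder-only model owns no decoder parameters.
model_keys = {
    "shared.weight",
    "encoder.block.0.layer.0.SelfAttention.q.weight",
}

# Keys present in the checkpoint but absent from the model.
unexpected = [key for key in checkpoint_keys if key not in model_keys]

# With r"decoder" in the ignore list, nothing is left to warn about.
remaining = filter_unexpected_keys(unexpected, [r"decoder"])
print(remaining)  # []
```

Because the pattern is matched with a regex search, a single short entry covers every key under the decoder prefix without having to list them one by one.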
Contributor

@younesbelkada younesbelkada left a comment

Thanks a lot for this PR!
The explanation makes sense to me! However, I was not able to reproduce it on the main branch:

>>> import transformers
>>> model = transformers.T5EncoderModel.from_pretrained("t5-small")
>>> 

Can you share a reproducible snippet?

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

@fleonce
Contributor Author

fleonce commented Sep 18, 2023

Thank you for the follow-up question!
The problem can be reproduced when the log level for the whole library has been set to INFO:

import transformers
transformers.logging.set_verbosity_info()
m = transformers.T5EncoderModel.from_pretrained('t5-small')

I was not using the latest version of transformers when I first encountered this issue. In the current version the message is hidden by default: it is logged at INFO level unless the model class name (here T5EncoderModel) is listed in model.config.architectures, and for t5-small only T5ForConditionalGeneration is listed there. This seems to be caused by commit 096f2cf:

warner = logger.warning if model.__class__.__name__ in archs else logger.info

The problem persists, however; we just won't see the warning by default.
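
To make the level selection above concrete, here is a small sketch of the same decision written as a standalone function. It is a simplification under stated assumptions: the function name pick_log_level is hypothetical, and treating a missing architectures list as empty is an assumption about how archs is built, not something confirmed in this thread.

```python
import logging


def pick_log_level(model_class_name, architectures):
    """Return the level used to report unexpected checkpoint keys.

    Hypothetical helper that mirrors the quoted one-liner: WARNING when the
    model class is listed in the checkpoint's architectures, INFO otherwise.
    """
    # Assumption: a config without an architectures list behaves like an empty one.
    archs = architectures or []
    return logging.WARNING if model_class_name in archs else logging.INFO


# t5-small lists only T5ForConditionalGeneration, so the encoder-only class
# falls into the INFO branch and stays silent at the default verbosity.
level = pick_log_level("T5EncoderModel", ["T5ForConditionalGeneration"])
print(logging.getLevelName(level))
```

This is why the message only shows up after calling transformers.logging.set_verbosity_info(), as in the reproduction snippet earlier in this comment.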

Contributor

@younesbelkada younesbelkada left a comment

Managed to reproduce; this makes sense, thanks!

Collaborator

@ArthurZucker ArthurZucker left a comment

thanks!

@ArthurZucker ArthurZucker merged commit 216dff7 into huggingface:main Sep 28, 2023
blbadger pushed a commit to blbadger/transformers that referenced this pull request Nov 8, 2023
…del and LongT5EncoderModel (huggingface#26211)

Ignore decoder weights when using T5EncoderModel and LongT5EncoderModel

EduardoPach pushed a commit to EduardoPach/transformers that referenced this pull request Nov 18, 2023
…del and LongT5EncoderModel (huggingface#26211)

4 participants