
Fix decoder_input_ids for bare T5Model and improve doc #18791

Merged

8 commits merged into huggingface:main on Sep 6, 2022

Conversation

ekagra-ranjan
Contributor

@ekagra-ranjan ekagra-ranjan commented Aug 28, 2022

What does this PR do?

  • Fix 1: in docs/source/en/model_doc/t5.mdx, use the tokenizer to obtain the labels as tensors directly.
  • Fix 2: src/transformers/models/t5/
    • Present case: T5 prepends the decoder_input_ids with the pad token. This preprocessing is handled internally by T5ForConditionalGeneration, which shifts the labels to the right.
    • Issue: this preprocessing must be done manually when using the bare T5Model, but it is missing from the example that uses bare T5Model.
    • Proposed fix: added a preprocessing step to the example so that the input matches what T5 expects at its decoder. The PR reuses the _shift_right() method, which is an internal function of T5. Please let me know if we can rename _shift_right() to shift_right() or if there is a better way to handle this.
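For illustration, here is a minimal pure-Python sketch of what T5's internal _shift_right() does to the labels (the real method operates on tensors inside the model; the standalone function name shift_right below is hypothetical, and the token ids are made up):

```python
def shift_right(label_ids, decoder_start_token_id=0, pad_token_id=0):
    """Sketch of T5's _shift_right(): prepend the decoder start token
    (for T5 this is the pad token, id 0), drop the last position, and
    replace any -100 ignored-label sentinels with the pad token."""
    shifted = [[decoder_start_token_id] + row[:-1] for row in label_ids]
    return [[pad_token_id if tok == -100 else tok for tok in row]
            for row in shifted]

# Labels as produced by the tokenizer (ids illustrative, 1 = </s>):
labels = [[71, 307, 1]]
decoder_input_ids = shift_right(labels)
print(decoder_input_ids)  # [[0, 71, 307]]
```

With T5ForConditionalGeneration this shift happens automatically when you pass labels; the point of the PR is that the bare T5Model example must perform it explicitly before passing decoder_input_ids.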

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a GitHub issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

@sgugger @patrickvonplaten @patil-suraj

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Aug 28, 2022

The documentation is not available anymore as the PR was closed or merged.

Collaborator

@sgugger sgugger left a comment


Thanks for your PR! @patrickvonplaten is better suited to review it as he knows T5 better than I do :-)

... padding="longest",
... max_length=max_target_length,
... truncation=True,
... return_tensors="pt",
Contributor


Thanks!

@patrickvonplaten
Contributor

Thanks for the fixes!

ekagra-ranjan and others added 3 commits September 5, 2022 20:34
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
@ekagra-ranjan
Contributor Author

@patrickvonplaten Thanks for the review! Applied your suggestions.

@patrickvonplaten patrickvonplaten merged commit f85acb4 into huggingface:main Sep 6, 2022
oneraghavan pushed a commit to oneraghavan/transformers that referenced this pull request Sep 26, 2022
* use tokenizer to output tensor

* add preprocessing for decoder_input_ids for bare T5Model

* add preprocessing to tf and flax

* linting

* linting

* Update src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>