Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TF: T5 can now handle a padded past (i.e. XLA generation) #17969

Merged
merged 2 commits into from
Jul 4, 2022

Conversation

gante
Copy link
Member

@gante gante commented Jun 30, 2022

What does this PR do?

In TF T5, we now fetch the correct slice of position_bias -- the same way we do it in FLAX. The key difference is that FLAX relies on an external variable for the generated length that gets incremented every time past gets updated, and here the same value is obtained dynamically from the past array (latest filled past index = generated length - 1, where latest filled past index corresponds to the maximum index with non-0 values).

All slow tests are passing and we no longer have length restrictions on the XLA beam search test, which means that:

  1. Although the code for eager execution was changed, all outputs remain the same;
  2. XLA generation matches non-XLA generation.

@gante
Copy link
Member Author

gante commented Jun 30, 2022

related issue: #17935

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jun 30, 2022

The documentation is not available anymore as the PR was closed or merged.

Copy link
Member

@Rocketknight1 Rocketknight1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@patrickvonplaten
Copy link
Contributor

Great job on finding and fixing the bug here @gante - cool that T5 works now :-)

@gante gante merged commit f098268 into huggingface:main Jul 4, 2022
@gante gante deleted the xla_t5 branch July 4, 2022 18:47
viclzhu pushed a commit to viclzhu/transformers that referenced this pull request Jul 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants