Generation: fix handling of special tokens #31254

Merged 8 commits into huggingface:main on Jun 6, 2024

Conversation

zucchini-nlp
Member

What does this PR do?

Fixes #31251. For backwards compatibility, every special token has to be looked up first in the kwargs and only then in self.config. An earlier fix did this for decoder_start_token_id; this PR extends it to all special tokens.

Verified that it works with the provided script and with the script from the last fix on decoder_start_token_id.
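
The kwargs-then-config precedence described above can be sketched roughly as follows. This is only an illustration in plain Python, not the code from this PR: `resolve_special_token`, `resolve_all`, and `SPECIAL_TOKENS` are hypothetical names, and the real implementation also converts the ids to tensors on the right device.

```python
from typing import Any, Dict, Optional

# Hypothetical list of the special tokens to resolve (assumption for this sketch).
SPECIAL_TOKENS = ("bos_token_id", "eos_token_id", "pad_token_id", "decoder_start_token_id")


def resolve_special_token(call_value: Optional[Any], config_value: Optional[Any]) -> Optional[Any]:
    """Return the value passed for this call if there is one, else the config default."""
    # Compare against None explicitly: 0 is a valid token id and must not
    # fall through to the config default.
    return call_value if call_value is not None else config_value


def resolve_all(call_kwargs: Dict[str, Any], config_defaults: Dict[str, Any]) -> Dict[str, Any]:
    """Resolve every special token with the same precedence rule."""
    return {
        name: resolve_special_token(call_kwargs.get(name), config_defaults.get(name))
        for name in SPECIAL_TOKENS
    }
```

For example, `resolve_all({"pad_token_id": 0}, {"pad_token_id": 1, "eos_token_id": 2})` keeps `pad_token_id` at 0 from the call and takes `eos_token_id` from the config.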

@zucchini-nlp zucchini-nlp requested a review from ArthurZucker June 5, 2024 09:22
@zucchini-nlp
Member Author

I had to fix the test by passing the special token in the config, because previously all special tokens were updated inside the generation strategy code blocks. I will see how to make this BC (backwards compatible); not ready for merge yet.

@zucchini-nlp
Member Author

I think it's better to just raise a warning and ask users to provide attention masks, which I have already committed. Another option would be to prepare special tokens twice: once in generate and a second time just before pre-fill.

I prefer the first option, as it is less redundant code and more intuitive. lmk if we want full BC even with repeated code.
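
A minimal sketch of the first option (warn, and warn only once), under stated assumptions: `warn_once` and `maybe_warn_missing_mask` are hypothetical helpers written for illustration, the logger name is made up, and the exact condition and message used in the PR may differ.

```python
import functools
import logging

logger = logging.getLogger("generation_sketch")  # hypothetical logger name


@functools.lru_cache(maxsize=None)
def warn_once(message: str) -> None:
    """Log a warning the first time a given message is seen; later calls hit the cache."""
    logger.warning(message)


def maybe_warn_missing_mask(attention_mask) -> bool:
    """Warn (once) when the caller did not pass an attention mask.

    Returns True when the mask is missing, False otherwise.
    """
    if attention_mask is None:
        warn_once(
            "No attention mask was passed; please pass `attention_mask` "
            "to get reliable results."
        )
        return True
    return False
```

Caching on the message string keeps repeated generate calls from flooding the log while still surfacing the problem the first time it occurs.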

src/transformers/generation/utils.py (outdated)
generation_config.eos_token_id, self.generation_config.eos_token_id, device=device
)
pad_token_id = _tensor_or_none(
generation_config.pad_token_id, self.generation_config.eos_token_id, device=device
Contributor

Should be self.generation_config.pad_token_id

Member Author

good catch!

@zucchini-nlp zucchini-nlp merged commit 5fabd1e into huggingface:main Jun 6, 2024
21 checks passed
zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024
* fix special tokens in generation

* fix test

* add warning

* fix the check

* warn once

* fix
zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Jun 14, 2024
* fix special tokens in generation

* fix test

* add warning

* fix the check

* warn once

* fix
4 participants