Generation: fix handling of special tokens #31254

zucchini-nlp · 2024-06-05T09:22:18Z

What does this PR do?

Fixes #31251. For backward compatibility, every special token has to be looked up first in kwargs and then in self.config. We already fixed this for decoder_start_token_id; this PR extends the fix to all special tokens.

Verified that it works with the provided script and with the script from the previous fix for decoder_start_token_id.
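The lookup order described above can be sketched in plain Python. This is only an illustration, not the code in src/transformers/generation/utils.py: `SpecialTokens` and `resolve_special_tokens` are hypothetical names, and the real implementation works on generation config objects and tensors.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SpecialTokens:
    """Hypothetical stand-in for a generation config (not the transformers class)."""
    bos_token_id: Optional[int] = None
    eos_token_id: Optional[int] = None
    pad_token_id: Optional[int] = None
    decoder_start_token_id: Optional[int] = None


TOKEN_NAMES = ("bos_token_id", "eos_token_id", "pad_token_id", "decoder_start_token_id")


def resolve_special_tokens(call_config: SpecialTokens, model_config: SpecialTokens) -> SpecialTokens:
    """Prefer the value given for this call; otherwise use the model-level default."""
    resolved = {}
    for name in TOKEN_NAMES:
        value = getattr(call_config, name)
        if value is None:
            # Fall back to the *same* attribute on the model-level config.
            value = getattr(model_config, name)
        resolved[name] = value
    return SpecialTokens(**resolved)


# Per-call settings override only what they set; the rest comes from the model.
model_defaults = SpecialTokens(bos_token_id=1, eos_token_id=2, pad_token_id=0)
per_call = SpecialTokens(eos_token_id=50256)
resolved = resolve_special_tokens(per_call, model_defaults)
```

Looping over attribute names keeps each fallback tied to its own token, so a pad_token_id lookup cannot accidentally fall back to eos_token_id.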

HuggingFaceDocBuilderDev · 2024-06-05T09:57:16Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

zucchini-nlp · 2024-06-05T10:09:56Z

I had to fix the test by passing the special token in the config, because previously all special tokens were updated inside the generation strategy code blocks. I will see how to make it backward compatible (BC); not ready for merge yet.

zucchini-nlp · 2024-06-05T10:34:41Z

I think it's better to just raise a warning and ask users to provide attention masks, which I have already committed. Another option would be to prepare special tokens twice: once in generate and a second time just before pre-fill.

I prefer the first option, as it means less redundant code and is more intuitive. Let me know if we want full BC even with repeated code.
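To illustrate why a warning is reasonable here, the sketch below infers an attention mask from the pad token and warns once when that is not possible. It uses plain lists instead of tensors, and `infer_attention_mask`, the condition it checks, and the warning text are all assumptions made for illustration; the actual check and message in transformers may differ.

```python
import warnings
from typing import List, Optional

_warned_once = False  # module-level flag so the warning is emitted only once


def infer_attention_mask(
    input_ids: List[List[int]],
    pad_token_id: Optional[int],
    eos_token_id: Optional[int],
) -> List[List[int]]:
    """Return 1 for real tokens and 0 for padding, when padding is identifiable."""
    global _warned_once
    if pad_token_id is None or pad_token_id == eos_token_id:
        # Padding cannot be told apart from a genuine EOS token, so attend to
        # every position and ask the caller for an explicit mask.
        if not _warned_once:
            warnings.warn(
                "Attention mask cannot be inferred from the inputs; "
                "please pass attention_mask explicitly."  # illustrative text
            )
            _warned_once = True
        return [[1] * len(row) for row in input_ids]
    return [[0 if token == pad_token_id else 1 for token in row] for row in input_ids]


# Left-padded batch with a distinct pad token: the mask can be inferred.
mask = infer_attention_mask([[0, 0, 5, 6]], pad_token_id=0, eos_token_id=2)
```

Passing the attention mask explicitly avoids the ambiguity altogether, which is what the warning asks users to do.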

CISC · 2024-06-05T13:18:06Z

src/transformers/generation/utils.py

+            generation_config.eos_token_id, self.generation_config.eos_token_id, device=device
+        )
+        pad_token_id = _tensor_or_none(
+            generation_config.pad_token_id, self.generation_config.eos_token_id, device=device


Should be self.generation_config.pad_token_id

good catch!

* fix special tokens in generatioon * fix test * add warning * fix the check * warn once * fix

zucchini-nlp added 2 commits June 5, 2024 11:18

fix special tokens in generatioon

48c7d2e

Merge remote-tracking branch 'upstream/main' into patch_special_tokens

7bb1008

zucchini-nlp requested a review from ArthurZucker June 5, 2024 09:22

fix test

7afe6c2

add warning

a3ea0cd

fix the check

b17221a

ArthurZucker approved these changes Jun 5, 2024

warn once

e617d34

CISC reviewed Jun 5, 2024

zucchini-nlp added 2 commits June 5, 2024 15:55

fix

7960852

Merge branch 'main' into patch_special_tokens

14fd38c

zucchini-nlp merged commit 5fabd1e into huggingface:main Jun 6, 2024
21 checks passed

zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Jun 11, 2024

Generation: fix handling of special tokens (huggingface#31254)

eb3231d

* fix special tokens in generatioon * fix test * add warning * fix the check * warn once * fix

zucchini-nlp added a commit to zucchini-nlp/transformers that referenced this pull request Jun 14, 2024

Generation: fix handling of special tokens (huggingface#31254)

3ef92e4

* fix special tokens in generatioon * fix test * add warning * fix the check * warn once * fix

sanchit-gandhi mentioned this pull request Jun 28, 2024

[generate] fix eos/pad id check on mps devices #31695

Merged

This was referenced Jul 15, 2024

Generate: end-to-end compilation #30788

Merged

Generate: store special token tensors under a unique variable name #31980

Merged
