Move eos_token_id to stopping criteria #29459
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Looking good 💪 A few comments to further refine the idea
In the body of `generate`, we set `pad_token_id` to `eos_token_id` when the latter exists and the former is `None`. As such, we can further modify the decoding functions as follows:
- the block

  ```python
  # finished sentences should have their next token be a padding token
  if eos_token_id is not None:
      if pad_token_id is None:
          raise ValueError("If `eos_token_id` is defined, make sure that `pad_token_id` is defined.")
      next_tokens = next_tokens * unfinished_sequences + pad_token_id * (1 - unfinished_sequences)
  ```
  can become

  ```python
  # finished sentences should have their next token be a padding token
  if pad_token_id is not None:
      next_tokens = next_tokens * unfinished_sequences + pad_token_id * (1 - unfinished_sequences)
  ```

  because the case that raises the exception can never be triggered from `generate` (although we should add an integration test that ensures this remains true, if it doesn't already exist; see the sketch after this list!).
- see my in-code comment :)
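For reference, a minimal integration-test sketch of the first point (the test name and the `hf-internal-testing/tiny-random-gpt2` checkpoint are my assumptions, not part of this PR):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

def test_generate_backfills_pad_token_from_eos():
    """generate() should fall back to eos_token_id as pad_token_id when the
    latter is unset, so the ValueError path above is unreachable from generate()."""
    checkpoint = "hf-internal-testing/tiny-random-gpt2"  # assumed tiny test model
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    model.generation_config.pad_token_id = None  # force the fallback path
    assert model.generation_config.eos_token_id is not None

    inputs = tokenizer("Hello", return_tensors="pt")
    # Must not raise: generate() backfills pad_token_id from eos_token_id.
    model.generate(**inputs, max_new_tokens=3)
```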
Ah, the pipeline tests are ignored by default (just like slow tests). You need to add …
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Done for all comments. I checked that users have no possibility of messing up when calling these methods directly, and I re-ran all tests (+slow, +pipeline).
LGTM, thank you for iterating 🙌
Thanks! It would be nice to make sure our assumption is always correct; nice cleanup otherwise!
```python
eos_token_id (`Union[int, List[int]]`):
    The id of the *end-of-sequence* token. Optionally, use a list to set multiple *end-of-sequence* tokens.
```
Would it be nice to also support a list of lists, to have multi-token stop sequences (e.g. if the EOS is `["<", "eos", ">"]`)?
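For illustration, a criterion along these lines is easy to sketch. This is a minimal sketch, not library code: the class name `StopSequenceCriteria` is made up, and it assumes the `StoppingCriteria.__call__(input_ids, scores)` interface that returned a single bool at the time:

```python
import torch
from transformers import StoppingCriteria

class StopSequenceCriteria(StoppingCriteria):
    """Stop once every sequence in the batch ends with `stop_ids` (hypothetical sketch)."""

    def __init__(self, stop_ids):
        self.stop_ids = torch.tensor(stop_ids)

    def __call__(self, input_ids, scores, **kwargs) -> bool:
        n = len(self.stop_ids)
        if input_ids.shape[1] < n:  # not enough tokens generated yet
            return False
        tail = input_ids[:, -n:]  # trailing n tokens of each sequence
        return bool((tail == self.stop_ids.to(input_ids.device)).all())
```

It would then be passed as `stopping_criteria=StoppingCriteriaList([StopSequenceCriteria(ids)])`, where `ids` are the token ids of the stop sequence.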
Hmm, is it really true that existing models have EOS tokens as a sequence? I guess you are referring to custom EOS tokens, when users want to stop at "" but haven't trained the model with "" as a special token. If that's the case, there is a `StopStringsCriteria` PR coming, or users are free to write their own custom criteria. @gante wdyt?
Yeah, the case where we want to stop on a string will be covered by #28932 (and is way more complex)
This PR is exclusively to port the existing EOS logic into its own stopping criteria :)
Right, I forgot it was being covered.
src/transformers/generation/utils.py (outdated)

```python
"`eos_token_id` is deprecated in this function and will be removed in v4.41, use"
" `stopping_criteria=StoppingCriteriaList([EOSTokenCriteria(eos_token_id=eos_token_id)])` instead.",
```
It should rather be set in the `generation_config`, I guess? Why can't we leave this as a kwarg and set `generation_config.eos_token_id`?
That's the general idea: we want to stop accepting `eos_token_id` as an input argument. The warning is for backward compatibility, since we used to give priority to the user-defined `eos_token_id` from the args before checking `generation_config.eos_token_id`.
^ as @zucchini-nlp wrote. This is merely for backward compatibility, if users want to keep calling the decoding methods directly (at their own risk, since we're deprecating their public API).
The decoding methods don't set up new logits processors nor stopping criteria. As such, the correct replacement for the EOS feature is to pass the new `EOSTokenCriteria` :)
The long-term goal is to disable calling the decoding methods directly at all, so we can optimize the codebase.
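Concretely, the replacement for users who keep calling the decoding methods directly would look roughly like the sketch below (assuming the final camel-cased name `EosTokenCriteria` from the "camel case everywhere" commit, and the `transformers.generation` import path):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from transformers.generation import EosTokenCriteria, StoppingCriteriaList

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello", return_tensors="pt")

# Instead of passing eos_token_id down to the decoding method, wrap it in a criteria list.
criteria = StoppingCriteriaList([EosTokenCriteria(eos_token_id=tokenizer.eos_token_id)])
outputs = model.generate(**inputs, stopping_criteria=criteria, max_new_tokens=20)
print(tokenizer.decode(outputs[0]))
```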
src/transformers/generation/utils.py (outdated)

```python
    .prod(dim=0)
    .bool()
)
last_assistant_token_is_eos = stopping_criteria[-1](candidate_input_ids, None)
```
This means we assume the last entry in `stopping_criteria` is always `EOSTokenCriteria`. That's not necessarily obvious, and if people pass custom criteria, I am not sure this will always be the case?
right, will fix it
@ArthurZucker good catch, I missed this one too :)
@gante I need you to take a look. I removed some code, given that assisted decoding works only with batch size 1, so we can make some assumptions about when to stop generating.
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
One change in the speculative decoding diff and we're ready to go :)
```python
    .prod(dim=0)
    .bool()
)
is_done_candidate = stopping_criteria(candidate_input_ids, None)
```
🧠 (this is actually much more versatile than the previous version!)
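The versatility comes from calling the whole list, which asks "does any criterion say we are done?" instead of hard-coding that the EOS criterion sits at index -1. A rough restatement of the semantics being relied on (my sketch, assuming the list ORs its members' per-call results):

```python
import torch

def list_call_semantics(criteria, input_ids, scores):
    """Hypothetical restatement of StoppingCriteriaList.__call__:
    OR over all members, so the EOS criterion's position no longer matters."""
    is_done = torch.zeros(input_ids.shape[0], dtype=torch.bool, device=input_ids.device)
    for criterion in criteria:
        is_done = is_done | criterion(input_ids, scores)
    return is_done
```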
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
@zucchini-nlp needs a rebase with … (@ArthurZucker I'm assuming this is ready to be merged, since all comments were addressed. I'm deciding to merge since this unblocks me on the `torch.compile` front; let us know if you'd like some post-merge changes :) )
@gante Woah, hold on before merging! You still need a core maintainer's approval: even if the comments have been addressed, it's important to make sure the amendments are approved.
@zucchini-nlp this PR needs to be rebased with …
@ArthurZucker ping for approval, if all tasks are complete :)
LGTM! Make sure to run the slow tests for important models before merging, and to test generation with Llama and an `eos_token_id` list!
src/transformers/generation/utils.py (outdated)

```diff
@@ -2364,10 +2370,26 @@ def _greedy_search(
     )
     stopping_criteria = validate_stopping_criteria(stopping_criteria, max_length)
     pad_token_id = pad_token_id if pad_token_id is not None else self.generation_config.pad_token_id
     eos_token_id = eos_token_id if eos_token_id is not None else self.generation_config.eos_token_id
     if eos_token_id is not None:
         warnings.warn(
```
same comment here
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Rebased. No idea why the examples test is failing; it doesn't have anything to do with this PR.
Just a heads-up: this commit breaks transformers generation on Apple Silicon, as `torch.isin` is not implemented for the MPS backend.
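For anyone hitting this, a minimal sketch of a workaround, assuming the failure comes from the `torch.isin` call inside the new EOS criterion (the helper below is my own illustration, not something the library shipped at the time):

```python
import torch

def isin_mps_friendly(elements: torch.Tensor, test_elements: torch.Tensor) -> torch.Tensor:
    """torch.isin has no MPS kernel on some PyTorch builds; fall back to a
    broadcasted equality check, which MPS does support."""
    if elements.device.type == "mps":
        return elements.unsqueeze(-1).eq(test_elements.flatten()).any(dim=-1)
    return torch.isin(elements, test_elements)
```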
* add eos stopping criteria
* minor fix
* Update tests/generation/test_stopping_criteria.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* check eos is not None and fix tests
* make style and fixup
* Update src/transformers/generation/stopping_criteria.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* Update tests/generation/test_utils.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* Update tests/generation/test_utils.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* Update src/transformers/generation/__init__.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* Update src/transformers/generation/stopping_criteria.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* Update src/transformers/generation/stopping_criteria.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* Update src/transformers/generation/stopping_criteria.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* camel case everywhere
* call stopping criteria list for candidate ids
* make style and fixup
* Empty commit
* Empty commit to pass flaky test
* set max length in PromptLookupCandidateGenerator
* Update src/transformers/generation/utils.py (Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>)
* lets fix this typo in docs
* Update src/transformers/generation/utils.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* Update src/transformers/generation/utils.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* update PR
* empty commit

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
What does this PR do?
This PR is a small step towards `torch.compile` and `generate` compatibility. It moves the `EOS` token check into a stopping criterion, so now we can loop `while stopping_criteria` and get rid of the extra `EOS` checks at the end of each generate method. All the generate tests, including the slow ones, are passing.
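As a toy illustration of the loop shape this enables (everything below is a sketch: the real decoding loops carry far more state, and the per-sequence bool return of the criteria call is an assumption):

```python
import torch

def toy_decode(input_ids, step, stopping_criteria):
    # EOS is folded into `stopping_criteria`, so the loop body needs no separate EOS check.
    while not stopping_criteria(input_ids, None).all():
        next_tokens = step(input_ids)  # hypothetical one-step sampler, shape (batch,)
        input_ids = torch.cat([input_ids, next_tokens[:, None]], dim=-1)
    return input_ids

# Tiny demo: an EOS-only criterion (eos id 2) and a sampler that always emits EOS.
criterion = lambda ids, scores: ids[:, -1] == 2
step = lambda ids: torch.full((ids.shape[0],), 2, dtype=torch.long)
print(toy_decode(torch.tensor([[0, 1]]), step, criterion))  # tensor([[0, 1, 2]])
```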
Before submitting
- Did you read the contributor guideline, Pull Request section?
- Was this discussed/approved via a GitHub issue or the forum? Please add a link to it if that's the case.
- Did you make sure to update the documentation with your changes? Here are the documentation guidelines, and here are tips on formatting docstrings.
Who can review?
@gante