Conversation

@faaany (Contributor) commented Sep 3, 2025

Purpose

This PR updates the _protected_step method to accommodate the DecodeStream changes in tokenizers 0.22.0, fixing the following unit test failure:

pytest -rA tests/v1/engine/test_fast_incdec_prefix_err.py::test_fast_inc_detok_invalid_utf8_err_case

Error Log:

self = <vllm.v1.engine.detokenizer.FastIncrementalDetokenizer object at 0x7f0bae57d9f0>, next_token_id = 237167

    def _protected_step(self, next_token_id: int) -> Optional[str]:
        try:
>           token = self.stream.step(self.tokenizer, next_token_id)
E           Exception: Invalid prefix encountered while decoding stream. Token ID: 237167, Expected prefix: ' ', Actual string: '���»'

vllm/v1/engine/detokenizer.py:235: Exception

Since the latest transformers v4.56 automatically installs tokenizers 0.22.0, the logic introduced in PR #19449 needs to be updated to work with both tokenizers 0.21.4 and 0.22.0.
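For reference, a minimal sketch of the merged approach (illustrative only: INVALID_PREFIX_ERR_MSG, the logger, and this stripped-down class are assumptions, not the real vLLM code, which lives in vllm/v1/engine/detokenizer.py):

    # tokenizers 0.21.x raises the bare message "Invalid prefix encountered
    # while decoding stream.", while 0.22.0 appends details (token ID,
    # expected prefix, actual string), so matching the full message exactly
    # no longer works; checking the stable start of the message covers both.
    import logging
    from typing import Optional

    from tokenizers import Tokenizer
    from tokenizers.decoders import DecodeStream

    logger = logging.getLogger(__name__)
    INVALID_PREFIX_ERR_MSG = "Invalid prefix encountered"

    class SketchDetokenizer:

        def __init__(self, tokenizer: Tokenizer, request_id: str,
                     skip_special_tokens: bool = False) -> None:
            self.tokenizer = tokenizer
            self.request_id = request_id
            self.skip_special_tokens = skip_special_tokens
            # Keyword argument: required by tokenizers 0.22.0, accepted
            # by 0.21.4 as well.
            self.stream = DecodeStream(
                skip_special_tokens=skip_special_tokens)

        def _protected_step(self, next_token_id: int) -> Optional[str]:
            try:
                token = self.stream.step(self.tokenizer, next_token_id)
            except Exception as e:
                if str(e).startswith(INVALID_PREFIX_ERR_MSG):
                    # Recoverable UTF-8 edge case: warn and reset the
                    # stream instead of failing the whole request.
                    logger.warning(
                        "Encountered invalid prefix detokenization error"
                        " for request %s, resetting decode stream.",
                        self.request_id)
                    self.stream = DecodeStream(
                        skip_special_tokens=self.skip_special_tokens)
                    token = None
                else:
                    raise e
            return token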

Test Result

With the fix in this PR, test_fast_inc_detok_invalid_utf8_err_case passes:

========================================================================= test session starts ==========================================================================
platform linux -- Python 3.10.12, pytest-8.4.1, pluggy-1.6.0
rootdir: /workspace/vllm
configfile: pyproject.toml
plugins: hypothesis-6.136.6, typeguard-4.4.4, anyio-4.9.0
collected 1 item                                                                                                                                                       

tests/v1/engine/test_fast_incdec_prefix_err.py .                                                                                                                 [100%]

================================================================================ PASSES ================================================================================
______________________________________________________________ test_fast_inc_detok_invalid_utf8_err_case _______________________________________________________________
------------------------------------------------------------------------- Captured stdout call -------------------------------------------------------------------------
WARNING 09-03 16:38:16 [detokenizer.py:243] Encountered invalid prefix detokenization error for request test, resetting decode stream.
======================================================================= short test summary info ========================================================================
PASSED tests/v1/engine/test_fast_incdec_prefix_err.py::test_fast_inc_detok_invalid_utf8_err_case
========================================================================== 1 passed in 3.63s ===========================================================================


Signed-off-by: Fanli Lin <fanli.lin@intel.com>
@gemini-code-assist (bot) left a comment:

Code Review

This pull request addresses a compatibility issue with tokenizers==0.22.0 which changed an error message format, causing failures in incremental detokenization. The changes correctly adapt the error handling to work with both old and new versions of the library by using a substring check instead of an exact match on the error message, and by using a keyword argument for DecodeStream which is now required. My review includes a suggestion to make the error message check even more robust by using startswith instead of in, to reduce the chance of incorrectly handling unrelated exceptions.
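To illustrate the suggestion (the message strings below are taken from the error log above and are for demonstration only):

    # startswith() anchors the check at the beginning of the message; a bare
    # "in" check would also match unrelated exceptions that merely quote the
    # phrase somewhere in their text.
    ERR_PREFIX = "Invalid prefix encountered"

    real_err = ("Invalid prefix encountered while decoding stream. "
                "Token ID: 237167, Expected prefix: ' ', Actual string: '...'")
    unrelated_err = "wrapped failure: 'Invalid prefix encountered' was logged"

    assert real_err.startswith(ERR_PREFIX)
    assert not unrelated_err.startswith(ERR_PREFIX)  # correctly not handled
    assert ERR_PREFIX in unrelated_err               # an "in" check would misfire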

faaany and others added 2 commits September 3, 2025 15:34
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: Fanli Lin <fanli0116@gmail.com>
@faaany (Author) commented Sep 3, 2025:

cc @jikunshang @yma11 @rogerxfeng8

" for request %s, resetting decode stream.", self.request_id)
self.stream = DecodeStream(self.skip_special_tokens)
self.stream = DecodeStream(
skip_special_tokens=self.skip_special_tokens)
A contributor commented on the diff:
Is this change necessary? Does DecodeStream introduce more args?

@faaany (Author) replied Sep 3, 2025:
Yes, as can be seen from https://github.com/huggingface/tokenizers/pull/1856/files#diff-780be0b9b76e7260ed2be2249d17c4a879b7e2e98e9f30f26dc9c65501f775d1R672. Otherwise we would get an error that ids should not be of Boolean type.
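For illustration, a minimal sketch of why the keyword form is needed (the exact name of the new leading parameter is in the linked diff; only the boolean keyword is assumed here):

    # In tokenizers 0.21.x the boolean could be passed positionally; in
    # 0.22.0 the constructor gained a new leading parameter (per
    # huggingface/tokenizers#1856), so a positional bool is misread as
    # token ids ("ids should not be Boolean type"). The keyword form
    # works on both versions.
    from tokenizers.decoders import DecodeStream

    stream = DecodeStream(skip_special_tokens=True)  # safe on 0.21.4 and 0.22.0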

@jikunshang (Collaborator) commented:

cc @njhill, please review, thanks!

@njhill (Member) left a comment:

Thanks @faaany!

@njhill added the "ready" label (ONLY add when PR is ready to merge/full CI is needed) Sep 3, 2025
@robertgshaw2-redhat robertgshaw2-redhat changed the title fix incremental detokenization edge case error for tokenizers == 0.22.0 [Bugfix] Fix Incremental Detokenization with tokenizers == 0.22.0 Sep 3, 2025
@jikunshang jikunshang enabled auto-merge (squash) September 4, 2025 01:36
@jikunshang (Collaborator) commented:

Seems CI failed due to an unrelated error; can we force-merge this to unblock other PRs?
cc @njhill @simon-mo

@vllm-bot vllm-bot merged commit 2c301ee into vllm-project:main Sep 4, 2025
36 of 38 checks passed
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
…llm-project#24159)

Signed-off-by: Fanli Lin <fanli.lin@intel.com>
Signed-off-by: Fanli Lin <fanli0116@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
…llm-project#24159)

Signed-off-by: Fanli Lin <fanli.lin@intel.com>
Signed-off-by: Fanli Lin <fanli0116@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…llm-project#24159)

Signed-off-by: Fanli Lin <fanli.lin@intel.com>
Signed-off-by: Fanli Lin <fanli0116@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…llm-project#24159)

Signed-off-by: Fanli Lin <fanli.lin@intel.com>
Signed-off-by: Fanli Lin <fanli0116@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>

Labels: ready (ONLY add when PR is ready to merge/full CI is needed), v1

6 participants