[Bugfix] Catch and log invalid token ids in detokenizer #24351

njhill · 2025-09-05T23:53:20Z

There is a token overflow issue with quantized models that happens occasionally: #21951

The exact cause is not yet clear and it appears to be difficult to reproduce.

This change just insulates against the error, skipping the problem token but still logging the exception. It also prints the token id in question (suspected to be negative) which should help with further diagnosis.

Signed-off-by: Nick Hill <nhill@redhat.com>
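
For readers who want to see the shape of the workaround described above, here is a minimal sketch, not the actual patch. It assumes the decode step raises `OverflowError` when handed a negative token id. The names `fake_step`, `protected_step`, and `detokenize` are hypothetical and do not come from the vLLM codebase.

```python
import logging
from typing import Callable, Iterable, Optional

logger = logging.getLogger("detokenizer_sketch")


def fake_step(token_id: int) -> str:
    """Hypothetical stand-in for the real incremental decode step.

    Assumption: a negative id triggers OverflowError, as the PR suspects.
    """
    if token_id < 0:
        raise OverflowError("can't convert negative int to unsigned")
    return f"<tok{token_id}>"


def protected_step(step: Callable[[int], str], token_id: int) -> Optional[str]:
    """Run one decode step, returning None if the token id is invalid."""
    try:
        return step(token_id)
    except OverflowError:
        # Log the traceback plus the offending id to help later diagnosis.
        logger.exception("Encountered invalid token id: %r", token_id)
        return None


def detokenize(step: Callable[[int], str], token_ids: Iterable[int]) -> str:
    """Decode a sequence of ids, skipping any that fail."""
    pieces = []
    for token_id in token_ids:
        piece = protected_step(step, token_id)
        if piece is not None:
            pieces.append(piece)
    return "".join(pieces)
```

Calling `detokenize(fake_step, [1, -5, 2])` skips the negative id, logs it, and carries on with the remaining tokens instead of failing the whole request.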

gemini-code-assist

Code Review

This pull request introduces a targeted bugfix to handle a rare OverflowError within the detokenizer. The change correctly isolates the OverflowError, logs the exception with the problematic token ID for future diagnostics, and ensures the system can continue processing by treating the failed token as None. This is a robust and appropriate way to handle an intermittent, hard-to-reproduce issue. The implementation is sound and effectively insulates the system from this specific failure mode.

This is an update to the "workaround" added in vllm-project#24351. That PR insulates against negative token ids that are occasionally produced, though we still don't know the root cause. With the update to tokenizers 0.22.1, this error manifests as a TypeError rather than an OverflowError, so the patch needs to be updated to account for this.

Signed-off-by: Nick Hill <nhill@redhat.com>
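
The follow-up described above can be sketched as below. This is only an illustration under stated assumptions: the two step functions are hypothetical stand-ins that mimic the old and new failure modes, and the `TypeError` text is borrowed from the title of #26438. How the real patch distinguishes this `TypeError` from unrelated ones is not shown here.

```python
import logging
from typing import Callable, Optional

logger = logging.getLogger("detokenizer_sketch_v2")

# Exception types treated as "invalid token id" in this sketch.
INVALID_ID_ERRORS = (OverflowError, TypeError)


def old_style_step(token_id: int) -> str:
    """Hypothetical step mimicking the earlier behaviour (OverflowError)."""
    if token_id < 0:
        raise OverflowError("can't convert negative int to unsigned")
    return f"<tok{token_id}>"


def new_style_step(token_id: int) -> str:
    """Hypothetical step mimicking the behaviour reported after the upgrade."""
    if token_id < 0:
        raise TypeError(
            "argument 'id': StreamInput must be either an integer "
            "or a list of integers"
        )
    return f"<tok{token_id}>"


def protected_step_v2(step: Callable[[int], str], token_id: int) -> Optional[str]:
    """Run one decode step, tolerating either failure mode."""
    try:
        return step(token_id)
    except INVALID_ID_ERRORS:
        # Log the traceback and the offending id, then skip the token.
        logger.exception("Encountered invalid token id: %r", token_id)
        return None
```

Catching `TypeError` this broadly could also hide unrelated bugs, so a real fix may want a narrower check. The sketch keeps it simple to show only the change in exception type.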

[Bugfix] Catch and log invalid token ids in detokenizer

9c10ff7

Signed-off-by: Nick Hill <nhill@redhat.com>

njhill requested review from WoosukKwon, alexm-redhat, comaniac, robertgshaw2-redhat and ywang96 as code owners September 5, 2025 23:53

mergify bot added the v1 label Sep 5, 2025

gemini-code-assist bot reviewed Sep 5, 2025

njhill mentioned this pull request Sep 5, 2025

[Bug]: Incremental detokenization error when running llama-3.3-70b-fp8 model #21951

Open

njhill added the ready label Sep 6, 2025

DarkLight1337 approved these changes Sep 6, 2025

vllm-bot merged commit 6432739 into vllm-project:main Sep 6, 2025
46 of 48 checks passed

njhill deleted the detokenizer-debug branch September 6, 2025 19:17

eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025

[Bugfix] Catch and log invalid token ids in detokenizer (vllm-project…

3f69dac

…#24351) Signed-off-by: Nick Hill <nhill@redhat.com>

skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025

[Bugfix] Catch and log invalid token ids in detokenizer (vllm-project…

9fa1d3d

…#24351) Signed-off-by: Nick Hill <nhill@redhat.com>

FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025

[Bugfix] Catch and log invalid token ids in detokenizer (vllm-project…

c10a740

…#24351) Signed-off-by: Nick Hill <nhill@redhat.com>

njhill mentioned this pull request Oct 8, 2025

[Bug]: TypeError: argument 'id': StreamInput must be either an integer or a list of integers #26438

Closed

1 task

njhill mentioned this pull request Oct 8, 2025

[Bugfix] Catch and log invalid token ids in detokenizer #2 #26445

Merged

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025

[Bugfix] Catch and log invalid token ids in detokenizer (vllm-project…

a22082e

…#24351) Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[Bugfix] Catch and log invalid token ids in detokenizer (vllm-project…

b88b8f1

…#24351) Signed-off-by: Nick Hill <nhill@redhat.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

3 participants
