Conversation

@njhill njhill commented Sep 5, 2025

There is an intermittent token overflow issue with quantized models: #21951

The exact cause is not yet clear and it appears to be difficult to reproduce.

This change simply insulates against the error, skipping the problem token while still logging the exception. It also prints the token id in question (suspected to be negative), which should help with further diagnosis.
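The workaround described above can be sketched roughly as follows. This is a hypothetical helper, not the actual vLLM patch (which lives in the incremental detokenizer); the function name and logging call are assumptions for illustration:

```python
import logging

logger = logging.getLogger(__name__)

def safe_decode_token(tokenizer, token_id):
    """Decode a single token id, insulating against the rare overflow error.

    Hypothetical sketch of the workaround: skip the problem token but
    still log the exception, including the token id in question
    (suspected to be negative).
    """
    try:
        return tokenizer.decode([token_id])
    except OverflowError:
        logger.exception("Encountered invalid token id: %s", token_id)
        # Treat the failed token as None so processing can continue.
        return None
```

Treating the failed token as None lets the detokenization loop continue with subsequent tokens instead of aborting the whole request.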

Signed-off-by: Nick Hill <nhill@redhat.com>
@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a targeted bugfix to handle a rare OverflowError within the detokenizer. The change correctly isolates the OverflowError, logs the exception with the problematic token ID for future diagnostics, and ensures the system can continue processing by treating the failed token as None. This is a robust and appropriate way to handle an intermittent, hard-to-reproduce issue. The implementation is sound and effectively insulates the system from this specific failure mode.

@njhill njhill added the ready ONLY add when PR is ready to merge/full CI is needed label Sep 6, 2025
@vllm-bot vllm-bot merged commit 6432739 into vllm-project:main Sep 6, 2025
46 of 48 checks passed
@njhill njhill deleted the detokenizer-debug branch September 6, 2025 19:17
eicherseiji pushed a commit to eicherseiji/vllm that referenced this pull request Sep 9, 2025
skyloevil pushed a commit to skyloevil/vllm that referenced this pull request Sep 13, 2025
FeiDaLI pushed a commit to FeiDaLI/vllm that referenced this pull request Sep 25, 2025
njhill added a commit to njhill/vllm that referenced this pull request Oct 8, 2025
This is an update to the "workaround" added in vllm-project#24351.

That PR insulates against negative token ids that are occasionally produced, though the root cause is still unknown.

With the update to tokenizers 0.22.1, this error manifests as a TypeError rather than an OverflowError, so the patch needs to be updated to account for this.
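The updated patch can be sketched by widening the except clause to cover both exception types. Again, this is a hypothetical helper for illustration, assuming the same shape as the original workaround rather than reproducing the actual vLLM code:

```python
import logging

logger = logging.getLogger(__name__)

def safe_decode_token(tokenizer, token_id):
    """Decode one token id, tolerating the invalid-token failure.

    Hypothetical sketch: with tokenizers 0.22.1 the bad token id
    surfaces as a TypeError rather than an OverflowError, so both
    are caught and handled identically.
    """
    try:
        return tokenizer.decode([token_id])
    except (OverflowError, TypeError):
        logger.exception("Encountered invalid token id: %s", token_id)
        # As before, skip the problem token and continue.
        return None
```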

Signed-off-by: Nick Hill <nhill@redhat.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…#24351)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…#24351)

Signed-off-by: Nick Hill <nhill@redhat.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
Labels

ready ONLY add when PR is ready to merge/full CI is needed v1
