Conversation

@Flechman Flechman commented Oct 5, 2025

Purpose

This PR adds tests for the fix in #26231.

Test Plan

Test Result


Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing the test command.
  • The test results, such as pasting a before/after results comparison or e2e results.
  • (Optional) Necessary documentation updates, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

Signed-off-by: Rémi Delacourt <remi@mistral.ai>

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request provides a bugfix for padded speculative decoding, specifically within the prepare_next_token_ids_padded function. The change removes a faulty special case for single-token generation (max_gen_len == 1) that did not correctly handle discarded requests. This could have resulted in invalid token IDs being used in subsequent steps. The new implementation generalizes the validity masking logic, making it simpler, more robust, and correct for all scenarios. The fix is sound and improves the correctness of the speculative decoding implementation.

Comment on lines 525 to 527
valid_mask = (valid_sampled_token_ids_gpu != -1) & (
    valid_sampled_token_ids_gpu < gpu_input_batch.vocab_size
)
high

This simplification is a great improvement. By removing the special case for max_gen_len == 1, the code is more robust. The previous logic did not account for discarded requests when max_gen_len == 1, which could lead to using an invalid token ID of -1. This unified approach correctly handles all cases.
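To illustrate the unified masking idea discussed above, here is a minimal, self-contained sketch (not vLLM's actual `prepare_next_token_ids_padded` implementation; the function and variable names below are hypothetical): a single mask treats both -1 placeholders from discarded requests and out-of-vocabulary ids as invalid, with no special case for the single-token path.

```python
# Sketch of the unified validity-masking approach, assuming a padded
# [num_reqs, 1] tensor of sampled token ids where discarded requests
# hold the placeholder value -1.
import torch

def next_token_validity_mask(sampled_token_ids: torch.Tensor,
                             vocab_size: int) -> torch.Tensor:
    """True where a sampled token id is usable as a next-token id."""
    return (sampled_token_ids != -1) & (sampled_token_ids < vocab_size)

# Hypothetical batch: request 1 was discarded (-1), request 2 produced
# an id outside the vocabulary; both are masked out the same way.
ids = torch.tensor([[5], [-1], [32000]])
mask = next_token_validity_mask(ids, vocab_size=32000)
# Invalid slots can then be replaced with a safe token id (here 0)
# before the next step, instead of propagating -1 downstream.
safe_ids = torch.where(mask, ids, torch.zeros_like(ids))
```

Because the mask is computed for every request regardless of generation length, the discarded-request case no longer needs its own branch.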

@Flechman Flechman changed the title [Bugfix] Padded Specdec with Chunked Prefill [Bugfix] Padded Eagle Specdec with Chunked Prefill Oct 6, 2025
Signed-off-by: Rémi Delacourt <remi@mistral.ai>
@benchislett

This bug has been fixed in #26231. I think it would still be nice to merge updated tests, so please update your PR with the fix from main if you wish to continue.

@Flechman

Flechman commented Oct 7, 2025

@benchislett sounds good! Done.

Signed-off-by: Rémi Delacourt <remi@mistral.ai>
@benchislett benchislett left a comment


LGTM

@benchislett benchislett added the ready ONLY add when PR is ready to merge/full CI is needed label Oct 8, 2025
@benchislett benchislett enabled auto-merge (squash) October 8, 2025 17:01
auto-merge was automatically disabled October 13, 2025 14:36

Head branch was pushed to by a user without write access

@Flechman

@benchislett I made the test require GPUs. It passes locally on 4xH100.


Labels: ready (ONLY add when PR is ready to merge/full CI is needed), speculative-decoding, v1

2 participants