[Spec Decode][CI] Add e2e test for `examples/spec_decode.py` and prevent breaking Acceptance Length #24531

ekagra-ranjan · 2025-09-09T19:59:34Z

The examples/offline_inference/spec_decode.py script has broken a few times in the past due to changes in datasets or elsewhere in the codebase. The script is important because it is what allows measuring acceptance length (AL) for speculative decoding (SD) methods.

This PR adds the script to CI and ensures two things:

- the example script is in working condition
- the AL of the default method, i.e., Eagle, is measured during CI as an e2e test, since many SD code paths use Eagle-related components.
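
As a rough illustration of the second point, an AL regression check can be sketched as a tolerance comparison against a reference value. The function name, the reference value, and the tolerance below are hypothetical and are not taken from the PR; the actual assertions live in the script behind its `--test` flag.

```python
def check_acceptance_length(
    measured_al: float, expected_al: float, rel_tol: float = 0.05
) -> None:
    """Fail if the measured AL drifts from the reference by more than rel_tol.

    Hypothetical helper: the name, reference value, and tolerance are
    illustrative assumptions, not the values used in the PR.
    """
    lower = expected_al * (1 - rel_tol)
    upper = expected_al * (1 + rel_tol)
    assert lower <= measured_al <= upper, (
        f"acceptance length {measured_al:.2f} is outside "
        f"[{lower:.2f}, {upper:.2f}]"
    )


# Passes: the measured value matches the (assumed) reference.
check_acceptance_length(measured_al=2.29, expected_al=2.29)
```

A relative tolerance keeps the check from failing on small run-to-run differences while still catching a real drop in acceptance.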

Testing

cmd
time python3 examples/offline_inference/spec_decode.py --test --method eagle --num_spec_tokens 3 --dataset-name hf --dataset-path philschmid/mt-bench --num-prompts 80 --temp 0 --top-p 1.0 --top-k -1 --tp 1 --enable-chunked-prefill

Output

Adding requests: 100%|███████████████████████████████████████████████████████████████████████████████████| 80/80 [00:00<00:00, 12256.43it/s]
Processed prompts: 100%|██████████████████████████| 80/80 [00:02<00:00, 38.61it/s, est. speed input: 3886.77 toks/s, output: 8174.11 toks/s]
--------------------------------------------------
total_num_output_tokens: 16936
num_drafts: 7403
num_draft_tokens: 22209
num_accepted_tokens: 9535
mean acceptance length: 2.29
--------------------------------------------------
acceptance at token 0: 0.68
acceptance at token 1: 0.39
acceptance at token 2: 0.21
Test passed!

real    0m31.999s
user    0m52.270s
sys     0m6.709s
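
The reported mean acceptance length can be reproduced from the counters printed above. This is a small sketch that assumes AL is defined as one plus the number of accepted draft tokens per draft (each verification step yields one token from the target model in addition to any accepted draft tokens); that definition is an assumption here, though it is consistent with the numbers in this output.

```python
# Counters copied from the output above.
num_drafts = 7403
num_draft_tokens = 22209
num_accepted_tokens = 9535
num_spec_tokens = 3  # value passed via --num_spec_tokens

# Each draft proposes num_spec_tokens tokens.
assert num_draft_tokens == num_drafts * num_spec_tokens

# Assumed definition: one target-model token per step plus accepted drafts.
mean_acceptance_length = 1 + num_accepted_tokens / num_drafts

# Fraction of all proposed draft tokens that were accepted.
overall_acceptance_rate = num_accepted_tokens / num_draft_tokens

print(f"mean acceptance length: {mean_acceptance_length:.2f}")  # 2.29
print(f"accepted / drafted tokens: {overall_acceptance_rate:.2f}")
```

The per-position acceptance rates (0.68, 0.39, 0.21) sum to about 1.28, which matches accepted tokens per draft up to rounding.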

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

gemini-code-assist

Code Review

This pull request introduces an end-to-end test for the spec_decode.py example script, integrating it into the CI pipeline to safeguard against regressions in speculative decoding acceptance length. The approach of refactoring the script for testability and adding assertions for a fixed test case is sound. My review focuses on improving the robustness and maintainability of these new tests. I've identified a missing assertion for a critical test parameter and a formatting issue in an assertion message that could hinder debugging. Addressing these points will make the new test more reliable.

examples/offline_inference/spec_decode.py


wwl2755

Great job maintaining this! I think it is worthwhile to keep the example scripts valid through CI, since they may be where people get started.

Link to #22992 for visibility.

.buildkite/test-pipeline.yaml

benchislett

It's not clear to me if this should be EAGLE1 or EAGLE3 or both, but in any case this is good to have.

examples/offline_inference/spec_decode.py


mergify · 2025-09-21T23:16:52Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @ekagra-ranjan.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork


…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io>

…ent breaking Acceptance Length (#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: yewentao256 <zhyanwentao@126.com>

…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: gaojc <1055866782@qq.com>

…ent breaking Acceptance Length (vllm-project#24531) Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com> Co-authored-by: Roger Wang <hey@rogerw.io> Signed-off-by: xuebwang-amd <xuebwang@amd.com>


add e2e test for examples/spec_decode.py and monitor AL

dbc4457

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

mergify bot added the documentation, ci/build, and speculative-decoding labels Sep 9, 2025

gemini-code-assist bot reviewed Sep 9, 2025

examples/offline_inference/spec_decode.py (resolved)

examples/offline_inference/spec_decode.py (outdated, resolved)

lint

cfb88fd

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

wwl2755 approved these changes Sep 9, 2025


ekagra-ranjan changed the title ~~[Spec Decode] Add e2e test for examples/spec_decode.py and prevent breaking Acceptance Length~~ [Spec Decode][CI] Add e2e test for examples/spec_decode.py and prevent breaking Acceptance Length Sep 9, 2025

benchislett reviewed Sep 9, 2025

.buildkite/test-pipeline.yaml (outdated, resolved)

benchislett approved these changes Sep 10, 2025


wwl2755 reviewed Sep 11, 2025

examples/offline_inference/spec_decode.py (outdated, resolved)

ekagra-ranjan added 3 commits September 18, 2025 14:44

Merge branch 'main' of https://github.com/vllm-project/vllm into er-sd-e2e-test

265683f

add eagle3

a7b75fe

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

lint

ea726fd

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

ywang96 added the ready label Sep 19, 2025

Merge branch 'main' into er-sd-e2e-test

f003480

ywang96 enabled auto-merge (squash) September 19, 2025 19:10

mergify bot added the needs-rebase label Sep 21, 2025

ekagra-ranjan added 2 commits September 22, 2025 15:49

max model len OOM

08355b6

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

Merge branch 'er-sd-e2e-test' of github.com:ekagra-ranjan/vllm into er-sd-e2e-test

5054613

auto-merge was automatically disabled September 22, 2025 15:49
Head branch was pushed to by a user without write access

resolve conflict

c68a9b9

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

mergify bot removed the needs-rebase label Sep 22, 2025

ekagra-ranjan added 5 commits September 22, 2025 14:57

Merge branch 'main' into er-sd-e2e-test

628d242

Merge branch 'main' into er-sd-e2e-test

1f97e6d

Merge branch 'main' into er-sd-e2e-test

60cc92f

Merge branch 'main' of https://github.com/vllm-project/vllm into er-sd-e2e-test

1a9f6cc

update AL

7745c2c

Signed-off-by: Ekagra Ranjan <3116519+ekagra-ranjan@users.noreply.github.com>

ywang96 approved these changes Sep 23, 2025


ywang96 merged commit 867ecdd into vllm-project:main Sep 23, 2025
78 checks passed
