[Speculators][Speculative Decoding] Add Qwen Eagle3 Support #21835

dsikka · 2025-07-29T14:04:28Z

Purpose

Extends the Qwen3 definition to work as a target model for Eagle3
Requires: [Speculative Decoding] Add speculators config support #21345
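
For readers who want to try the change, here is a minimal sketch of how a Qwen3 target model might be paired with an Eagle3 draft model through vLLM's offline `LLM` entry point. The helper names `build_llm_kwargs` and `generate` are hypothetical, and both model identifiers are placeholders rather than checkpoints named in this PR. The `speculative_config` keys follow vLLM's documented dictionary form, which may differ between versions.

```python
from typing import Any


def build_llm_kwargs(target_model: str, draft_model: str,
                     num_speculative_tokens: int = 3) -> dict[str, Any]:
    """Assemble keyword arguments for vllm.LLM with an Eagle3 draft model."""
    if num_speculative_tokens < 1:
        raise ValueError("num_speculative_tokens must be at least 1")
    return {
        "model": target_model,
        "speculative_config": {
            "method": "eagle3",
            "model": draft_model,
            "num_speculative_tokens": num_speculative_tokens,
        },
    }


def generate(prompt: str, llm_kwargs: dict[str, Any]) -> str:
    """Run one greedy generation; requires vLLM and a suitable accelerator."""
    # Imported lazily so build_llm_kwargs can be used without vLLM installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**llm_kwargs)
    params = SamplingParams(temperature=0.0, max_tokens=64)
    outputs = llm.generate([prompt], params)
    return outputs[0].outputs[0].text


# Placeholder identifiers (assumptions): substitute a real Qwen3 checkpoint
# and an Eagle3 draft model trained against it.
kwargs = build_llm_kwargs("Qwen/Qwen3-8B", "path/to/qwen3-eagle3-draft")
```

Calling `generate("Hello", kwargs)` would then load both models. The configuration is built separately so it can be inspected or logged before any weights are loaded.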

Test Plan

Adds smoke tests to test the speculator with a quantized target model
Tested performance with the dense target model:

[0.713 0.469]
conditional
[0.713 0.657]

github-actions · 2025-07-29T14:04:36Z

👋 Hi! Thank you for contributing to the vLLM project.

💬 Join our developer Slack at https://slack.vllm.ai to discuss your PR in #pr-reviews, coordinate on features in #feat- channels, or join special interest groups in #sig- channels.

Just a reminder: PRs do not trigger a full CI run by default. Instead, only the fastcheck CI runs, which covers a small, essential subset of CI tests to catch errors quickly. You can run other CI tests on top of those by going to your fastcheck build in the Buildkite UI (linked in the PR checks section) and unblocking them. If you do not have permission to unblock, ping simon-mo or khluu to add you to our Buildkite org.

Once the PR is approved and ready to go, your PR reviewer(s) can run CI to test the changes comprehensively before merging.

To run CI, PR reviewers can either add the ready label to the PR or enable auto-merge.

🚀

gemini-code-assist

Code Review

This pull request introduces support for Qwen models in speculative decoding by adding a generic 'speculators' configuration format. The changes are extensive, touching model configuration, argument parsing, and adding new config classes for speculators.

My review has identified a couple of critical bugs:

Incorrect loop logic in vllm/config.py that would lead to faulty validation of supported models.
A potential TypeError in vllm/transformers_utils/configs/speculators/base.py due to unsafe dictionary access.

I've also pointed out a weak assertion in a new test file that should be strengthened to prevent future regressions.

Please address these issues to ensure the stability and correctness of this new feature.
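
The review does not show the flagged code, so the following is only a sketch of the general failure mode behind the "unsafe dictionary access" finding. Chaining a subscript onto `dict.get(...)` raises `TypeError` when the key is absent, because `get` returns `None`. The key names (`speculators_config`, `verifier`, `name_or_path`) and both function names are hypothetical stand-ins, not taken from `base.py`.

```python
from typing import Any, Optional


def unsafe_verifier_name(config: dict[str, Any]) -> str:
    # config.get(...) returns None when the key is missing, and
    # subscripting None raises TypeError.
    return config.get("speculators_config")["verifier"]["name_or_path"]


def safe_verifier_name(config: dict[str, Any]) -> Optional[str]:
    """Return the verifier name, or None if any level is missing or malformed."""
    section = config.get("speculators_config")
    if not isinstance(section, dict):
        return None
    verifier = section.get("verifier")
    if not isinstance(verifier, dict):
        return None
    name = verifier.get("name_or_path")
    return name if isinstance(name, str) else None


complete = {"speculators_config": {"verifier": {"name_or_path": "target-model"}}}
incomplete = {"architectures": ["SomeModel"]}
```

With these inputs, `safe_verifier_name(incomplete)` returns `None`, while `unsafe_verifier_name(incomplete)` raises `TypeError`. Whether a missing section should return `None` or raise a descriptive error depends on how the caller treats configs that are not in speculators format.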

vllm/config.py

vllm/transformers_utils/configs/speculators/base.py

tests/speculative_decoding/speculators/test_eagle3.py

mergify · 2025-07-30T10:17:21Z

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @dsikka.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

tests/speculative_decoding/speculators/test_eagle3.py

vllm/model_executor/models/qwen2.py

mgoin

LGTM, thanks!

DarkLight1337 · 2025-08-02T02:43:28Z

vllm/model_executor/models/qwen3.py

        self.make_empty_intermediate_tensors = (
            self.model.make_empty_intermediate_tensors)

+    def set_aux_hidden_state_layers(self, layers: tuple[int]) -> None:


Let's add an explicit interface for supporting eagle3 (SupportsEagle3) in another PR
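
As a rough illustration of what such an explicit interface could look like, here is a sketch using `typing.Protocol`. It is not the definition that vLLM adopted. Only `set_aux_hidden_state_layers` appears in this PR's diff. The getter method, the `ToyTargetModel` class, and the layer indices it returns are assumptions made for the example.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsEagle3(Protocol):
    """Structural interface for target models usable with an Eagle3 drafter."""

    def set_aux_hidden_state_layers(self, layers: tuple[int, ...]) -> None:
        ...

    # Hypothetical companion method, not shown in this PR's diff.
    def get_eagle3_aux_hidden_state_layers(self) -> tuple[int, ...]:
        ...


class ToyTargetModel:
    """Stand-in for a model class; it satisfies the protocol structurally."""

    def __init__(self, num_layers: int) -> None:
        self.num_layers = num_layers
        self.aux_hidden_state_layers: tuple[int, ...] = ()

    def set_aux_hidden_state_layers(self, layers: tuple[int, ...]) -> None:
        self.aux_hidden_state_layers = layers

    def get_eagle3_aux_hidden_state_layers(self) -> tuple[int, ...]:
        # Illustrative choice: an early, a middle, and a late layer.
        return (2, self.num_layers // 2, self.num_layers - 3)


class PlainModel:
    """A model without the Eagle3 hooks."""


model = ToyTargetModel(num_layers=32)
model.set_aux_hidden_state_layers(model.get_eagle3_aux_hidden_state_layers())
```

Because the protocol is `runtime_checkable`, a runner could guard the Eagle3 path with `isinstance(model, SupportsEagle3)` rather than probing for the method by name. That check verifies only that the methods exist, not their signatures.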

mergify bot added the llama (Related to Llama models), qwen (Related to Qwen models), and speculative-decoding labels Jul 29, 2025

gemini-code-assist bot reviewed Jul 29, 2025

vllm/config.py (outdated, resolved)

vllm/transformers_utils/configs/speculators/base.py (outdated, resolved)

tests/speculative_decoding/speculators/test_eagle3.py (outdated, resolved)

dsikka changed the title ~~[Speculators][Speculative Decoding] Add support for Qwen~~ [Speculators][Speculative Decoding] Add support for Qwen Eagle3 Support Jul 29, 2025

dsikka changed the title ~~[Speculators][Speculative Decoding] Add support for Qwen Eagle3 Support~~ [Speculators][Speculative Decoding] Add Qwen Eagle3 Support Jul 30, 2025

mergify bot added the needs-rebase label Jul 30, 2025

dsikka force-pushed the spec_eagle3_qwen branch from e347164 to f766c5a Compare August 1, 2025 12:50

mergify bot removed the needs-rebase label Aug 1, 2025

dsikka marked this pull request as ready for review August 1, 2025 13:40

dsikka requested review from WoosukKwon, hmellor, houseroad, mgoin, robertgshaw2-redhat, sighingnow, simon-mo, tlrmchlsmth and youkaichao as code owners August 1, 2025 13:40

mgoin reviewed Aug 1, 2025

tests/speculative_decoding/speculators/test_eagle3.py (outdated, resolved)

vllm/model_executor/models/qwen2.py (outdated, resolved)

dsikka requested a review from mgoin August 1, 2025 13:59

mgoin approved these changes Aug 1, 2025

mgoin added the ready label (ONLY add when PR is ready to merge/full CI is needed) Aug 1, 2025

dsikka added 5 commits August 1, 2025 17:44

support qwen

9f53493

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

extend qwen3 definition such that it is runnable with eagle3

8400b23

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

add test case; fix condition

62da274

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

format

6c4f64b

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

just use quant verifiers

f25048d

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

fix condition

95eb570

Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

dsikka force-pushed the spec_eagle3_qwen branch from 6612d13 to 95eb570 Compare August 1, 2025 17:44

DarkLight1337 reviewed Aug 2, 2025

vllm-bot merged commit 9f9c38c into vllm-project:main Aug 2, 2025
41 of 43 checks passed

kzjeef mentioned this pull request Aug 2, 2025

[Speculators][Speculative Decoding] Add Eagle3 Support For HunYuan Model #22080

Open

4 tasks

npanpaliya pushed a commit to odh-on-pz/vllm-upstream that referenced this pull request Aug 6, 2025

[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (vllm-pro…

48af7fa

…ject#21835) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

noamgat pushed a commit to noamgat/vllm that referenced this pull request Aug 9, 2025

[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (vllm-pro…

19fe1ac

…ject#21835) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com> Signed-off-by: Noam Gat <noamgat@gmail.com>

paulpak58 pushed a commit to paulpak58/vllm that referenced this pull request Aug 13, 2025

[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (vllm-pro…

b851ea6

…ject#21835) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com> Signed-off-by: Paul Pak <paulpak58@gmail.com>

PapaGoose mentioned this pull request Aug 21, 2025

[Speculators][Speculative Decoding] Fix Qwen 2 Eagle3 Support #23337

Merged

epwalsh pushed a commit to epwalsh/vllm that referenced this pull request Aug 28, 2025

[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (vllm-pro…

e5f526d

…ject#21835) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

zhewenl pushed a commit to zhewenl/vllm that referenced this pull request Aug 28, 2025

[Speculators][Speculative Decoding] Add Qwen Eagle3 Support (vllm-pro…

a1f8e9c

…ject#21835) Signed-off-by: Dipika Sikka <dipikasikka1@gmail.com>

wwl2755 mentioned this pull request Sep 3, 2025

[Bug]: Support qwen3 Models in eagle3 Speculative Decoding #23464

Closed

1 task
