[BUG] Qwen3-next MTP. Fix attn metadata build bug #26564

vadiklyutiy · 2025-10-10T04:38:34Z

Purpose

After #24486 was fixed, Qwen3-next with FlashInfer full attention started working without MTP.
With MTP, however, it still fails.

The cause is that the wrong attention metadata type is chosen for the draft model: GDN is selected instead of full attention.

This PR fixes that.

Test Result

Qwen3-next with MTP works now.

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

gemini-code-assist

Code Review

This pull request fixes a bug in attention metadata building for Qwen3-next with MTP by introducing a more robust method _get_attention_metadata_builder to find the correct attention metadata builder. The change is logical and effectively addresses the issue. I've added one high-severity comment to handle a potential IndexError in the new helper function if no draft attention layers are found.
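
The potential IndexError mentioned in this review can be pictured with a small sketch. The function name `pick_draft_layer` and the error message are hypothetical and are not taken from the PR; the sketch only assumes that the helper reads the first entry of a list of draft attention layer names.

```python
def pick_draft_layer(attn_layer_names):
    """Return the first draft attention layer name, failing loudly if none exist.

    Hypothetical helper for illustration; not the code in eagle.py.
    """
    if not attn_layer_names:
        # Indexing [0] on an empty list would raise a bare IndexError;
        # an explicit error states what actually went wrong.
        raise ValueError("no draft attention layers found")
    return attn_layer_names[0]


# "mtp.attn" and "mtp.attn2" are made-up layer names.
first_layer = pick_draft_layer(["mtp.attn", "mtp.attn2"])
```

Whether to raise, assert, or fall back in the empty case is a design choice for the actual helper; the sketch only shows the explicit check.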

chatgpt-codex-connector

💡 Codex Review

vllm/vllm/v1/spec_decode/eagle.py

Lines 1039 to 1044 in 8c913a5

    
    chosen_layer = self.attn_layer_names[0]
    for kv_cache_group in self.runner.attn_groups:
        for attn_group in kv_cache_group:
            if chosen_layer in attn_group.layer_names:
                builder = attn_group.get_metadata_builder()

Respect current ubatch when fetching attention metadata builder

The previous implementation selected the attention metadata builder using dbo_current_ubatch_id() so each DBO ubatch used its own builder instance. _get_attention_metadata_builder() now calls attn_group.get_metadata_builder() with no argument, which defaults to ubatch_id=0. When parallel_config.enable_dbo is true (the runner allocates two builders per group), both ubatches will now reuse builder 0 and can overwrite each other’s buffers while building metadata, leading to incorrect attention metadata or races during drafting. This helper should still select the builder corresponding to the current ubatch id.
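
A minimal sketch of the lookup this comment asks for, using stand-in objects rather than the real vLLM classes. `FakeAttnGroup`, `find_builder`, the layer names, and the builder strings are all hypothetical. The only things taken from the thread are the nested loop over attention groups and the fact that `get_metadata_builder()` defaults to `ubatch_id=0`; in the real code the id would presumably come from `dbo_current_ubatch_id()`.

```python
from dataclasses import dataclass, field


@dataclass
class FakeAttnGroup:
    """Hypothetical stand-in for an attention group; not the vLLM class."""
    layer_names: list
    builders: list = field(default_factory=list)  # one builder per ubatch

    def get_metadata_builder(self, ubatch_id=0):
        # Mirrors the default described in the review: no argument means ubatch 0.
        return self.builders[ubatch_id]


def find_builder(attn_groups, chosen_layer, ubatch_id=0):
    """Return the builder of the group that owns chosen_layer, for ubatch_id."""
    for kv_cache_group in attn_groups:
        for attn_group in kv_cache_group:
            if chosen_layer in attn_group.layer_names:
                # Pass the ubatch id through instead of relying on the default.
                return attn_group.get_metadata_builder(ubatch_id)
    raise ValueError(f"no attention group contains layer {chosen_layer!r}")


# Two KV cache groups: one GDN-like, one full-attention (names are made up).
groups = [
    [FakeAttnGroup(["gdn.0"], ["gdn-b0", "gdn-b1"])],
    [FakeAttnGroup(["mtp.attn"], ["full-b0", "full-b1"])],
]
builder_ubatch0 = find_builder(groups, "mtp.attn")
builder_ubatch1 = find_builder(groups, "mtp.attn", ubatch_id=1)
```

With the id passed through, each ubatch gets its own builder instance; with the default, both calls would return the ubatch 0 builder.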

LucasWilkinson

Oh nice; there's already infra, LGTM!

fix attn metadata build bug for Qwen3-next MTP

8c913a5

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

vadiklyutiy requested review from benchislett and luccafong as code owners October 10, 2025 04:38

mergify bot added the qwen (Related to Qwen models), speculative-decoding, and v1 labels Oct 10, 2025

gemini-code-assist bot reviewed Oct 10, 2025

chatgpt-codex-connector bot reviewed Oct 10, 2025

LucasWilkinson approved these changes Oct 10, 2025

LucasWilkinson enabled auto-merge (squash) October 10, 2025 05:12

github-actions bot added the ready label (ONLY add when PR is ready to merge/full CI is needed) Oct 10, 2025

fix CI bug

5120188

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

auto-merge was automatically disabled October 10, 2025 15:24
Head branch was pushed to by a user without write access

benchislett merged commit e94cfd5 into vllm-project:main Oct 10, 2025
45 checks passed

benchislett deleted the vadim/qwen3-next-mtp-fix branch October 10, 2025 18:59

Dhruvilbhatt pushed a commit to Dhruvilbhatt/vllm that referenced this pull request Oct 14, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

e46ad3e

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com> Signed-off-by: Dhruvil Bhatt <bhattdbh@amazon.com>

bbartels pushed a commit to bbartels/vllm that referenced this pull request Oct 16, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

afcda4e

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com> Signed-off-by: bbartels <benjamin@bartels.dev>

lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

b4c51ae

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

ff75606

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com>

xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

a08d48a

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

a7978cc

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>

0xrushi pushed a commit to 0xrushi/vllm that referenced this pull request Oct 26, 2025

[BUG] Qwen3-next MTP. Fix attn metadata build bug (vllm-project#26564)

fd07d8e

Signed-off-by: Vadim Gimpelson <vadim.gimpelson@gmail.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
