[BugFix] Fix Qwen3-next break #3428

wxsIcey · 2025-10-13T14:16:04Z

What this PR does / why we need it?

Fix the Qwen3NextGatedDeltaNet break caused by vllm-project/vllm#26437

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

from vllm import LLM, SamplingParams


def main():
    prompts = [
        "窗前明月光，",
        "The president of the United States is Mr.",
        "The capital of France is",
        "The future of AI is",
        "感时花溅泪，",
        "家书抵万金啥意思？",
        "plz tell me a story: ",
    ]

    # Create a sampling params object.
    sampling_params = SamplingParams(max_tokens=100, temperature=0.6, top_k=40, top_p=0.95)
    # Create an LLM from the locally cached ModelScope checkpoint.
    llm = LLM(
        model="/root/.cache/modelscope/hub/models/Qwen/Qwen3-Next-80B-A3B-Instruct",
        tensor_parallel_size=4,
        enforce_eager=True,
        trust_remote_code=True,
        max_model_len=256,
        gpu_memory_utilization=0.7,
        block_size=64,
    )

    # Generate texts from the prompts and print each prompt with its completion.
    outputs = llm.generate(prompts, sampling_params)
    for output in outputs:
        prompt = output.prompt
        generated_text = output.outputs[0].text
        print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")


if __name__ == "__main__":
    main()

vLLM version: v0.11.0rc3
vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

github-actions · 2025-10-13T14:16:12Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
Write the commit message by fulfilling the PR description to help reviewers and future developers understand.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

wxsIcey · 2025-10-13T14:17:03Z

needs #3719

gemini-code-assist

Code Review

This pull request addresses a bug with the Qwen3-next model on vLLM. The changes involve vendoring a Triton kernel for layer normalization and fixing a configuration issue in the Mamba model setup. The fixes appear to be correct and address the immediate problem. My primary feedback concerns the long-term maintainability of vendoring the Triton kernel. While this is a valid approach for a hotfix, it introduces risks of divergence from the upstream vLLM project. I've provided a comment with a suggestion to mitigate this risk.

vllm_ascend/ops/fla.py

wxsIcey · 2025-10-13T14:19:17Z

Incompatible with upstream changes because of this error: 'torch_npu._C._NPUDeviceProperties' object has no attribute 'multi_processor_count'
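
For illustration only (this is not the change made in this PR): the error above occurs when code reads `multi_processor_count` from a device-properties object that does not define it. A minimal sketch of a defensive lookup is shown below. `FakeNPUProperties`, `FakeCUDAProperties`, `get_core_count`, and the fallback value are all hypothetical; the stand-in classes only mimic objects with and without the attribute, and torch_npu is not imported.

```python
class FakeNPUProperties:
    """Hypothetical stand-in for a properties object that lacks
    `multi_processor_count` (torch_npu itself is not used here)."""
    name = "npu-stand-in"


class FakeCUDAProperties:
    """Hypothetical stand-in for a properties object that has the attribute."""
    name = "cuda-stand-in"
    multi_processor_count = 108


def get_core_count(props, default=1):
    """Return props.multi_processor_count if present, otherwise `default`.

    `default` is an assumed placeholder; a real backend would need a
    hardware-appropriate value.
    """
    return getattr(props, "multi_processor_count", default)


if __name__ == "__main__":
    print(get_core_count(FakeCUDAProperties()))
    print(get_core_count(FakeNPUProperties(), default=1))
```

Using `getattr` with a default avoids the `AttributeError`, but it only hides the missing attribute; whether a fallback value is acceptable depends on how the kernel uses the count.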

github-actions · 2025-10-14T13:33:03Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

wxsIcey · 2025-10-17T06:33:57Z

Test Results:

Prompt: '窗前明月光，', Generated text: '疑是地上霜。\nA. 举头望明月\nB. 疑是地上霜\nC. 举头望明月\nD. 低头思故乡\n答案:\n C\n\n在1999年3月14日，以____为单位的欧元在11个欧洲国家开始使用\nA. 1989\nB. 1990\nC. 1998\nD. 199'
Prompt: 'The president of the United States is Mr.', Generated text: " Joe Biden. In 2021, President Biden declared that he would take steps to protect the rights of the LGBTQ+ community. He signed an executive order on January 20, 2021, directing federal agencies to ensure that all federal laws and policies are enforced in a way that protects the rights of LGBTQ+ individuals. This action was taken in response to the Trump administration's policies, which had rolled back many protections for LGBTQ+ individuals. The Biden administration has also taken"
Prompt: 'The capital of France is', Generated text: ' Paris. What is the capital city of France?\n\nThe capital of France is Paris.'
Prompt: 'The future of AI is', Generated text: ' a topic of intense debate and exploration, particularly as the field continues to evolve at an unprecedented pace. One of the most intriguing and influential figures in this domain is Yann LeCun, a French computer scientist and a leading figure in the development of artificial intelligence (AI) and deep learning. In 2021, LeCun delivered a keynote address that sparked intense debate within the AI community, challenging the conventional wisdom that scale alone—increasing the number of parameters in neural networks—'
Prompt: '感时花溅泪，', Generated text: '恨别鸟惊心。是写实还是写意？请结合你对本诗的理解，谈谈对这一问题的看法。\n\n这个问题涉及对杜甫《春望》一诗的解读，尤其是对“感时花溅泪，恨别鸟惊心”这一联的解读。我们先明确题干：“**感时花溅泪，恨别鸟惊心**”出自杜甫《春望》，而非“感”与“不”——题干中'
Prompt: '家书抵万金啥意思？', Generated text: '？？？？？？？？？？？？？\n\n“**海阔凭鱼跃，天高任鸟飞**”——这句话你可能听多了，但你问的这个“**书信**”其实是**杜甫的诗**，不是“书信”哦 😊\n\n我们来认真、温柔地聊聊～\n\n---\n\n🌟 **你问的其实是：**  \n> “**烽火连三月，家书抵万金**” ——'
Prompt: 'plz tell me a story: ', Generated text: ' a dog named Max, 1000 years old, who is a magical dog and can turn into a doggo\n\n**Title: Max, the Dog Who Lived a Thousand Years (And Still Barked Like a Goof)**\n\nLong, long ago—before the pyramids stood tall, before kings wore crowns made of gold instead of just *thinking* they looked cool in hats—there lived a dog.\n\nNot just *any* dog.\n\nA **great, goofy,

It seems there is an accuracy problem
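
As an illustration of how such outputs could be screened automatically (an assumption, not the method used in this PR), the sketch below checks each generated text for a substring one would expect in a sensible completion. `check_outputs` and the expectation table are hypothetical. Because the test script samples with temperature=0.6, outputs vary between runs, so this is only a coarse smoke test.

```python
def check_outputs(results, expectations):
    """Return the prompts whose generated text lacks the expected substring.

    results:      dict mapping prompt -> generated text
    expectations: dict mapping prompt -> substring expected in the text
    """
    failures = []
    for prompt, expected in expectations.items():
        text = results.get(prompt, "")
        if expected not in text:
            failures.append(prompt)
    return failures


if __name__ == "__main__":
    # Shortened excerpts of two of the completions posted above.
    results = {
        "The capital of France is": " Paris. What is the capital city of France?",
        "窗前明月光，": "疑是地上霜。",
    }
    # Hypothetical expectations chosen for illustration.
    expectations = {
        "The capital of France is": "Paris",
        "窗前明月光，": "疑是地上霜",
    }
    print(check_outputs(results, expectations))
```

A substring check like this catches gross failures but cannot detect subtler degradation; comparing against a reference backend with greedy decoding would be a stricter test.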

github-actions · 2025-10-20T01:43:54Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

wxsIcey · 2025-10-20T07:07:44Z

It seems there is an accuracy problem

This has been solved by 6c65dd8

Signed-off-by: Icey <1790571317@qq.com>

wxsIcey · 2025-10-23T08:04:40Z

Incompatible with upstream changes because of this error: 'torch_npu._C._NPUDeviceProperties' object has no attribute 'multi_processor_count'

This has been solved by #3549

Signed-off-by: Icey <1790571317@qq.com>

github-actions bot added the module:ops label Oct 13, 2025

gemini-code-assist bot reviewed Oct 13, 2025

github-actions bot added the merge-conflicts label Oct 14, 2025

wxsIcey force-pushed the fix_qwen_next branch from b90d222 to ecc7122 Compare October 15, 2025 03:22

github-actions bot removed the merge-conflicts label Oct 15, 2025

wxsIcey force-pushed the fix_qwen_next branch from a139175 to 9757e24 Compare October 15, 2025 08:01

wxsIcey changed the title ~~[BugFix][main] Fix Qwen3-next because of vllm #26207~~ [BugFix][0.11.1] Fix Qwen3-next because of vllm #26207 Oct 16, 2025

wxsIcey changed the title ~~[BugFix][0.11.1] Fix Qwen3-next because of vllm #26207~~ [BugFix][0.11.1] Fix Qwen3-next Oct 17, 2025

wxsIcey changed the title ~~[BugFix][0.11.1] Fix Qwen3-next~~ [BugFix][main] Fix Qwen3-next break Oct 17, 2025

github-actions bot added the merge-conflicts label Oct 20, 2025

fix Qwen3NextGatedDeltaNet

917315e

Signed-off-by: Icey <1790571317@qq.com>

wxsIcey force-pushed the fix_qwen_next branch from 41a63f6 to 917315e Compare October 23, 2025 07:50

github-actions bot added merge-conflicts and removed merge-conflicts module:ops labels Oct 23, 2025

wxsIcey changed the title ~~[BugFix][main] Fix Qwen3-next break~~ [BugFix][0.11.1] Fix Qwen3-next break Oct 23, 2025

github-actions bot removed the merge-conflicts label Oct 23, 2025

compatible 0.11.0

840ec54

Signed-off-by: Icey <1790571317@qq.com>

wxsIcey added the ready and ready-for-test labels Oct 25, 2025

wxsIcey changed the title ~~[BugFix][0.11.1] Fix Qwen3-next break~~ [BugFix] Fix Qwen3-next break Oct 25, 2025

wxsIcey mentioned this pull request Oct 25, 2025

Upgrade to new vllm commit #3719

Merged

wangxiyuan approved these changes Oct 25, 2025

wxsIcey closed this Oct 25, 2025

wxsIcey reopened this Oct 25, 2025

MengqingCao approved these changes Oct 25, 2025

MengqingCao merged commit bb5f16d into vllm-project:main Oct 25, 2025
63 checks passed

MengqingCao mentioned this pull request Oct 27, 2025

[Bugfix][Qwen3-Next] Fix Qwen3-Next with the latest maintained vllm #3741

Closed
