[main][Bugfix] Fix unable to load qwen3_moe quantized weights #2219
Conversation
Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
- If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
```python
max_model_len=8192,
dtype=dtype,
tensor_parallel_size=2,
quantization="ascend",
```
See #2223; let's enable aclgraph for this test.
Done
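For reference, here is a minimal sketch of how the reviewed configuration might be exercised end to end, assuming vLLM's `LLM` entry point; the model path and prompt are placeholders, not taken from the PR. Leaving `enforce_eager` at its default (False) is what lets graph capture (aclgraph on Ascend) run, which is the point of the review comment:

```python
# Sketch only: MODEL_PATH and the prompt are illustrative placeholders.
from vllm import LLM, SamplingParams

MODEL_PATH = "Qwen/Qwen3-30B-A3B"  # hypothetical W8A8-quantized checkpoint

llm = LLM(
    model=MODEL_PATH,
    max_model_len=8192,
    dtype="auto",
    tensor_parallel_size=2,
    quantization="ascend",
    # enforce_eager is deliberately not set to True, so graph mode
    # (aclgraph on Ascend) stays enabled, per the review and #2223.
)
outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
print(outputs[0].outputs[0].text)
```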
Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
Codecov Report ❌

Additional details and impacted files

```
@@            Coverage Diff             @@
##             main    #2219      +/-   ##
==========================================
- Coverage   76.75%   76.65%   -0.11%
==========================================
  Files         113      113
  Lines       12743    12763      +20
==========================================
+ Hits         9781     9783       +2
- Misses       2962     2980      +18
```

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
[main][Bugfix] Fix unable to load qwen3_moe quantized weights (vllm-project#2219)

### What this PR does / why we need it?
Fixes unable to load `qwen3_moe` quantized weights issue due to vllm-project#1994

### Does this PR introduce _any_ user-facing change?
None

### How was this patch tested?
Add a `qwen3_moe` W8A8 quantized model in `tests/e2e/multicard/test_qwen3_moe.py`

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@c494f96

---------

Signed-off-by: zhoux77899 <zhouxiang100@huawei.com>
What this PR does / why we need it?
Fixes the inability to load `qwen3_moe` quantized weights due to #1994.

Does this PR introduce any user-facing change?
None
How was this patch tested?
Adds a `qwen3_moe` W8A8 quantized model in `tests/e2e/multicard/test_qwen3_moe.py`.
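A hedged sketch of what the added multicard case might look like, assuming the usual pytest conventions; the model identifier and test name are placeholders, and the authoritative version is the file named above:

```python
# Hypothetical sketch of the added W8A8 e2e case; the model name below
# is a placeholder, not the checkpoint actually used in the PR.
import pytest
from vllm import LLM, SamplingParams

QUANT_MODELS = ["vllm-ascend/Qwen3-30B-A3B-W8A8"]  # assumed placeholder


@pytest.mark.parametrize("model", QUANT_MODELS)
@pytest.mark.parametrize("dtype", ["auto"])
def test_models_distributed_qwen3_moe_w8a8(model, dtype):
    llm = LLM(
        model=model,
        max_model_len=8192,
        dtype=dtype,
        tensor_parallel_size=2,
        quantization="ascend",
    )
    outputs = llm.generate(["The capital of France is"],
                           SamplingParams(max_tokens=8))
    # The regression under test is weight loading itself (see #1994);
    # a non-empty generation is a cheap sanity check on top.
    assert outputs[0].outputs[0].text
```

On a multi-NPU machine, such a test would be run with something like `pytest tests/e2e/multicard/test_qwen3_moe.py`.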