[Bugfix] Support Qwen3-MOE on aclgraph mode #1381
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #1381 +/- ##
===========================================
+ Coverage 27.39% 52.34% +24.95%
===========================================
Files 56 78 +22
Lines 6191 9641 +3450
===========================================
+ Hits 1696 5047 +3351
- Misses 4495 4594 +99
Flags with carried forward coverage won't be shown.
@ApsarasX Have you evaluated various combinations of |
|
@ApsarasX Thanks for your PR. I tried Qwen/Qwen3-30B-A3B on the main branch, and the issue also exists there. Run mode: Run partial log: Send request: Request return: The original error information for Qwen/Qwen3-30B-A3B is as follows, the same as in #1368
|
@ApsarasX is it ready to merge? Any idea about #1368 (comment)? |
I don't think they are the same error; the two should be unrelated.
|
The @ApsarasX, could you please rebase and add me as a co-author? Thank you.
I have added you as a co-author. Could you please handle these corner cases in the future?
Yeah, already on my schedule. |
PR ready, please merge |
|
@Yikun Please review |
|
please add e2e test for qwen3-moe as well |
|
You can add the model test at https://github.com/vllm-project/vllm-ascend/blob/main/tests/e2e/singlecard/test_aclgraph.py#L32 by running the reduced-layer model:
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com> Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
|
Did a double confirm on: And added an e2e test for the Qwen aclgraph case. LGTM. Thanks all @ApsarasX @yiz-liu @leo-pony @wangxiyuan
### What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PR vllm-project#1297 vllm-project#1446

Thanks to @yiz-liu for providing the idea

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on Qwen3-30B-A3B

Closes: vllm-project#1368

---------

Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>
What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PR #1297 #1446

Thanks to @yiz-liu for providing the idea
Does this PR introduce any user-facing change?
No
How was this patch tested?
Tested on Qwen3-30B-A3B
Closes: #1368
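
The description above says only that the shape of the `npu_moe_init_routing` input parameters was fixed; the diff itself is not reproduced here. As a rough illustration of the kind of shape consistency a graph-captured MoE routing step tends to need, here is a minimal pure-Python sketch. It does not call `torch_npu`, and the helper names (`build_row_idx`, `routing_input_shapes`) and the column-major index layout are assumptions made for illustration, not the code from this PR.

```python
from typing import Dict, List, Tuple


def build_row_idx(num_tokens: int, top_k: int) -> List[List[int]]:
    """Build a (num_tokens, top_k) index grid (hypothetical layout).

    Entry [t][k] equals k * num_tokens + t, which is the transpose of
    range(num_tokens * top_k) laid out as (top_k, num_tokens).
    """
    if num_tokens < 0 or top_k <= 0:
        raise ValueError("num_tokens must be >= 0 and top_k must be > 0")
    return [[k * num_tokens + t for k in range(top_k)]
            for t in range(num_tokens)]


def routing_input_shapes(num_tokens: int, hidden_size: int,
                         top_k: int) -> Dict[str, Tuple[int, int]]:
    """Derive every routing input shape from the same num_tokens.

    Assumption: in a graph-captured run the token dimension is fixed at
    capture time, so all inputs must agree on it.
    """
    row_idx = build_row_idx(num_tokens, top_k)
    return {
        "hidden_states": (num_tokens, hidden_size),
        "row_idx": (len(row_idx), top_k),
        "expert_idx": (num_tokens, top_k),
    }


if __name__ == "__main__":
    print(build_row_idx(3, 2))
    print(routing_input_shapes(3, 8, 2))
```

The point of the sketch is that every input takes its leading dimension from one `num_tokens` value, so a padded or captured batch size cannot drift between the inputs.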