refactor select_experts of moe module #2150

shiyuan680 · 2025-07-31T13:07:59Z

What this PR does / why we need it?

This PR refactors select_experts in the MoE module.
It merges the quantized and non-quantized implementations into a single new class.
Callers use it in the same style as vLLM, e.g. ExpertsSelector.select_experts.
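
To illustrate the call style described above, here is a minimal pure-Python sketch of one selector shared by a quantized and a non-quantized path. Only the name ExpertsSelector.select_experts comes from this PR. The signature, the softmax top-k routing, the `renormalize` flag, and the two caller classes are assumptions made for illustration and do not reflect the actual vllm-ascend code.

```python
import math
from typing import List, Sequence, Tuple


class ExpertsSelector:
    """Hypothetical shared entry point for expert selection (illustrative only)."""

    @staticmethod
    def select_experts(
        router_logits: Sequence[Sequence[float]],
        top_k: int,
        renormalize: bool = True,
    ) -> Tuple[List[List[float]], List[List[int]]]:
        """Return per-token top-k routing weights and expert ids."""
        all_weights: List[List[float]] = []
        all_ids: List[List[int]] = []
        for row in router_logits:
            # Numerically stable softmax over the experts for one token.
            peak = max(row)
            exps = [math.exp(v - peak) for v in row]
            total = sum(exps)
            probs = [e / total for e in exps]
            # Keep the top_k experts with the highest probability.
            ids = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
            weights = [probs[i] for i in ids]
            if renormalize:
                # Rescale so the kept weights sum to 1.
                kept = sum(weights)
                weights = [w / kept for w in weights]
            all_weights.append(weights)
            all_ids.append(ids)
        return all_weights, all_ids


class UnquantizedMoEMethod:
    """Hypothetical non-quantized caller."""

    def route(self, router_logits, top_k):
        return ExpertsSelector.select_experts(router_logits, top_k)


class QuantizedMoEMethod:
    """Hypothetical quantized caller; only the expert matmuls would differ."""

    def route(self, router_logits, top_k):
        return ExpertsSelector.select_experts(router_logits, top_k)


logits = [[1.0, 3.0, 2.0, 0.5]]
weights, ids = UnquantizedMoEMethod().route(logits, top_k=2)
same_weights, same_ids = QuantizedMoEMethod().route(logits, top_k=2)
```

Because both hypothetical callers go through the same static method, routing behavior cannot drift between the two paths, which is the modularity benefit the PR is aiming for.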

Does this PR introduce any user-facing change?

No

How was this patch tested?

Tested with qwen3-moe and all unit tests (UT).

vLLM version: v0.10.0
vLLM main: vllm-project/vllm@e188592

github-actions · 2025-07-31T13:10:13Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

github-actions · 2025-07-31T13:53:55Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

- A PR should do only one thing; smaller PRs enable faster reviews.
- Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by other future PRs.
- Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

ApsarasX · 2025-07-31T15:11:11Z

I think this PR helps improve code modularity, but it would be better to use a clearer title and pass CI.

github-actions · 2025-08-07T01:18:04Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

weijinqian0 · 2025-08-07T09:25:38Z

vllm_ascend/ops/moe_layer/select_experts.py

+    def apply(self, router_logits: torch.Tensor, x: torch.Tensor):
+
+        return super().apply(router_logits, x)
+


Do not anticipate requirements in advance; it may be necessary to remove both the quantized and the non-quantized implementations.

shiyuan680 · Aug 7, 2025

I will merge the quantized and non-quantized implementations so that there is just one class.
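
The point raised in this review thread can be sketched in a few lines: an override that only forwards to the parent behaves exactly like having no override at all, so the extra subclass adds no behavior. Only the `apply(self, router_logits, x)` signature is taken from the reviewed diff; the class names and the argmax routing body are hypothetical and are not the project's code.

```python
class BaseSelector:
    def apply(self, router_logits, x):
        # Pick the highest-scoring expert index for each token.
        return [max(range(len(row)), key=row.__getitem__) for row in router_logits]


class DelegatingSelector(BaseSelector):
    def apply(self, router_logits, x):
        # Same shape as the reviewed diff: the override only forwards to the parent.
        return super().apply(router_logits, x)


class PlainSelector(BaseSelector):
    # No override: inherits apply() unchanged.
    pass


logits = [[0.1, 0.9], [0.7, 0.2]]
tokens = [None, None]  # placeholder for x, unused in this sketch
delegated = DelegatingSelector().apply(logits, tokens)
inherited = PlainSelector().apply(logits, tokens)
```

Both calls return the same result, which is why collapsing the variants into one class loses nothing until a real behavioral difference is required.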

codecov · 2025-08-12T08:31:50Z

Codecov Report

❌ Patch coverage is 97.70115% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 76.02%. Comparing base (992271b) to head (4a904e6).
⚠️ Report is 6 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vllm_ascend/quantization/w4a8_dynamic.py | 50.00% | 1 Missing ⚠️ |
| vllm_ascend/quantization/w8a8_dynamic.py | 50.00% | 1 Missing ⚠️ |

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2150      +/-   ##
==========================================
+ Coverage   75.74%   76.02%   +0.27%     
==========================================
  Files         118      119       +1     
  Lines       13525    13507      -18     
==========================================
+ Hits        10245    10269      +24     
+ Misses       3280     3238      -42

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `76.02% <97.70%> (+0.27%)` | ⬆️ |
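
The percentages in this report can be checked from the hit and line counts in the coverage diff. The sketch below assumes that values are truncated rather than rounded to two decimals, which is an inference from these particular numbers and not documented Codecov behavior. It also assumes the patch touched 87 coverable lines with 85 covered, which is the count implied by "97.70115% with 2 lines missing".

```python
import math
from fractions import Fraction


def pct_truncated(value: Fraction) -> str:
    """Format a non-negative ratio as a percentage truncated to two decimals."""
    hundredths = math.floor(value * 10000)
    return f"{hundredths // 100}.{hundredths % 100:02d}%"


# Hits / Lines taken from the coverage diff above.
base = Fraction(10245, 13525)   # main
head = Fraction(10269, 13507)   # this PR
delta = head - base

# Assumed patch size: 85 covered out of 87 changed lines (2 missing).
patch = Fraction(85, 87)

base_pct = pct_truncated(base)
head_pct = pct_truncated(head)
delta_pct = pct_truncated(delta)
patch_pct = pct_truncated(patch)
```

Under these assumptions the computed values match the report: 75.74% for base, 76.02% for head, a 0.27% increase, and 97.70% patch coverage. Plain rounding would instead give 75.75% and +0.28%.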

github-actions · 2025-08-12T13:16:15Z

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: yangcheng <yangcheng104@huawei.com>

### What this PR does / why we need it?

This PR refactors select_experts in the MoE module. It merges the quantized and non-quantized implementations into a single new class, used in the same style as vLLM, e.g. ExpertsSelector.select_experts.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Tested with qwen3-moe and all unit tests (UT).

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@e188592

Signed-off-by: yangcheng <yangcheng104@huawei.com>
Co-authored-by: yangcheng (AJ) <y00806874@china.huawei.com>

…nalinaly

…nalinaly (#3406)

I'd like to nominate 4 new maintainers for vllm-ascend:

----

Yizhou Liu [@yiz-liu](https://github.com/yiz-liu)

----

**Review Quality**: He has completed [40+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Ayiz-liu) and provided solutions or guides for [10+ issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3Ayiz-liu), including many quality reviews such as [#issue-3428408401](#3002 (comment)), [#discussion_r2224572309](#1803 (comment)), [#issuecomment-2982470226](#1261 (comment)), [#issuecomment-2903621197](#836 (comment)), [#issuecomment-2857678691](#778 (comment)).

**Sustained and High-Quality Contributions:** He has contributed [30+ commits](https://github.com/vllm-project/vllm-ascend/commits?author=yiz-liu) since Mar 2025. His aclgraph, DP, and EP related contributions in particular are the main reason I nominated him. As the owner of aclgraph support, he continuously improves aclgraph stability and performance and fixes key bugs. He laid the groundwork for EP-related functionality and delivered multiple foundational improvements.

**Community involvement:** He has a very good habit of logging issues: #1649, and is also very active and involved in [many issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aopen%20commenter%3Ayiz-liu%20-author%3Ayiz-liu) to help users resolve problems.

----

Peng Yu [@paulyu12](https://github.com/paulyu12)

---

The main reasons for his nomination are his expertise in LoRA and his sustained and major contributions (initial support, docs, bug fixes) around LoRA.

**Sustained and Major Contributions:** @paulyu12 started contributing with [LoRA and Multi-LoRA support](697908f) in Apr 2025, and has contributed about [10+ commits and bugfixes](697908f) to vllm-ascend.

**Review Quality and Community Involvement:** He has also helped 10+ users address [LoRA related issues](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Apaulyu12+-author%3Apaulyu12+is%3Aclosed). I believe his addition will further improve vLLM Ascend LoRA support.

----

Jinqian Wei [@weijinqian0](https://github.com/weijinqian0)

---

The main reasons for his nomination are his key contributions to the RL scene and the high quality of his code reviews.

**Review Quality:** He has completed [60+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Aweijinqian0+is%3Aopen+-author%3Aweijinqian0) since June 2025, including high-quality reviews such as [#comment-3284055430](#2791 (comment)), [discussion_r2332166704](#2817 (comment)), [discussion_r2343289692](#2846 (comment)).

**Sustained and Quality Contributions:** He has a deep understanding of the vLLM and vLLM Ascend codebases and solid contributions in the RL scene (about [10+ PRs merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Aweijinqian0+is%3Amerged+) and 10+ PRs merged as co-author).

- Code Refactor: As a co-author, he participated in the refactoring of the MoE module #2150 #2706 #2867
- Performance Enhancement for RL: Participated as a co-author in the design and development of the solution, contributing to the planning of core capabilities. #1547 #2120 and so on.

So I think he's a great addition to the vLLM Ascend Maintainer team.

----

Chuanyu Qin [@nalinaly](https://github.com/nalinaly)

---

The main reason I nominated Qinchuanyu is that he is the initial designer of aclgraph and torch-npu, two key components of vllm-ascend. Considering that aclgraph will eventually become the main path for vllm-ascend's graph mode, I propose to nominate him.

**Sustained and Major Contributions:** Chuanyu has actively helped the users and developers of vllm-ascend since Mar 2025 ([vllm-discuss#162](https://discuss.vllm.ai/t/can-ascend-officially-draft-a-documentation-on-the-vllm-ascend-adaptation-for-graph-mode/162/5)), and also helped early users of vllm-ascend understand aclgraph. He provided a lot of help in the process of integrating aclgraph with vllm-ascend.

**Community Involvement:** As a speaker, he also gave a talk to help users understand aclgraph and torch_npu: [《The design philosophy of torch_npu and the high performance principle of aclGraph》](https://github.com/PyTorch-China/pytorch-meetup/blob/main/beijing-2025/%E3%80%905%E3%80%91torch_npu%20%E7%9A%84%E8%AE%BE%E8%AE%A1%E5%93%B2%E5%AD%A6%E4%B8%8E%20aclGraph%20%E9%AB%98%E6%80%A7%E8%83%BD%E5%8E%9F%E7%90%86-%E7%A7%A6%E4%BC%A0%E7%91%9C-0920.pdf)

----

They have made active contributions to vllm-ascend or have rich experience with Ascend AI. Welcome!

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>

github-actions bot added the merge-conflicts label Jul 31, 2025

github-actions bot added module:ops module:quantization labels Jul 31, 2025

shiyuan680 force-pushed the refactor branch from 3f11265 to 29ed064 Compare August 5, 2025 07:56

github-actions bot removed the merge-conflicts label Aug 5, 2025

shiyuan680 force-pushed the refactor branch 2 times, most recently from 0d881c9 to dd7feef Compare August 6, 2025 07:12

github-actions bot added the module:tests label Aug 6, 2025

shiyuan680 force-pushed the refactor branch 2 times, most recently from a06f159 to 71923e1 Compare August 6, 2025 08:27

github-actions bot added the merge-conflicts label Aug 7, 2025

weijinqian0 reviewed Aug 7, 2025

shiyuan680 force-pushed the refactor branch 2 times, most recently from b3ea828 to cb5f422 Compare August 7, 2025 12:08

github-actions bot removed module:tests merge-conflicts labels Aug 7, 2025

shiyuan680 changed the title ~~refactor fused_moe.py~~ refactor fused_moe.py of select_expert Aug 7, 2025

shiyuan680 force-pushed the refactor branch from cb5f422 to db6437f Compare August 7, 2025 12:25

github-actions bot added the module:tests label Aug 7, 2025

shiyuan680 force-pushed the refactor branch 7 times, most recently from 53f315c to f59fa14 Compare August 8, 2025 06:06

shiyuan680 force-pushed the refactor branch 13 times, most recently from 9dff67c to ea006b1 Compare August 12, 2025 08:13

shiyuan680 force-pushed the refactor branch 2 times, most recently from a1c9235 to bc3fb6e Compare August 12, 2025 12:06

github-actions bot added the merge-conflicts label Aug 12, 2025

shiyuan680 force-pushed the refactor branch from bc3fb6e to 2519860 Compare August 13, 2025 01:36

github-actions bot removed the merge-conflicts label Aug 13, 2025

refactor

4a904e6

Signed-off-by: yangcheng <yangcheng104@huawei.com>

shiyuan680 force-pushed the refactor branch from 2519860 to 4a904e6 Compare August 13, 2025 01:43

shiyuan680 mentioned this pull request Aug 13, 2025

[RFC]: Refactoring fused_moe #2321

Open

wangxiyuan approved these changes Aug 14, 2025

wangxiyuan merged commit e14f2ef into vllm-project:main Aug 14, 2025
25 checks passed

wangxiyuan mentioned this pull request Oct 13, 2025

[Community] Nominate new maintainers: @yiz-liu @paulyu12 @weijinqian0 @nalinaly #3406

Merged
