[main] [refactor] refactor common_fused_moe.py #2706
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request refactors the MoE communication methods by introducing a TokenDispatcher abstraction, which is a good move to centralize logic and reduce code duplication. The changes are well-structured and improve the overall design. I've identified a couple of areas for improvement: making a base class method abstract to enforce implementation by subclasses, and removing some redundant code in an __init__ method. Addressing these points will make the implementation more robust and maintainable.
```python
def get_token_dispatcher(self):
    pass
```
The get_token_dispatcher method in the base class MoECommMethod is implemented with pass, which means it returns None. The __init__ method calls this and assigns the result to self.token_dispatcher. Later, permute and unpermute methods attempt to use self.token_dispatcher, which will result in an AttributeError if a subclass does not override get_token_dispatcher. To prevent this and enforce the correct implementation in subclasses, this method should be decorated with @abstractmethod.
Suggested change:

```python
@abstractmethod
def get_token_dispatcher(self):
    pass
```
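As a hedged illustration of why the decorator matters (class and method names follow the snippet above; the dispatcher stand-in is hypothetical): with `@abstractmethod`, Python refuses to instantiate a subclass that forgets to override the hook, turning the late `AttributeError` described above into an early `TypeError`:

```python
from abc import ABC, abstractmethod


class MoECommMethod(ABC):
    """Sketch of the base class with the hook made abstract."""

    def __init__(self):
        # The base __init__ relies on the subclass-provided hook.
        self.token_dispatcher = self.get_token_dispatcher()

    @abstractmethod
    def get_token_dispatcher(self):
        """Subclasses must return a concrete token dispatcher."""


class GoodImpl(MoECommMethod):
    def get_token_dispatcher(self):
        return object()  # stand-in for a real dispatcher


class BadImpl(MoECommMethod):
    pass  # forgot to override the hook


good = GoodImpl()  # fine: token_dispatcher is set
try:
    BadImpl()  # rejected at instantiation time
except TypeError as e:
    print(f"instantiation rejected: {e}")
```

The failure moves from "whenever `permute` happens to run" to the moment the subclass is instantiated, which is much easier to debug.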
```python
num_experts = kwargs["num_experts"]
num_local_experts = kwargs["num_local_experts"]
from vllm_ascend.ops.moe_dispatcher.token_dispatcher import \
    TokenDispatcherWithAll2AllV
# from vllm_ascend.ops.moe_dispatcher.token_dispatcher import \
#     get_token_dispatcher
# self.token_dispatcher = get_token_dispatcher(
#     "TokenDispatcherWithAll2AllV")
self.token_dispatcher = TokenDispatcherWithAll2AllV(
    top_k=self.moe_config.experts_per_token,
    num_experts=self.num_experts,
    num_local_experts=self.num_local_experts)
```
The initialization of self.token_dispatcher and the extraction of num_experts and num_local_experts in AlltoAllCommImpl.__init__ are redundant. The call to super().__init__ already handles this logic by calling the get_token_dispatcher method, which is implemented in this class. This redundant code makes the class harder to maintain and can lead to inconsistencies. It should be removed for clarity and correctness.
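A minimal sketch of the template-method flow this review describes (class names mirror the PR; the bodies are hypothetical stand-ins): because the base `__init__` already invokes the subclass hook, repeating the dispatcher construction in the subclass `__init__` would build a second, redundant dispatcher.

```python
class BaseComm:
    """Stand-in for MoECommMethod: base __init__ calls the hook."""

    def __init__(self, **kwargs):
        self.kwargs = kwargs
        # Template method: construction is delegated to the subclass.
        self.token_dispatcher = self.get_token_dispatcher()

    def get_token_dispatcher(self):
        raise NotImplementedError


class AlltoAllComm(BaseComm):
    """Only the hook is overridden; no dispatcher code in __init__."""

    def get_token_dispatcher(self):
        # Hypothetical dispatcher construction using the base's kwargs
        return {"num_experts": self.kwargs["num_experts"]}


comm = AlltoAllComm(num_experts=8)
```

With this shape, removing the duplicated lines from the subclass `__init__` changes nothing observable: `comm.token_dispatcher` is already populated by the base class.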
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from 36570df to 4f0f1e4
Force-pushed from 03e135f to 885d9ce
Force-pushed from 727453f to a72549b
This pull request has conflicts, please resolve those before we can evaluate the pull request.
Force-pushed from 4c4f55a to 8ceca57
Force-pushed from 01e4390 to c13b334
Co-Authored-By: weijinqian0 <12153182+weijinqian0@users.noreply.github.com> Signed-off-by: weichen <calvin_zhu0210@outlook.com> Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Force-pushed from c13b334 to 432b64a
Codecov Report

❌ Patch coverage is
Additional details and impacted files:

@@            Coverage Diff             @@
##             main    #2706      +/-   ##
==========================================
+ Coverage   72.61%   73.48%   +0.87%
==========================================
  Files         154      157       +3
  Lines       21319    21519     +200
==========================================
+ Hits        15480    15813     +333
+ Misses       5839     5706     -133

Flags with carried forward coverage won't be shown.
it's very clear now.
```python
    TokenDispatcherWithMC2)


class MoECommMethod(ABC):
```
nits: I think an abstract class should just contain the unified API, without the detailed implementation, and carry more comments like the original MoECommMethod class. Maybe we could add a MoECommMethodBase class that inherits from MoECommMethod and contains the detailed prepare and other member functions. We could do this in the next PR.
good point, will be fixed in my next PR.
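To illustrate the split proposed above (only `MoECommMethod` comes from the PR; `MoECommMethodBase` and the method bodies are hypothetical sketches of the suggestion, not the actual implementation): the abstract class keeps just the unified API, while a base class carries the shared detail that concrete communication methods inherit.

```python
from abc import ABC, abstractmethod


class MoECommMethod(ABC):
    """Pure interface: unified API only, no implementation details."""

    @abstractmethod
    def prepare(self, hidden_states):
        """Pre-process hidden states before expert computation."""

    @abstractmethod
    def finalize(self, hidden_states):
        """Post-process hidden states after expert computation."""


class MoECommMethodBase(MoECommMethod):
    """Shared default implementation, as the review suggests."""

    def prepare(self, hidden_states):
        # Hypothetical common pre-processing (e.g. padding/gather)
        return hidden_states

    def finalize(self, hidden_states):
        # Hypothetical common post-processing (e.g. reduce/scatter)
        return hidden_states


class MC2CommImpl(MoECommMethodBase):
    """Concrete methods override only what differs from the base."""


impl = MC2CommImpl()
out = impl.finalize(impl.prepare([1.0, 2.0]))
```

The interface stays small and heavily commented, and concrete classes depend on the base only for the behavior they actually share.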
```python
class FusedMoEPrepareAndFinalize(ABC):

    def __init__(self, moe_config: FusedMoEConfig):
```
Let's add more comments to FusedMoEPrepareAndFinalize and all of its member functions.
same, comments will be added in my next PR.
### What this PR does / why we need it?

1. Move prepare/finalize operation from moe_comm_method to /ops/moe/fused_moe_prepare_and_finalize
2. Adapt to token_dispatcher in moe_comm_method
3. Move moe_comm_method/experts_selector/token_dispatcher/fused_moe_prepare_and_finalize to /ops/moe

### Does this PR introduce _any_ user-facing change?

no

### How was this patch tested?

e2e & ut

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@f4962a6

Signed-off-by: weichen <calvin_zhu0210@outlook.com>
Signed-off-by: Pr0Wh1teGivee <calvin_zhu0210@outlook.com>
Co-authored-by: weijinqian0 <12153182+weijinqian0@users.noreply.github.com>
Signed-off-by: 1Fire4 <wangdingyi2@huawei.com>
…nalinaly (#3406) I'd like to nominate 4 new maintainers for vllm-ascend: ---- Yizhou Liu [@yiz-liu](https://github.com/yiz-liu) ---- **Review Quality**: He has completed [40+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Ayiz-liu) and provided solutions or guides for [10+ issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3Ayiz-liu), which includes many quality review like [#issue-3428408401](#3002 (comment)), [#discussion_r2224572309](#1803 (comment)), [#issuecomment-2982470226](#1261 (comment)), [#issuecomment-2903621197](#836 (comment)), [#issuecomment-2857678691](#778 (comment)). **Sustained and High-Quality Contributions:** He has contributed more than [30+ commits](https://github.com/vllm-project/vllm-ascend/commits?author=yiz-liu) since Mar.2025, especially, aclgraph, DP, and EP related contributions are the main reason why I nominated him. As the owner of aclgraph support, he continuously improves aclgraph stability and performance as well as fixes key bugs. he laid the groundwork for EP-related functionality and delivered multiple foundational improvements **Community involvement:** He has a very good habit of logging issues:#1649 and is also very active and involved in [many issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aopen%20commenter%3Ayiz-liu%20-author%3Ayiz-liu) to help users resolve issues. ---- Peng Yu [@paulyu12](https://github.com/paulyu12) --- The main reasons for his nomination are his expertise and key contributions to the LORA and sustained and major contributions (initial support/doc/bugfix) around Lora. **Sustained and Major Contributions:** @paulyu12 starts his contribution with [Lora and Mulit-Lora support](697908f) since Apr 2025, he contributed about [10+ commits and bugfixes](697908f) on vllm-ascend. 
**Review Quality and Community Involvement:** He also helped more than 10+ users address [Lora related issues](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Apaulyu12+-author%3Apaulyu12+is%3Aclosed). I believe his addition will further improve vLLM Ascend Lora support. ---- Jinqian Wei [@weijinqian0](https://github.com/weijinqian0) --- The main reasons for his nomination are his key contributions to the RL scene and the high quality of his code reviews. **Review Quality:** He has completed [60+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Aweijinqian0+is%3Aopen+-author%3Aweijinqian0) since June. 2025, include [#comment-3284055430](#2791 (comment)), [discussion_r2332166704](#2817 (comment)), [discussion_r2343289692](#2846 (comment)) high quality review. **Sustained and Quality Contributions:** He has Deep understanding of vLLM and vLLM Ascend codebases and solid contributions in RL scene (about [10+ PR merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Aweijinqian0+is%3Amerged+) and 10+ PRs merged as co-author. - Code Refactor: As a co-author, he participated in the refactoring of the MOE module #2150 #2706 #2867 - Performance Enhancement for RL: Participated as a co-author in the design and development of the solution, contributing to the planning of core capabilities. #1547 #2120 and so on. So I think he's a great addition to the vLLM Ascend Maintainer team. ---- Chuanyu Qin [@nalinaly](https://github.com/nalinaly) --- The main reason I nominated Qinchuanyu is because he is the initial designer of aclgraph and torch-npu, two key components of vllm-ascend. Considering aclgraph will eventually become the main path for vllm-ascend's graph model, I propose to nominate him. 
**Sustained and Major Contributions:** In fact, chuanyu actively helped the users/developers of vllm-ascend since Mar 2025 ([vllm-discuss#162](https://discuss.vllm.ai/t/can-ascend-officially-draft-a-documentation-on-the-vllm-ascend-adaptation-for-graph-mode/162/5)), and also helped early users of vllm-ascend understand aclgraph. He provided lots of help in the process of integrating aclgraph with vllm-ascend. **Community Involvement:** As speaker, he also presents help users understand aclgraph and torch_npu [《The design philosophy of torch_npu and the high performance principle of aclGraph》](https://github.com/PyTorch-China/pytorch-meetup/blob/main/beijing-2025/%E3%80%905%E3%80%91torch_npu%20%E7%9A%84%E8%AE%BE%E8%AE%A1%E5%93%B2%E5%AD%A6%E4%B8%8E%20aclGraph%20%E9%AB%98%E6%80%A7%E8%83%BD%E5%8E%9F%E7%90%86-%E7%A7%A6%E4%BC%A0%E7%91%9C-0920.pdf) ---- They have activate contribution to vllm-ascend or have rich experience for ascend AI. Welcome! - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>