[Fix] Fix SharedFusedMoE #2817
Conversation
Introduces `AscendSharedFusedMoE` to handle the specific Mixture-of-Experts architecture of DeepSeek-V2, which includes a shared expert alongside the routed experts. This new class processes the shared expert and the regular MoE experts separately, ensuring correct tensor-parallel communication for the shared expert's output. The DeepSeek-V2 model is updated to use this new implementation. Additionally, a redundant communication-reduction method is removed from the base `AscendFusedMoE` class for a cleaner implementation.

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
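As a rough illustration of the shared-expert idea described above (a minimal, framework-free sketch, not the actual vllm-ascend code; all names here are hypothetical): the shared expert runs densely on every token, the routed experts run sparsely on their assigned tokens, and the two outputs are summed.

```python
# Hypothetical sketch of a fused MoE layer with a shared expert.
# Experts are plain functions mapping a token vector to a token vector.

def shared_fused_moe(hidden, shared_expert, routed_experts, router):
    """Apply the shared expert to every token, a routed expert to each
    token (chosen by the router), and sum the two results element-wise."""
    shared_out = [shared_expert(tok) for tok in hidden]                # dense path
    routed_out = [routed_experts[router(tok)](tok) for tok in hidden]  # sparse path
    # In a tensor-parallel deployment, shared_out would need its own
    # all-reduce; the routed path is reduced inside the MoE kernel.
    return [[s + r for s, r in zip(so, ro)]
            for so, ro in zip(shared_out, routed_out)]

# Toy usage: the shared expert doubles each value, routed expert 0 adds one.
double = lambda tok: [2.0 * x for x in tok]
add_one = lambda tok: [x + 1.0 for x in tok]
out = shared_fused_moe([[1.0, 2.0]], double, [add_one], router=lambda tok: 0)
print(out)  # [[4.0, 7.0]]
```

The key point the PR addresses is that the two paths have different communication requirements under tensor parallelism, so they must be handled separately rather than through one shared reduction.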
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge: if CI fails, you can run linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request introduces a fix for SharedFusedMoE by providing an Ascend-specific implementation, AscendSharedFusedMoE, and applying it via monkey-patching to support models like DeepSeek-V2 and Llama4. The implementation of AscendSharedFusedMoE correctly handles tensor parallel communication for shared experts. However, the monkey-patch is applied in two different places, leading to code redundancy. My review includes a suggestion to consolidate the patch in a single, more appropriate location to improve maintainability.
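The consolidation the review asks for could be as small as one loop over the modules that consume the class. A sketch with stand-in module objects (the real vLLM import paths aren't shown in this thread, so everything below is illustrative):

```python
import types

# Stand-ins for the two vLLM model modules that each do
# `from ... import SharedFusedMoE`; the real modules are analogous.
deepseek_v2 = types.ModuleType("deepseek_v2")
llama4 = types.ModuleType("llama4")

class AscendSharedFusedMoE:
    """Stand-in for vllm_ascend.ops.common_fused_moe.AscendSharedFusedMoE."""

# One patch site instead of two scattered assignments:
for mod in (deepseek_v2, llama4):
    mod.SharedFusedMoE = AscendSharedFusedMoE
```

Keeping the loop in a single patch module means adding support for another model is a one-line change to the tuple.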
…2 and Llama4

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>

Force-pushed from 1a23f9e to ad76e33
    from vllm_ascend.ops.common_fused_moe import AscendSharedFusedMoE

    deepseek_v2.SharedFusedMoE = AscendSharedFusedMoE
    llama4.SharedFusedMoE = AscendSharedFusedMoE
Looking at the latest code, a common class has been extracted, named SharedFusedMoE.
Should this be patched in the worker module, since it'll be used by the worker process?
> Looking at the latest code, a common class has been extracted, named SharedFusedMoE.

Yes, I'm well aware of that, except that patching that class directly (I assume you are suggesting `shared_fused_moe.SharedFusedMoE = AscendSharedFusedMoE`) would have no effect at all.
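The reason patching the defining module has no effect is Python's import binding: `from m import Name` copies the reference into the importer's namespace, so later rebinding `m.Name` doesn't change what the importer already holds. A self-contained demonstration with stand-in modules (all names hypothetical):

```python
import types

shared = types.ModuleType("shared_fused_moe")    # stand-in defining module

class Original:                                  # the upstream class
    pass

shared.SharedFusedMoE = Original

# Simulates `from shared_fused_moe import SharedFusedMoE` inside deepseek_v2:
deepseek_v2 = types.ModuleType("deepseek_v2")
deepseek_v2.SharedFusedMoE = shared.SharedFusedMoE

class Patched:                                   # the Ascend replacement
    pass

shared.SharedFusedMoE = Patched                  # patch the defining module...
print(deepseek_v2.SharedFusedMoE is Original)    # True: consumer is unaffected

deepseek_v2.SharedFusedMoE = Patched             # patching the *consumer* works
print(deepseek_v2.SharedFusedMoE is Patched)     # True
```

This is why the patch in this PR targets `deepseek_v2` and `llama4` (the consumers) rather than the module that defines the class.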
> Should this be patched in the worker module, since it'll be used by the worker process?

Tested in both locations; both work fine, so we can stick with this for now.
The UT is broken by an unrelated change.
### What this PR does / why we need it?

Really strange that `register_oot` doesn't work with `SharedFusedMoE`, so we have to add this patch, for now.

### Does this PR introduce _any_ user-facing change?

None.

### How was this patch tested?

This PR won't have any effect in DeepSeek since we currently still stick with the old `CustomDeepseekV2`.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@0cdd213

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>
…nalinaly (#3406)

I'd like to nominate 4 new maintainers for vllm-ascend:

---- Yizhou Liu [@yiz-liu](https://github.com/yiz-liu) ----

**Review Quality:** He has completed [40+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Ayiz-liu) and provided solutions or guides for [10+ issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20commenter%3Ayiz-liu), including many quality reviews such as [#issue-3428408401](#3002 (comment)), [#discussion_r2224572309](#1803 (comment)), [#issuecomment-2982470226](#1261 (comment)), [#issuecomment-2903621197](#836 (comment)), and [#issuecomment-2857678691](#778 (comment)).

**Sustained and High-Quality Contributions:** He has contributed [30+ commits](https://github.com/vllm-project/vllm-ascend/commits?author=yiz-liu) since Mar. 2025; his aclgraph-, DP-, and EP-related contributions are the main reason I nominated him. As the owner of aclgraph support, he continuously improves aclgraph stability and performance and fixes key bugs. He also laid the groundwork for EP-related functionality and delivered multiple foundational improvements.

**Community Involvement:** He has a very good habit of logging issues (#1649) and is also very active in [many issues](https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aopen%20commenter%3Ayiz-liu%20-author%3Ayiz-liu), helping users resolve their problems.

---- Peng Yu [@paulyu12](https://github.com/paulyu12) ----

The main reasons for his nomination are his expertise and key contributions to LoRA, with sustained and major contributions (initial support, docs, bugfixes) around LoRA.

**Sustained and Major Contributions:** @paulyu12 started contributing with [LoRA and Multi-LoRA support](697908f) in Apr. 2025 and has contributed about [10+ commits and bugfixes](697908f) to vllm-ascend.

**Review Quality and Community Involvement:** He has also helped more than 10 users address [LoRA-related issues](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Apaulyu12+-author%3Apaulyu12+is%3Aclosed). I believe his addition will further improve vLLM Ascend LoRA support.

---- Jinqian Wei [@weijinqian0](https://github.com/weijinqian0) ----

The main reasons for his nomination are his key contributions to the RL scene and the high quality of his code reviews.

**Review Quality:** He has completed [60+ reviews](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+commenter%3Aweijinqian0+is%3Aopen+-author%3Aweijinqian0) since June 2025, including high-quality reviews such as [#comment-3284055430](#2791 (comment)), [discussion_r2332166704](#2817 (comment)), and [discussion_r2343289692](#2846 (comment)).

**Sustained and Quality Contributions:** He has a deep understanding of the vLLM and vLLM Ascend codebases and solid contributions in the RL scene (about [10+ PRs merged](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3Aweijinqian0+is%3Amerged+) and 10+ PRs merged as co-author).
- Code Refactor: As a co-author, he participated in the refactoring of the MoE module: #2150 #2706 #2867
- Performance Enhancement for RL: He participated as a co-author in the design and development of the solution, contributing to the planning of core capabilities: #1547 #2120, and so on.

So I think he's a great addition to the vLLM Ascend maintainer team.

---- Chuanyu Qin [@nalinaly](https://github.com/nalinaly) ----

The main reason I nominated Chuanyu is that he is the initial designer of aclgraph and torch-npu, two key components of vllm-ascend. Considering aclgraph will eventually become the main path for vllm-ascend's graph mode, I propose to nominate him.

**Sustained and Major Contributions:** Chuanyu has actively helped users/developers of vllm-ascend since Mar. 2025 ([vllm-discuss#162](https://discuss.vllm.ai/t/can-ascend-officially-draft-a-documentation-on-the-vllm-ascend-adaptation-for-graph-mode/162/5)), and he also helped early users of vllm-ascend understand aclgraph. He provided lots of help in the process of integrating aclgraph with vllm-ascend.

**Community Involvement:** As a speaker, he also helps users understand aclgraph and torch_npu: [《The design philosophy of torch_npu and the high performance principle of aclGraph》](https://github.com/PyTorch-China/pytorch-meetup/blob/main/beijing-2025/%E3%80%905%E3%80%91torch_npu%20%E7%9A%84%E8%AE%BE%E8%AE%A1%E5%93%B2%E5%AD%A6%E4%B8%8E%20aclGraph%20%E9%AB%98%E6%80%A7%E8%83%BD%E5%8E%9F%E7%90%86-%E7%A7%A6%E4%BC%A0%E7%91%9C-0920.pdf)

----

They all have active contributions to vllm-ascend or rich experience with Ascend AI. Welcome!

- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>