
Conversation


@Angazenn Angazenn commented Jun 25, 2025

What this PR does / why we need it?

This PR introduces an expert rearrange algorithm for the PanguProMoE model. Unlike the original grouped top-k, it keeps only the top experts that are allocated the most tokens, so fewer expert weights need to be loaded when computing the grouped matrix multiplication (gmm).
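
A minimal sketch of the voting idea, assuming a PyTorch-style router (the function name `voted_grouped_topk`, its signature, and the use of a plain rather than grouped top-k are illustrative assumptions, not the PR's actual implementation):

```python
import torch


def voted_grouped_topk(router_logits: torch.Tensor,
                       top_k: int,
                       num_voted_experts: int):
    """Illustrative only: keep the most-voted experts after routing."""
    # Per-token routing scores and the usual top-k expert selection.
    scores = torch.softmax(router_logits, dim=-1)   # [num_tokens, num_experts]
    topk_weights, topk_ids = torch.topk(scores, top_k, dim=-1)

    # "Vote": count how many tokens each expert was assigned.
    num_experts = router_logits.shape[-1]
    votes = torch.bincount(topk_ids.flatten(), minlength=num_experts)

    # Keep only the num_voted_experts experts that received the most tokens.
    kept = torch.topk(votes, num_voted_experts).indices
    keep_mask = torch.zeros(num_experts, dtype=torch.bool,
                            device=router_logits.device)
    keep_mask[kept] = True

    # Zero out routing weights that point at filtered-out experts and
    # renormalize, so gmm only needs the surviving experts' weights.
    selected = keep_mask[topk_ids]                  # [num_tokens, top_k]
    topk_weights = topk_weights * selected
    topk_weights = topk_weights / topk_weights.sum(-1, keepdim=True).clamp(min=1e-9)
    return topk_weights, topk_ids
```

Setting `num_voted_experts` large enough that no expert is filtered out reduces this sketch to plain top-k routing, mirroring how a value of 8 falls back to the original Pangu grouped top-k as described below.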

We have tested this algorithm with PanguProMoE-72B on the 300I Duo and 800I A2 platforms. On 300I Duo, setting `num_voted_experts` to 5 achieves both good performance and accuracy, while on 800I A2 we keep it at 8, which falls back to the original Pangu grouped top-k.

Does this PR introduce any user-facing change?

No.

How was this patch tested?


codecov bot commented Jun 26, 2025

Codecov Report

❌ Patch coverage is 25.00000% with 6 lines in your changes missing coverage. Please review.
✅ Project coverage is 31.64%. Comparing base (c30ddb8) to head (9321e0c).
⚠️ Report is 581 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| vllm_ascend/ops/fused_moe.py | 20.00% | 4 Missing ⚠️ |
| vllm_ascend/ops/common_fused_moe.py | 33.33% | 2 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1421      +/-   ##
==========================================
+ Coverage   27.39%   31.64%   +4.24%     
==========================================
  Files          56       60       +4     
  Lines        6191     6640     +449     
==========================================
+ Hits         1696     2101     +405     
- Misses       4495     4539      +44     
| Flag | Coverage Δ |
|---|---|
| unittests | 31.64% <25.00%> (+4.24%) ⬆️ |

Flags with carried forward coverage won't be shown.



@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Signed-off-by: angazenn <zengyanjia@huawei.com>
@Angazenn Angazenn changed the title from [WIP]support MERRouter to [PERF]support MERRouter on Jun 28, 2025
@ganyi1996ppo ganyi1996ppo merged commit c59d69d into vllm-project:main Jun 28, 2025
24 checks passed
zhanghw0354 pushed a commit to zhanghw0354/vllm-ascend that referenced this pull request Jun 30, 2025
weijinqian0 pushed a commit to weijinqian0/vllm-ascend that referenced this pull request Jun 30, 2025
@Angazenn Angazenn deleted the me branch September 8, 2025 03:16
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Angazenn added a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025