add gatherep select. #2740

momo609 · 2025-09-04T04:33:36Z

What this PR does / why we need it?

add gatherep select.

Does this PR introduce any user-facing change?

How was this patch tested?

vLLM version: v0.10.1.1
vLLM main: vllm-project/vllm@e599e2c

gemini-code-assist

Code Review

This pull request updates the token dispatcher selection logic for MoE models on Ascend hardware, making it dependent on the SoC version, token count, and expert parallelism size. My review has identified a critical issue where an unsupported SoC version could lead to a runtime error due to an unhandled case. Additionally, I've noted a high-severity maintainability concern regarding a magic number and code duplication for a key capacity parameter. Addressing these points will enhance the code's robustness and clarity.

gemini-code-assist · 2025-09-04T04:35:04Z

vllm_ascend/ascend_forward_context.py

+    if soc_version in {AscendSocVersion.A2}:
+        if num_tokens <= mc2_tokens_capacity and ep_size >=16:
+            return "TokenDispatcherWithMC2"
+        else:
+            return "TokenDispatcherWithAllGather"
+    elif soc_version in {AscendSocVersion.A3}:
+        if num_tokens <= mc2_tokens_capacity:
+            return "TokenDispatcherWithMC2"
+        else:
+            return "TokenDispatcherWithAll2AllV"


The function does not handle cases where soc_version is not AscendSocVersion.A2 or AscendSocVersion.A3. In such a scenario, the function will implicitly return None, which violates its str return type annotation and will likely cause a TypeError at runtime. To prevent this, you should add a case to handle unsupported SoC versions, for instance by raising a ValueError.

Suggested change

     if soc_version in {AscendSocVersion.A2}:
         if num_tokens <= mc2_tokens_capacity and ep_size >=16:
             return "TokenDispatcherWithMC2"
         else:
             return "TokenDispatcherWithAllGather"
     elif soc_version in {AscendSocVersion.A3}:
         if num_tokens <= mc2_tokens_capacity:
             return "TokenDispatcherWithMC2"
         else:
             return "TokenDispatcherWithAll2AllV"
+    else:
+        raise ValueError(f"Unsupported soc_version: {soc_version}")
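For readers who want to try the selection logic in isolation, here is a minimal, self-contained sketch of the branch structure discussed in this review comment. The `AscendSocVersion` enum below is a stand-in rather than the real class from vllm_ascend, its `UNDEFINED` member is hypothetical, and the function name `select_token_dispatcher` is illustrative only; the dispatcher names and conditions are taken from the diff.

```python
from enum import Enum


class AscendSocVersion(Enum):
    # Stand-in for the project's enum. UNDEFINED is a hypothetical member
    # used here only to exercise the fallback branch.
    A2 = 0
    A3 = 1
    UNDEFINED = 2


def select_token_dispatcher(soc_version: AscendSocVersion, num_tokens: int,
                            mc2_tokens_capacity: int, ep_size: int) -> str:
    """Return a dispatcher name, mirroring the conditions in the diff."""
    if soc_version is AscendSocVersion.A2:
        # On A2, MC2 is chosen only for small batches with ep_size >= 16.
        if num_tokens <= mc2_tokens_capacity and ep_size >= 16:
            return "TokenDispatcherWithMC2"
        return "TokenDispatcherWithAllGather"
    if soc_version is AscendSocVersion.A3:
        # On A3, the token count alone decides between MC2 and All2AllV.
        if num_tokens <= mc2_tokens_capacity:
            return "TokenDispatcherWithMC2"
        return "TokenDispatcherWithAll2AllV"
    # Fail loudly instead of implicitly returning None.
    raise ValueError(f"Unsupported soc_version: {soc_version}")
```

With the explicit `raise`, an unexpected SoC version fails at the selection site with a descriptive message, rather than surfacing later as an error about `None` where a dispatcher name is expected.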

gemini-code-assist · 2025-09-04T04:35:04Z

vllm_ascend/ascend_forward_context.py

        from vllm_ascend.ops.moe_dispatcher.token_dispatcher import \
            get_token_dispatcher
-        dispatcher_name = get_dispatcher_name(ep_size, with_prefill)
+        mc2_tokens_capacity = 512 * vllm_config.parallel_config.tensor_parallel_size


The value 512 used to calculate mc2_tokens_capacity is a magic number, which makes the code harder to understand and maintain. This same calculation is also present in vllm_ascend/worker/model_runner_v1.py at line 369. To improve clarity and avoid potential inconsistencies, this value should be extracted into a named constant and defined in a central location, such as vllm_ascend/ascend_config.py, so it can be reused across the codebase.
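One possible shape for that refactor is sketched below. The constant name `MC2_TOKENS_CAPACITY_FACTOR` and the helper `get_mc2_tokens_capacity` are hypothetical and do not exist in the codebase as far as this thread shows; only the value 512 and the multiplication by the tensor-parallel size come from the diff.

```python
# Hypothetical constant name; 512 is the literal quoted in the diff.
MC2_TOKENS_CAPACITY_FACTOR = 512


def get_mc2_tokens_capacity(tensor_parallel_size: int) -> int:
    """Compute the MC2 token capacity from a single shared constant.

    Hypothetical helper: both call sites mentioned in the review could call
    this instead of repeating the literal 512.
    """
    if tensor_parallel_size < 1:
        raise ValueError("tensor_parallel_size must be a positive integer")
    return MC2_TOKENS_CAPACITY_FACTOR * tensor_parallel_size
```

Both call sites would then share one definition, so a later change to the capacity only has to be made in one place.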

github-actions · 2025-09-04T04:54:25Z

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

A PR should do only one thing; smaller PRs enable faster reviews.
Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
Write the commit message by filling in the PR description to help reviewers and future developers understand the change.

If CI fails, you can run linting and testing checks locally according to Contributing and Testing.

codecov · 2025-09-04T12:12:15Z

Codecov Report

❌ Patch coverage is 95.83333% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 72.95%. Comparing base (4c90fa7) to head (195ca4d).
⚠️ Report is 11 commits behind head on main.

Files with missing lines	Patch %	Lines
vllm_ascend/ops/moe_dispatcher/token_dispatcher.py	66.66%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #2740      +/-   ##
==========================================
- Coverage   72.99%   72.95%   -0.04%     
==========================================
  Files         153      154       +1     
  Lines       21331    21418      +87     
==========================================
+ Hits        15571    15626      +55     
- Misses       5760     5792      +32

Flag	Coverage Δ
unittests	`72.95% <95.83%> (-0.04%)`	⬇️
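The percentages in the report can be recomputed from the hit and line counts in the coverage diff. The sketch below assumes percentages are truncated (not rounded) to two decimals, which is consistent with every figure shown here; the 2-of-3 and 23-of-24 line counts are inferred from "1 Missing" and the reported percentages, and are not stated in the report.

```python
def truncated_pct(hits: int, lines: int) -> float:
    """Coverage percentage truncated to two decimal places.

    Integer arithmetic avoids floating-point surprises before truncation.
    """
    return (hits * 10000 // lines) / 100


base = truncated_pct(15571, 21331)   # main: hits / lines from the diff
head = truncated_pct(15626, 21418)   # PR #2740: hits / lines from the diff
delta = round(head - base, 2)

# Misses are simply lines minus hits.
base_misses = 21331 - 15571
head_misses = 21418 - 15626

# Inferred counts: one missing line out of 3 in token_dispatcher.py,
# and one missing line out of 24 in the patch overall.
file_patch = truncated_pct(2, 3)
overall_patch = truncated_pct(23, 24)
```

Under that assumption the results match the report: 72.99 and 72.95 for base and head, a delta of -0.04, and 5760 and 5792 misses.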


Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

### What this PR does / why we need it?
add gatherep select.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@e599e2c

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

momo609 · 2025-09-14T10:47:59Z

vllm_ascend/ascend_forward_context.py

-        return "TokenDispatcherWithAll2AllV"
-
-    if with_prefill:
+    elif envs_ascend.VLLM_ENABLE_FUSED_EXPERTS_ALLGATHER_EP and ep_size > 1:


TODO: this logic will be consolidated with moe_common_method, without relying on environment variable checks.

### What this PR does / why we need it?
add gatherep select.

- vLLM version: v0.10.1.1
- vLLM main: vllm-project/vllm@e599e2c

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Co-authored-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>
Signed-off-by: offline0806 <z00858301@china.huawei.com>

gemini-code-assist bot reviewed Sep 4, 2025

github-actions bot added the module:core label Sep 4, 2025

momo609 force-pushed the gatherep3 branch 3 times, most recently from 466f84a to 3160093 Compare September 4, 2025 11:28

github-actions bot added the module:ops label Sep 4, 2025

momo609 force-pushed the gatherep3 branch 2 times, most recently from ee26a93 to dd66e51 Compare September 5, 2025 02:19

github-actions bot added the module:tests label Sep 5, 2025

add gatherep select.

195ca4d

Signed-off-by: wangxiaoxin-sherie <wangxiaoxin7@huawei.com>

momo609 force-pushed the gatherep3 branch from dd66e51 to 195ca4d Compare September 5, 2025 05:05

wangxiyuan approved these changes Sep 5, 2025

wangxiyuan merged commit 2693196 into vllm-project:main Sep 8, 2025
25 of 28 checks passed

momo609 commented Sep 14, 2025

Yikun mentioned this pull request Sep 20, 2025

[Bug]: Remove outofdate commits to improve perf test #3051

Open
