Skip to content

Conversation

@sdmyzlp
Copy link
Contributor

@sdmyzlp sdmyzlp commented Jun 25, 2025

What this PR does / why we need it?

After #1229 , multistream moe with both mc2 and TP is broken, due to all_gather operation in moe layer used to communicate the summation of router_hidden_states and shared_hidden_states, but only router_hidden_states now, causing a shape mismatch.
Fix this by adding yet another all_gather operation for shared_hidden_states.

By the way, all these all_gather operation exists due to a joint effect of:

  1. attn use all_reduce instead of all_gather + reduce_scatter for TP communication and,
  2. moe transfer TP to EP, necessitating split operations at the entry of moe layer.

This problem will be addressed by #1194 -like optimizations, after that is merged, both split and all_gather here can be removed.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Tested both offline, and by graph mode mla e2e testcase.

@sdmyzlp sdmyzlp force-pushed the br_fix_multi_stream_moe branch 5 times, most recently from 2743781 to e322c53 Compare June 25, 2025 07:02
@codecov
Copy link

codecov bot commented Jun 25, 2025

Codecov Report

Attention: Patch coverage is 0% with 18 lines in your changes missing coverage. Please review.

Project coverage is 27.21%. Comparing base (c30ddb8) to head (e289203).
Report is 24 commits behind head on main.

Files with missing lines Patch % Lines
vllm_ascend/ops/fused_moe.py 0.00% 15 Missing ⚠️
vllm_ascend/models/deepseek_v2.py 0.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1417      +/-   ##
==========================================
- Coverage   27.39%   27.21%   -0.18%     
==========================================
  Files          56       56              
  Lines        6191     6217      +26     
==========================================
- Hits         1696     1692       -4     
- Misses       4495     4525      +30     
Flag Coverage Δ
unittests 27.21% <0.00%> (-0.18%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@sdmyzlp sdmyzlp force-pushed the br_fix_multi_stream_moe branch 3 times, most recently from 67c5113 to e289203 Compare June 25, 2025 11:46
sdmyzlp added 2 commits June 25, 2025 20:10
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
@github-actions
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

sdmyzlp added 2 commits June 25, 2025 21:28
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
Signed-off-by: sdmyzlp <lrwei2@petalmail.com>
@sdmyzlp sdmyzlp force-pushed the br_fix_multi_stream_moe branch from e289203 to c7f4b60 Compare June 25, 2025 13:28
@github-actions
Copy link

This pull request has conflicts, please resolve those before we can evaluate the pull request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant