[refactor] Refactoring AscendFusedMoE #1169
Conversation
a79efd7 to 20e3bb6
I really like this change. Let's merge this first. @Yikun @ganyi1996ppo @jianzs please take this as high priority. Thanks.
vllm_ascend/utils.py
Outdated
We can add a submodule named fused_moe in ops, then move FusedMoEState and get_fused_moe_state to that module's utils file.
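A minimal sketch of what the suggested module could contain (the module path vllm_ascend/ops/fused_moe/utils.py and the enum members are assumptions based on this comment, not the PR's actual code):

```python
# vllm_ascend/ops/fused_moe/utils.py (proposed location)
from enum import Enum


class FusedMoEState(Enum):
    """Communication strategy selected for the fused MoE layer."""
    AllGather = 0
    All2All = 1
    MC2 = 2
```

Callers would then import both helpers from the new location, e.g. `from vllm_ascend.ops.fused_moe.utils import FusedMoEState, get_fused_moe_state`.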
Make 8 a named constant.
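If the flagged literal is the MC2 token-count alignment, a named constant could look like the sketch below (the constant's name and its use in a round-up helper are hypothetical):

```python
# Hypothetical name for the bare literal 8 flagged above; the real meaning
# of "8" in the PR may differ.
MC2_TOKEN_ALIGNMENT = 8


def round_up_to_alignment(num_tokens: int) -> int:
    # Round the token count up to the next multiple of the alignment.
    return ((num_tokens + MC2_TOKEN_ALIGNMENT - 1)
            // MC2_TOKEN_ALIGNMENT) * MC2_TOKEN_ALIGNMENT
```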
Is any change in this PR related to this commit message?
It's better to describe which communication kernel is chosen for different configurations.
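One way to make that explicit is to encode the mapping directly in get_fused_moe_state. The sketch below is a guess at the shape of that logic (the ep_size threshold and the with_prefill condition are assumptions, not necessarily what this PR implements):

```python
from enum import Enum


class FusedMoEState(Enum):  # same enum as in the sketch above
    AllGather = 0
    All2All = 1
    MC2 = 2


def get_fused_moe_state(ep_size: int, with_prefill: bool) -> FusedMoEState:
    # No expert parallelism: the plain all-gather path is enough.
    if ep_size == 1:
        return FusedMoEState.AllGather
    # Assumed constraints: MC2 wants a reasonably large EP group and a
    # decode-only batch; otherwise fall back to all-to-all.
    if ep_size < 16 or with_prefill:
        return FusedMoEState.All2All
    return FusedMoEState.MC2
```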
The current usage of MC2 kernel does not support non-uniform inputs, so padding is still required.
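A hedged sketch of what that padding can look like before calling the MC2 kernel (the helper name and shapes are illustrative; max_num_tokens would be the uniform per-rank token count agreed across the EP group):

```python
import torch
import torch.nn.functional as F


def pad_for_mc2(hidden_states: torch.Tensor, max_num_tokens: int) -> torch.Tensor:
    """Pad [num_tokens, hidden_dim] inputs up to a uniform token count."""
    pad_len = max_num_tokens - hidden_states.shape[0]
    if pad_len <= 0:
        return hidden_states
    # Append zero rows so every rank hands MC2 the same number of tokens.
    return F.pad(hidden_states, (0, 0, 0, pad_len))
```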
Please rebase to main to make sure the torchair CI passes.
What this PR does / why we need it?
This PR resolves issue 1147. The fused-MoE state selection is now handled in fused_moe.py's get_fused_moe_state.

Does this PR introduce any user-facing change?
- Removed VLLM_ENABLE_MC2: this env is useless; we can make the judgment based on the current scenario without it, and it only increases complexity.
- Removed USING_LCCL_COM: this env has already expired.
- additional_config.expert_tensor_parallel_size has already expired; we now use the enable_expert_parallel parameter, consistent with vLLM (a usage sketch follows at the end of this description).

How was this patch tested?
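A minimal usage sketch of the enable_expert_parallel parameter mentioned above (the model name and parallel sizes are placeholders, not this PR's test setup):

```python
from vllm import LLM

# With additional_config.expert_tensor_parallel_size gone, expert parallelism
# follows vLLM's standard engine argument.
llm = LLM(
    model="Qwen/Qwen1.5-MoE-A2.7B",  # placeholder MoE model
    tensor_parallel_size=4,
    enable_expert_parallel=True,
)
outputs = llm.generate("Hello, my name is")
print(outputs[0].outputs[0].text)
```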