Conversation

Contributor

@Pr0Wh1teGivee Pr0Wh1teGivee commented Sep 15, 2025

What this PR does / why we need it?

Does this PR introduce any user-facing change?

How was this patch tested?

@github-actions

👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

  • A PR should do only one thing; smaller PRs enable faster reviews.
  • Every PR should include unit tests and end-to-end tests to ensure it works and is not broken by future PRs.
  • Write the commit message by filling out the PR description to help reviewers and future developers understand the change.

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a significant and valuable refactoring of the Mixture of Experts (MoE) implementation, unifying various approaches under a common interface. This improves modularity and maintainability by encapsulating communication logic within dedicated MoECommMethod classes. However, this refactoring appears to have left behind a duplicated file (vllm_ascend/ops/fused_moe.py), which is very similar to vllm_ascend/ops/common_fused_moe.py and could cause confusion. Additionally, I've identified a critical bug in the MoE communication method selection logic and the removal of some important safety assertions. Please address these points to ensure the stability and correctness of the new implementation.
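
For reviewers who have not opened the diff yet, here is a minimal sketch of the kind of interface described above. The class and method names below are illustrative assumptions for this review thread, not the PR's actual API:

from abc import ABC, abstractmethod

import torch


class MoECommMethod(ABC):
    """Encapsulates one token dispatch/combine strategy (e.g. all-gather,
    all-to-all, naive multicast). Names here are hypothetical, for illustration."""

    @abstractmethod
    def prepare(self, hidden_states: torch.Tensor,
                topk_ids: torch.Tensor) -> torch.Tensor:
        """Send tokens to the ranks that own the selected experts."""

    @abstractmethod
    def finalize(self, expert_output: torch.Tensor) -> torch.Tensor:
        """Combine per-expert outputs back into the original token order."""


class AllGatherCommSketch(MoECommMethod):
    """Placeholder concrete strategy; the real communication logic lives in the PR."""

    def prepare(self, hidden_states, topk_ids):
        return hidden_states  # placeholder: all-gather across ranks

    def finalize(self, expert_output):
        return expert_output  # placeholder: slice/reduce back to local tokens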

moe_comm_method = "naivemulticast"

if model_type == "PanguProMoE":
    moe_comm_method == "allgather"
Contributor


critical

There is a typo here. moe_comm_method == "allgather" is a comparison and has no effect. It should be an assignment: moe_comm_method = "allgather". This bug will cause PanguProMoE models to use an incorrect MoE communication method, potentially leading to runtime errors or incorrect behavior.

Suggested change
moe_comm_method == "allgather"
moe_comm_method = "allgather"
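
For context, the fixed block would read as follows (names and indentation taken from the excerpt above; a sketch of the fix, not the full selection logic):

moe_comm_method = "naivemulticast"

if model_type == "PanguProMoE":
    # Assignment, not comparison: a bare `==` expression is a no-op statement.
    moe_comm_method = "allgather"

A comparison used as a statement is also the kind of pattern a linter rule such as flake8-bugbear's B015 ("pointless comparison") can flag automatically.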

Collaborator


A bug here.

Comment on lines 94 to 93
        global_redundant_expert_num: int = 0,
        need_trans: bool = False) -> torch.Tensor:
    # Check constraints
    assert hidden_states.shape[1] == w1.shape[1], (
        f"Hidden size mismatch {hidden_states.shape[1]} != {w1.shape[1]}")
    assert topk_weights.shape == topk_ids.shape, "topk shape mismatch"
    assert hidden_states.is_contiguous(), "Hidden_states must be contiguous"
    assert w1.stride(-1) == 1, "Stride of last dimension must be 1"
    assert w2.stride(-1) == 1, "Stride of last dimension must be 1"
    assert hidden_states.dtype in [
Contributor


high

Several important assertions checking for tensor shape matching and contiguity have been removed. While the first assertion for hidden size mismatch was likely incorrect, the others (e.g., topk_weights.shape == topk_ids.shape, hidden_states.is_contiguous()) are valuable for ensuring correctness and preventing hard-to-debug errors from underlying NPU operations, which may have strict input requirements. Please consider restoring the valid assertions to maintain input validation and prevent potential silent failures.
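
If the valid checks are restored, a minimal version based on the removed lines shown above (omitting the hidden-size assertion that the comment flags as likely incorrect) could look like:

    # Restored input-validation checks, taken from the diff excerpt above.
    assert topk_weights.shape == topk_ids.shape, "topk shape mismatch"
    assert hidden_states.is_contiguous(), "Hidden_states must be contiguous"
    assert w1.stride(-1) == 1, "Stride of last dimension must be 1"
    assert w2.stride(-1) == 1, "Stride of last dimension must be 1"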

@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

weijinqian_v1 and others added 2 commits September 16, 2025 11:20
Signed-off-by: weijinqian_v1 <weijinqian@huawei.com>
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.
