2 changes: 0 additions & 2 deletions vllm_ascend/ops/fused_moe.py
@@ -413,8 +413,6 @@ def forward(self,
# When all_reduce_merge is in progress, shared_experts does not do all_reduce in mlp, but waits until shared_experts+router_experts are completed before doing all_reduce
shared_hidden_states = shared_experts(hidden_states)

- mc2_mask = forward_context.mc2_mask
Contributor review comment (severity: high):

This removal is correct, since mc2_mask was already assigned at line 408. In the same spirit of removing unused code, the variables quantized_x_for_share and dynamic_scale_for_share initialized at line 410 also appear to be dead: they are always None and are only passed through to self.quant_method.apply. Removing them and their usage in the apply call would further improve code clarity and maintainability.
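As a hedged illustration of the follow-up cleanup suggested above: the snippet below is not the vllm-ascend implementation, and the real self.quant_method.apply signature is assumed rather than verified. It only sketches the pattern of dropping always-None passthrough variables such as quantized_x_for_share and dynamic_scale_for_share when the callee already defaults them to None.

```python
# Illustrative sketch only -- not the vllm-ascend code. The apply()
# signature here is an assumption for demonstration purposes.


class ToyQuantMethod:
    """Stand-in for self.quant_method with an assumed apply() signature."""

    def apply(self, hidden_states, quantized_x_for_share=None,
              dynamic_scale_for_share=None):
        # With None defaults, forwarding always-None arguments is redundant.
        assert quantized_x_for_share is None
        assert dynamic_scale_for_share is None
        return hidden_states


def forward_before(quant_method, hidden_states):
    # Before: dead variables are initialized and only forwarded unchanged.
    quantized_x_for_share, dynamic_scale_for_share = None, None
    return quant_method.apply(
        hidden_states,
        quantized_x_for_share=quantized_x_for_share,
        dynamic_scale_for_share=dynamic_scale_for_share,
    )


def forward_after(quant_method, hidden_states):
    # After: the dead variables and their forwarding are removed; behaviour
    # is unchanged because apply() already defaults both arguments to None.
    return quant_method.apply(hidden_states)


if __name__ == "__main__":
    qm = ToyQuantMethod()
    assert forward_before(qm, [1.0, 2.0]) == forward_after(qm, [1.0, 2.0])
```

The cleanup is only behaviour-preserving if apply() never branches on these arguments being non-None, which is what the comment above asserts for the real code.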


enable_sp = _metadata_for_padding is not None and _metadata_for_padding.not_dummy_and_is_prefill
tp_size = get_tensor_model_parallel_world_size()
if enable_sp: