[CI][Quickfix] Fix AscendFusedMoE init error #2268
Conversation
Signed-off-by: MengqingCao <cmq0113@163.com>
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run the linting and testing checks locally according to the Contributing and Testing guides.
Codecov Report: ❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##             main    #2268      +/-   ##
==========================================
- Coverage   76.10%   76.09%   -0.01%
==========================================
  Files         114      114
  Lines       13097    13103       +6
==========================================
+ Hits         9967     9971       +4
- Misses       3130     3132       +2

Flags with carried forward coverage won't be shown. ☔ View full report in Codecov by Sentry.
custom_routing_function=custom_routing_function,
scoring_func=scoring_func,
e_score_correction_bias=e_score_correction_bias,
activation=activation,
missing apply_router_weight_on_input?
actually there is no apply_router_weight_on_input in FusedMoE:
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/fused_moe/layer.py#L681-L705
> actually there is no `apply_router_weight_on_input` in `FusedMoE`: https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/fused_moe/layer.py#L681-L705
Thanks for this reminder, I'll fix it soon
We need this PR merged ASAP to fix the CI issue.
### What this PR does / why we need it?
Fix AscendFusedMoE init error. Use `super().__init__()` instead of `super(FusedMoE, self).__init__()` to ensure the member variables defined in the base class can be accessed by the child class.

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing tests.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@766bc81

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
This fix was also carried in a batch merge commit that referenced this PR, together with the following changes (titles only):

* enable mm allreduce test (vllm-project#2192)
* [main] remove torch.cat and replace it by List[0] (vllm-project#2153)
* [CI][Quickfix] Fix AscendFusedMoE init error (vllm-project#2268)
* Fix accuracy test config and add DeepSeek-V2-Lite test (vllm-project#2261)
* Fix accuracy test create PR (vllm-project#2274)
* Add ut for test_communicator.py (vllm-project#2293)
* [CI] Fix broken CI (vllm-project#2302)
* [2/N][Refactor] torchair model runner refactor (vllm-project#2204)
* [core] Support capture custom ops into aclgraph (vllm-project#2113)
* Bump actions/download-artifact from 4 to 5 (vllm-project#2311)
* [Perf][MTP] Optimize reject sampler in greedy situation (vllm-project#2137)
* [3/N][Refactor] torchair model runner refactor (vllm-project#2207)
* [Feat] chunkprefill mla support torchair graph (vllm-project#1772)
* [4/N][Refactor] torchair model runner refactor (vllm-project#2208)
* Configure Gemini (vllm-project#2298)
* ut: add ci guard for ut coverage (vllm-project#2317)
* [main][prefill optimization] Optimize parallel strategies to reduce communication overhead (vllm-project#2198)
* [Doc] Update faq (vllm-project#2334)
* [5/N][Refactor] torchair model runner refactor (vllm-project#2216)
* [1/N][Feat] Support MoE models with ACL Graph and refactor MoE communication logic (vllm-project#2125)
* [Doc] Add container image save/load FAQ for offline environments (vllm-project#2347)
* [Bugfix] fix the oom when chunkprefill with long context like 64k (vllm-project#2319)
Follow-up: [Quickfix] Add the missing `apply_router_weight_on_input` in FusedMoE init (#2348)

### What this PR does / why we need it?
Add the missing `apply_router_weight_on_input` in FusedMoE init. Quick fix on #2268 (comment).

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with existing tests.

- vLLM version: v0.10.0
- vLLM main: vllm-project/vllm@6807af8

Signed-off-by: MengqingCao <cmq0113@163.com>
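To make the follow-up concrete, here is a minimal, hypothetical Python sketch of what "adding the missing flag" can look like when, as the reviewer noted, the parent `__init__` does not accept the argument. The class names, signatures, and members below are illustrative stand-ins, not the actual vllm-project#2348 diff.

```python
# Hypothetical sketch only: stand-in classes, not the real vLLM / vllm-ascend code.
class FusedMoE:
    """Stand-in parent whose __init__ (per the review thread) does not accept
    apply_router_weight_on_input."""

    def __init__(self, custom_routing_function=None, activation: str = "silu"):
        self.custom_routing_function = custom_routing_function
        self.activation = activation


class AscendFusedMoE(FusedMoE):
    """Stand-in child that needs the flag available as an attribute."""

    def __init__(self,
                 custom_routing_function=None,
                 activation: str = "silu",
                 apply_router_weight_on_input: bool = False):
        # super().__init__() resolves to FusedMoE.__init__ here, so the
        # parent-defined members are created as expected (the point of #2268).
        super().__init__(custom_routing_function=custom_routing_function,
                         activation=activation)
        # The parent does not set this member, so the child records it itself.
        self.apply_router_weight_on_input = apply_router_weight_on_input


moe = AscendFusedMoE(apply_router_weight_on_input=True)
print(moe.apply_router_weight_on_input)  # True
```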
A downstream merge, "feat(performance): support `GroupedMatmulSwigluQuant` in `W8A8_DYNAMIC` quantized MoE layers" (#3), also picked up this fix chain; besides its own quantization changes it carried vllm-project#2347 ([Doc] Add container image save/load FAQ for offline environments), vllm-project#2319 ([Bugfix] fix the oom when chunkprefill with long context like 64k), the follow-up vllm-project#2348, vllm-project#1995 ([2/N][Refactor] Refactor V1 attention for better extensibility), and vllm-project#2193 ([Misc] Remove redundant imported `envs`, using `envs_ascend` instead).
This PR's review thread was also cited in a maintainer nomination:

I would like to nominate Wengang Chen (@ApsarasX, https://github.com/ApsarasX) as a maintainer, starting with my +1.

## Reason

Review Quality: He focuses on reviews of the vLLM Ascend core module, with 100+ high-quality reviews such as #2326 (comment), #768 (comment), #2312 (comment), #2268 (comment), #2192 (comment), and #2156 (comment). This helped vLLM Ascend v0.9.x and v0.10.x be released with high quality.

Sustained and Quality Contributions: He has a very good habit of sharing his design ideas, development process, and performance test results, for example in #966. He has contributed [many PRs](https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3AApsarasX+is%3Amerged+), including valuable bugfixes and performance improvements.

Community Involvement: He is actively involved in community discussion, is collaborative, and helps users solve problems, participating in [120+ PRs and issues](https://github.com/vllm-project/vllm-ascend/issues?q=commenter%3AApsarasX). He was also a speaker at the [vLLM Beijing Meetup](https://mp.weixin.qq.com/s/7n8OYNrCC_I9SJaybHA_-Q).

So I think he would be a great addition to the vLLM Ascend maintainer team.

- ✅ Review Quality: 108+ PRs with valuable reviews (https://github.com/vllm-project/vllm-ascend/pulls?q=commenter%3AApsarasX), e.g. #2326 (comment), #768 (comment), #2312 (comment), #2268 (comment), #2192 (comment), #2156 (comment)
- ✅ Sustained and Major Contributions: https://github.com/vllm-project/vllm-ascend/pulls/ApsarasX
- ✅ Quality Contribution: https://github.com/vllm-project/vllm-ascend/pulls?q=is%3Apr+author%3AApsarasX+is%3Aclosed, with well-documented work such as "[Perf] Refactor tensor disposal logic to reduce memory usage" (#966)
- ✅ Community Involvement: 7 issues (https://github.com/vllm-project/vllm-ascend/issues?q=is%3Aissue%20state%3Aclosed%20author%3AApsarasX) and 120+ PRs and issues (https://github.com/vllm-project/vllm-ascend/issues?q=commenter%3AApsarasX)

Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
What this PR does / why we need it?
Fix AscendFusedMoE init error. Use `super().__init__()` instead of `super(FusedMoE, self).__init__()` to ensure the member variables defined in the base class can be accessed by the child class.

Does this PR introduce any user-facing change?
N/A

How was this patch tested?
CI passed with existing tests.
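For readers less familiar with Python's method resolution order, the sketch below illustrates why `super(FusedMoE, self).__init__()` broke the init: called from a subclass, it starts the lookup *after* `FusedMoE`, so `FusedMoE.__init__` and the members it defines are skipped. The classes and the `custom_routing_function` member are simplified stand-ins for illustration, not the real vLLM signatures.

```python
# Minimal sketch of the MRO pitfall; Base/FusedMoE/AscendFusedMoE are
# simplified stand-ins, not the real vLLM classes.
class Base:
    def __init__(self):
        pass


class FusedMoE(Base):
    def __init__(self, custom_routing_function=None):
        super().__init__()
        # Member that subclasses rely on later.
        self.custom_routing_function = custom_routing_function


class BrokenAscendFusedMoE(FusedMoE):
    def __init__(self):
        # super(FusedMoE, self) starts the MRO lookup *after* FusedMoE,
        # so FusedMoE.__init__ never runs and its members are never set.
        super(FusedMoE, self).__init__()


class FixedAscendFusedMoE(FusedMoE):
    def __init__(self):
        # super() == super(FixedAscendFusedMoE, self): FusedMoE.__init__ runs.
        super().__init__(custom_routing_function=None)


print(hasattr(FixedAscendFusedMoE(), "custom_routing_function"))   # True
print(hasattr(BrokenAscendFusedMoE(), "custom_routing_function"))  # False
```

Avoiding the two-argument `super(...)` form keeps the immediate parent's `__init__` in the chain, which matches the intent stated in the PR description.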