[Bugfix][Model] Fix fusedmoe and make modelrunner_v1 compatible with latest vllm #867
Conversation
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
```python
    max_model_len=self.max_model_len,
    max_num_batched_tokens=self.max_num_tokens,
    device=self.device,
    pin_memory=self.pin_memory,
```
Just pass `pin_memory=True` here directly.
```python
    cache_config.cache_dtype]

self.attn_metadata_builders: list[AscendAttentionMetadataBuilder] = []
self.attn_backends: list[type[AscendAttentionBackend]] = []
```
These two lines are unused.
```python
self.scheduler_config = vllm_config.scheduler_config
self.chunked_prefill_enabled = vllm_config.scheduler_config.chunked_prefill_enabled
self.device = device
self.pin_memory = True
```
This attribute is unused.
```python

self.is_multimodal_model = self.model_config.is_multimodal_model
self.block_size = vllm_config.cache_config.block_size
self.max_model_len = self.model_config.max_model_len
```
Unused; use `self.model_config.max_model_len` directly when building the `InputBatch`.
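A minimal sketch of how the two suggestions above (hard-coding `pin_memory=True` and reading `max_model_len` from `model_config`) would look at the `InputBatch` call site; only the arguments visible in the diff are shown, and the rest of the constructor is assumed unchanged:

```python
# Sketch only: apply both review suggestions where the InputBatch is built.
self.input_batch = InputBatch(
    max_model_len=self.model_config.max_model_len,  # no self.max_model_len alias
    max_num_batched_tokens=self.max_num_tokens,
    device=self.device,
    pin_memory=True,  # hard-coded instead of self.pin_memory
)
```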
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: MengqingCao <cmq0113@163.com>
LGTM. Let's merge this to unblock CI once the CI passes. Thanks for the fix.
Signed-off-by: MengqingCao <cmq0113@163.com>
Thanks, I made a small change in the latest commit; please help review it.
```python
self.local_num_experts = self.global_num_experts
self.expert_map = None

if vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"):
```
I think this part of the code may not be needed; refer to how this part was modified in PR 863. However, the most urgent thing right now is to fix CI, so this can be revisited later.
Yes, let's make CI happy first and solve the bug later.
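The version check in the diff above is the pattern this quick fix leans on; a minimal sketch of that gating, assuming `vllm_version_is` is importable from `vllm_ascend.utils` (the helper name below is illustrative, not the PR's actual code):

```python
from vllm_ascend.utils import vllm_version_is  # assumed import path

def moe_config_required() -> bool:
    """True when the installed vLLM expects fused_moe to receive a moe_config."""
    # vLLM 0.8.5 / 0.8.5.post1 predate the moe_config argument, so the old code
    # path is kept there; newer versions take the new path (see PR 863 for the
    # longer-term cleanup).
    return not (vllm_version_is("0.8.5") or vllm_version_is("0.8.5.post1"))
```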
…latest vllm (vllm-project#867)

### What this PR does / why we need it?
This PR fixes the CI failure broken by vLLM:
1. add `moe_config` for fused_moe
2. adjust the change for KV cache groups from vLLM; vllm-ascend doesn't support this feature yet, so this is just a quick fix for backward compatibility

fix: vllm-project#872

Signed-off-by: MengqingCao <cmq0113@163.com>
What this PR does / why we need it?
This PR fixes the CI failure broken by the latest vLLM:
1. add `moe_config` for fused_moe
2. adjust the change for KV cache groups from vLLM; vllm-ascend doesn't support this feature yet, so this is just a quick fix for backward compatibility

fix: #872
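For item 2, a rough sketch of the backward-compatibility idea, assuming the newer vLLM splits the KV cache into per-group specs; `KVCacheGroup` and the helper below are hypothetical stand-ins, not the actual vllm-ascend code:

```python
from dataclasses import dataclass


@dataclass
class KVCacheGroup:  # hypothetical stand-in for vLLM's per-group KV cache spec
    layer_names: list[str]


def single_kv_cache_group(groups: list[KVCacheGroup]) -> KVCacheGroup:
    """Quick-fix behaviour: vllm-ascend only handles one KV cache group,
    so anything else is rejected rather than silently mishandled."""
    if len(groups) != 1:
        raise NotImplementedError(
            "vllm-ascend does not support multiple KV cache groups yet")
    return groups[0]
```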