[BugFix] Fix chunked prefill bugs in engine v1 #844
Merged
+14 −3
Conversation
Force-pushed from 1e2aeea to 6725d90
Force-pushed from cb1df18 to 70dc428
Collaborator
diff --git a/vllm_ascend/patch/platform/patch_common/patch_vllm_config.py b/vllm_ascend/patch/platform/patch_common/patch_vllm_config.py
index 6d606d0..947ec7d 100644
--- a/vllm_ascend/patch/platform/patch_common/patch_vllm_config.py
+++ b/vllm_ascend/patch/platform/patch_common/patch_vllm_config.py
@@ -18,11 +18,10 @@
# This file is a part of the vllm-ascend project.
import torch
-
+import vllm.envs as envs
+from vllm.config import CompilationConfig, CompilationLevel, VllmConfig
from vllm.logger import init_logger
-from vllm.config import (VllmConfig, CompilationConfig, CompilationLevel)
from vllm.utils import random_uuid
-import vllm.envs as envs
logger = init_logger(__name__)
Force-pushed from dfef60e to e5e9548
Signed-off-by: rjg-lyh <1318825571@qq.com>
wangxiyuan reviewed May 21, 2025
self.instance_id = random_uuid()[:5]
...
VllmConfig.__post_init__ = __post_init__
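For context, the patched module works by reassigning `VllmConfig.__post_init__` at import time. Below is a minimal sketch of that monkey-patch pattern, assuming a deliberately simplified `__post_init__` body; the real patch in vllm-ascend re-implements much more of vLLM's original logic.

```python
# Minimal sketch of the monkey-patch pattern shown in the excerpt above.
# The body of __post_init__ here is a simplified stand-in, not the actual
# patched implementation shipped in vllm-ascend.
from vllm.config import VllmConfig
from vllm.utils import random_uuid


def __post_init__(self) -> None:
    # As in the excerpt: give every config instance a short random id.
    self.instance_id = random_uuid()[:5]


# Reassign the method so every VllmConfig created afterwards uses the patch.
VllmConfig.__post_init__ = __post_init__
```

This is exactly the fragility the review comment below calls out: the patch silently diverges whenever upstream vLLM changes `__post_init__`.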
Collaborator
Patching the vllm config is dangerous; it is always being changed by vllm. How do we make sure it stays compatible?
Collaborator
related PR to fix the issue: vllm-project/vllm#18470
Signed-off-by: rjg-lyh <1318825571@qq.com>
wangxiyuan approved these changes May 22, 2025
wangxiyuan pushed a commit that referenced this pull request May 30, 2025
### What this PR does / why we need it?
Add basic v1 MTP features. Please merge it after #874 and #844.
### Does this PR introduce _any_ user-facing change?
We now support basic v1 MTP; for now only TP, eager mode, and k=1 are supported. We will continue to expand to more scenarios.
### How was this patch tested?
Tested locally.
Signed-off-by: XWFAlone <xuewenfei2@huawei.com>
Co-authored-by: mengwei805 <mengwei25@huawei.com>
Co-authored-by: JC-ut0 <xuyexiong@huawei.com>
momo609 pushed a commit to momo609/vllm-ascend that referenced this pull request Jun 3, 2025
David9857 pushed a commit to David9857/vllm-ascend that referenced this pull request Jun 3, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
What this PR does / why we need it?
Fix the bugs that occur when running the DeepSeek model in engine v1.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
CI passed with newly added and existing tests.
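For a quick local check of the fixed path, something like the following sketch could be used; it assumes a vLLM build with the vllm-ascend plugin installed, and the model name and token budget are illustrative placeholders, not values taken from this PR.

```python
# Hypothetical local check for chunked prefill on the v1 engine.
# The model checkpoint and max_num_batched_tokens value are placeholders.
import os

os.environ["VLLM_USE_V1"] = "1"  # select the v1 engine

from vllm import LLM, SamplingParams

llm = LLM(
    model="deepseek-ai/DeepSeek-V2-Lite",  # placeholder DeepSeek checkpoint
    trust_remote_code=True,
    enable_chunked_prefill=True,           # exercise the chunked-prefill path
    max_num_batched_tokens=2048,           # small budget so long prompts get chunked
)

outputs = llm.generate(
    ["Summarize chunked prefill in one sentence."],
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```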