@whx-sjtu
Collaborator

@whx-sjtu whx-sjtu commented Jun 28, 2025

When AscendScheduler is used with prefix caching enabled and chunked prefill disabled, there is an accuracy problem because mla_v1 has no branch to handle this scenario. This PR fixes it.
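
To illustrate the kind of gap described here, the following is a minimal, hypothetical sketch and not the actual mla_v1 code. The enum `AttnState`, the function `select_prefill_path`, and the returned path names are all assumptions made up for this example. It only shows the general idea: a prefill request whose prefix was found in the cache needs its own dispatch branch, so that the new tokens attend to the cached context rather than being treated as a cache-free prefill.

```python
from enum import Enum, auto


class AttnState(Enum):
    """Hypothetical stand-in for the attention states a scheduler can produce."""
    PREFILL_NO_CACHE = auto()
    PREFILL_CACHE_HIT = auto()
    CHUNKED_PREFILL = auto()
    DECODE_ONLY = auto()


def select_prefill_path(state: AttnState) -> str:
    """Return the name of an illustrative kernel path for the given state.

    Assumption for this sketch: a cache-hit prefill can share the path that
    already attends over previously cached context. The real fix may differ.
    """
    if state is AttnState.PREFILL_NO_CACHE:
        return "full_prefill"
    if state in (AttnState.CHUNKED_PREFILL, AttnState.PREFILL_CACHE_HIT):
        return "prefill_with_cached_context"
    if state is AttnState.DECODE_ONLY:
        return "decode"
    # Failing loudly on an unknown state avoids silently wrong outputs.
    raise NotImplementedError(f"unhandled attention state: {state}")
```

Raising on an unhandled state, rather than falling through to a default path, is one way to surface a missing branch as an error instead of as an accuracy regression.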

Backport: #1498

Signed-off-by: whx-sjtu <2952154980@qq.com>
@MengqingCao
Collaborator

Thanks for this fix. Please also port it to main.

@wangxiyuan wangxiyuan merged commit 9acc082 into vllm-project:v0.9.1-dev Jun 28, 2025
15 checks passed
@whx-sjtu whx-sjtu deleted the fix_prefix_cache_accu_091 branch June 28, 2025 06:53
@Yikun Yikun changed the title [BugFix] Fix accuray bug of prefix-caching. [V0.9.1][BugFix] Address PrefillCacheHit state to fix prefix cache accuracy bug Jun 29, 2025
@Yikun Yikun added the no-main label Jul 14, 2025