[v0.9.1][Fix] Fix block table shape #1297

yiz-liu · 2025-06-19T09:09:23Z

What this PR does / why we need it?

This fixes the shape of block_table; the incorrect shape was introduced by the hybrid kv groups change several weeks ago.

An error is raised when prefix cache (eager mode or not) and the Ascend Scheduler are enabled at the same time. Sending two identical requests is enough to reproduce it.
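The reproduction described above can be sketched roughly as follows. This is a minimal sketch, not the author's test script: it assumes the Ascend Scheduler is switched on through the `additional_config` engine option with an `ascend_scheduler_config` entry, and the helper names `build_engine_kwargs` and `reproduce`, as well as the model and prompt passed to them, are hypothetical placeholders.

```python
from typing import Any


def build_engine_kwargs(model: str, enforce_eager: bool = False) -> dict[str, Any]:
    """Engine arguments for the failing combination.

    The option names are assumptions based on vLLM / vLLM Ascend conventions;
    check them against the version you are running.
    """
    return {
        "model": model,
        # Prefix cache on: the second identical request should hit the cache.
        "enable_prefix_caching": True,
        # The PR says the error appears in eager mode or not.
        "enforce_eager": enforce_eager,
        # Assumed switch for the Ascend Scheduler.
        "additional_config": {"ascend_scheduler_config": {"enabled": True}},
    }


def reproduce(model: str, prompt: str) -> None:
    """Send the same prompt twice, as described in the PR."""
    # Imported lazily because it needs vLLM with the Ascend plugin installed.
    from vllm import LLM, SamplingParams

    llm = LLM(**build_engine_kwargs(model))
    params = SamplingParams(max_tokens=16)
    llm.generate([prompt], params)  # first request fills the prefix cache
    llm.generate([prompt], params)  # identical request, expected to trigger the error before the fix
```

The prompt presumably needs to be long enough to fill at least one KV cache block, otherwise the second request may not reuse any cached prefix.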

Backported: #1446

Does this PR introduce any user-facing change?

How was this patch tested?

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>

yiz-liu · 2025-06-26T03:15:40Z

I have tested this myself and observed no accuracy issues. However, I recommend adding accuracy tests for this scenario to catch any potential bugs. We should look into it, @MengqingCao.

@ganyi1996ppo @wangxiyuan Please review and merge this at your earliest convenience, as this bug can be triggered easily and has a broad impact.

…heduler (#1446)

### What this PR does / why we need it?
This fixes the shape of block_table; the incorrect shape was introduced by the hybrid kv groups change several weeks ago.

An error is raised when prefix cache (eager mode or not) and the Ascend Scheduler are enabled at the same time. Sending two identical requests is enough to reproduce it.

v0.9.1: #1297

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested manually

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>


### What this PR does / why we need it?
Fix the shape of the `npu_moe_init_routing` input parameters to support aclgraph mode on qwen3-moe.

In addition to this PR, resolving the `gatherv3` error might be necessary. See related PRs #1297 and #1446.

Thanks to @yiz-liu for providing the idea.

### Does this PR introduce _any_ user-facing change?
No

### How was this patch tested?
Tested on Qwen3-30B-A3B

Closes: #1368

---------

Signed-off-by: ApsarasX <apsarax@outlook.com>
Signed-off-by: Yikun Jiang <yikunkero@gmail.com>
Co-authored-by: Yizhou Liu <liu_yizhou@outlook.com>
Co-authored-by: Yikun Jiang <yikunkero@gmail.com>


yiz-liu changed the title ~~Fix blcok table shape~~ [WIP] Fix blcok table shape Jun 19, 2025

yiz-liu changed the title ~~[WIP] Fix blcok table shape~~ [WIP] Fix block table shape Jun 20, 2025

MengqingCao mentioned this pull request Jun 23, 2025

[Bug]: Prefix cache feature does not work with the Ascend Scheduler #1350

Closed

ApsarasX mentioned this pull request Jun 23, 2025

[Bugfix] Support Qwen3-MOE on aclgraph mode #1381

Merged

Fix blcok table shape

e8ba3e2

Signed-off-by: Yizhou Liu <liu_yizhou@outlook.com>

yiz-liu force-pushed the fix-prefix-cache branch from a0ba5ef to e8ba3e2 Compare June 26, 2025 01:41

yiz-liu changed the title ~~[WIP] Fix block table shape~~ [Fix] Fix block table shape Jun 26, 2025

yiz-liu changed the title ~~[Fix] Fix block table shape~~ [v0.9.1][Fix] Fix block table shape Jun 26, 2025

wangxiyuan approved these changes Jun 26, 2025


wangxiyuan merged commit 105d2df into vllm-project:v0.9.1-dev Jun 27, 2025
16 checks passed

yiz-liu deleted the fix-prefix-cache branch June 27, 2025 03:01

Yikun mentioned this pull request Jun 29, 2025

[Core] Fix block table shape to make Prefix cache work with Ascend scheduler #1446

Merged

Yikun added no-main and removed no-main labels Jul 14, 2025

yiz-liu commented Jun 19, 2025 (edited by Yikun)

3 participants
