Conversation

@MengqingCao MengqingCao (Collaborator) commented Jun 28, 2025

What this PR does / why we need it?

Add tests for chunked prefill and prefix cache on v1/AscendScheduler.

Covered scenarios:

  • Qwen/Qwen3-0.6B-Base and deepseek-ai/DeepSeek-V2-Lite-Chat --- multicard CI time increased by 19 min
    • V1 + default scheduler vs V1 + default scheduler + enable prefix cache
    • V1 + Ascend scheduler vs V1 + Ascend scheduler + enable prefix cache vs V1 + Ascend scheduler + enable prefix cache + enable chunked prefill
  • Qwen/Qwen3-0.6B-Base --- singlecard CI time increased by 8 min
    • V1 + Ascend scheduler vs V1 + Ascend scheduler + enable chunked prefill

Should be rebased after #1498 and #1446.
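The scenarios above all follow the same comparison shape: run identical prompts under a baseline scheduler configuration and under a variant, then check that the generations match. The sketch below is illustrative only; the dict keys and helper are assumptions and not the PR's actual fixtures (the real tests go through the runners in `tests/conftest.py`).

```python
# Minimal sketch of the scenario matrix and comparison check, assuming
# hypothetical config keys that mirror vLLM-style engine flags.
SCENARIOS = [
    # (Ascend scheduler, prefix cache, chunked prefill)
    {"ascend_scheduler": False, "prefix_cache": False, "chunked_prefill": False},
    {"ascend_scheduler": False, "prefix_cache": True,  "chunked_prefill": False},
    {"ascend_scheduler": True,  "prefix_cache": False, "chunked_prefill": False},
    {"ascend_scheduler": True,  "prefix_cache": True,  "chunked_prefill": False},
    {"ascend_scheduler": True,  "prefix_cache": True,  "chunked_prefill": True},
]


def assert_outputs_match(baseline: list[str], candidate: list[str]) -> None:
    """Fail if any generation under the variant config differs from the
    baseline run on the same prompts."""
    assert len(baseline) == len(candidate), "prompt count mismatch"
    for i, (b, c) in enumerate(zip(baseline, candidate)):
        assert b == c, f"output mismatch at prompt {i}: {b!r} != {c!r}"
```

In the real tests, each scenario would spin up a model runner with the corresponding flags and compare its outputs against the default-scheduler baseline.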

Does this PR introduce any user-facing change?

N/A

How was this patch tested?

CI passed with the newly added tests.

@codecov codecov bot commented Jun 28, 2025

Codecov Report

❌ Patch coverage is 46.15385% with 7 lines in your changes missing coverage. Please review.
✅ Project coverage is 34.17%. Comparing base (c30ddb8) to head (0c10291).
⚠️ Report is 609 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|--------------------------|---------|-------|
| tests/conftest.py        | 46.15%  | 7 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1505      +/-   ##
==========================================
+ Coverage   27.39%   34.17%   +6.77%     
==========================================
  Files          56       63       +7     
  Lines        6191     7328    +1137     
==========================================
+ Hits         1696     2504     +808     
- Misses       4495     4824     +329     
| Flag      | Coverage Δ |
|-----------|------------|
| unittests | 34.17% <46.15%> (+6.77%) ⬆️ |


@Yikun Yikun added the ready read for review label Jun 29, 2025
@github-actions github-actions bot added merge-conflicts and removed ready read for review labels Jun 30, 2025
@github-actions

This pull request has conflicts, please resolve those before we can evaluate the pull request.

@Yikun Yikun (Collaborator) left a comment


LGTM if CI passed

@Yikun Yikun added the ready read for review label Jun 30, 2025
@github-actions github-actions bot added merge-conflicts and removed ready read for review labels Jul 1, 2025
@github-actions

github-actions bot commented Jul 1, 2025

This pull request has conflicts, please resolve those before we can evaluate the pull request.

…uler

Signed-off-by: MengqingCao <cmq0113@163.com>
@Yikun Yikun merged commit 59237ea into vllm-project:main Jul 2, 2025
13 checks passed
ZhengWG pushed a commit to ZhengWG/vllm-ascend that referenced this pull request Jul 3, 2025
…uler (vllm-project#1505)

ZhengWG pushed a commit to ZhengWG/vllm-ascend that referenced this pull request Jul 3, 2025
…uler (vllm-project#1505)

@MengqingCao MengqingCao deleted the e2e branch July 8, 2025 02:04
chopper0126 pushed a commit to chopper0126/vllm-ascend that referenced this pull request Oct 16, 2025
…uler (vllm-project#1505)

Angazenn pushed a commit to Angazenn/vllm-ascend that referenced this pull request Oct 21, 2025
…uler (vllm-project#1505)
