Conversation

@simondanielsson (Contributor) commented Sep 17, 2025

Purpose

Closes #25071.

Test Plan

  1. When using whisper:
vllm serve openai/whisper-large-v3

Logs should no longer mention "Chunked prefill is enabled with ...":

(APIServer pid=3140911) INFO 09-17 12:37:08 [scheduler.py:222] Chunked prefill is enabled with max_num_batched_tokens=8192.
(APIServer pid=3140911) INFO 09-17 12:37:10 [__init__.py:2790] Encoder-decoder models do not support chunked prefill nor prefix caching; disabling both.

The expected output is simply:

(APIServer pid=3140911) INFO 09-17 12:37:10 [__init__.py:2790] Encoder-decoder models do not support chunked prefill nor prefix caching; disabling both.
  2. The change should leave the resulting SchedulerConfig and VllmConfig unchanged. Verify with new tests (see the illustrative sketch below).
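
For illustration, a minimal test along these lines could check that building the engine config for Whisper leaves both features disabled. This is only a sketch: the exact tests added in this PR may differ, and the names used (EngineArgs.create_engine_config, enable_chunked_prefill, enable_prefix_caching) are assumptions based on the discussion here.

```python
# Sketch only -- not necessarily the exact test added in this PR.
from vllm.engine.arg_utils import EngineArgs


def test_encoder_decoder_disables_chunked_prefill_and_prefix_caching():
    # Build the full engine config for an encoder-decoder (Whisper) model.
    config = EngineArgs(model="openai/whisper-large-v3").create_engine_config()
    # Both features should come out disabled, with no intermediate state change.
    assert not config.scheduler_config.enable_chunked_prefill
    assert not config.cache_config.enable_prefix_caching
```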

Test Result

  1. Command:
  • Tested on GPU: L4.
  • Output from "test" command:
(vllm) danielssonsimon@XXXXXX:~/code/vllm$ vllm serve openai/whisper-large-v3
INFO 09-17 18:43:30 [__init__.py:216] Automatically detected platform cuda.
(APIServer pid=49917) INFO 09-17 18:43:33 [api_server.py:1813] vLLM API server version 0.10.2rc3.dev169+ge3db5ebb6.d20250917
(APIServer pid=49917) INFO 09-17 18:43:33 [utils.py:328] non-default args: {'model_tag': 'openai/whisper-large-v3', 'model': 'openai/whisper-large-v3'}
(APIServer pid=49917) INFO 09-17 18:43:42 [__init__.py:707] Resolved architecture: WhisperForConditionalGeneration
(APIServer pid=49917) `torch_dtype` is deprecated! Use `dtype` instead!
(APIServer pid=49917) INFO 09-17 18:43:42 [__init__.py:1762] Using max model len 448
(APIServer pid=49917) INFO 09-17 18:43:43 [scheduler.py:197] Encoder-decoder models do not support chunked prefill nor prefix caching; disabling both.
Fetching 1 files: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 11915.64it/s]
  2. New tests pass locally.

Essential Elements of an Effective PR Description Checklist
  • The purpose of the PR, such as "Fix some issue (link existing issues this PR will resolve)".
  • The test plan, such as providing test command.
  • The test results, such as pasting the results comparison before and after, or e2e results
  • (Optional) The necessary documentation update, such as updating supported_models.md and examples for a new model.
  • (Optional) Release notes update. If your change is user facing, please update the release notes draft in the Google Doc.

@simondanielsson simondanielsson changed the title [Bug]: Clean up chunked prefill logging when using whisper [Bugfix]: Clean up chunked prefill logging when using whisper Sep 17, 2025

mergify bot commented Sep 17, 2025

This pull request has merge conflicts that must be resolved before it can be
merged. Please rebase the PR, @simondanielsson.

https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

Comment on lines 96 to 102
is_encoder_decoder: bool = False
"""True if the model is an encoder-decoder model."""

Member

If this already exists in ModelConfig, why duplicate it here?

Contributor Author

True, we likely don't want to store it here as well.

Would an InitVar be sufficient here?

@hmellor (Member) commented Sep 18, 2025

The InitVar solution works.

However, in other cases like this (where two sibling configs interact) I've tended to perform those interactions in the parent's __post_init__, VllmConfig in this case. Would that work in this case?

Member

That's where I had it before this change, but then we end up with a confusing log message about these features being enabled, coming from SchedulerConfig's __post_init__ before VllmConfig's __post_init__ fixes it and disables them.

Contributor Author

Another option would be to emit the "Chunked prefill is enabled..." log from VllmConfig instead, but I'm not sure it makes sense to put it there.

Member

Ah I see, thank you for explaining. Let's stick with the InitVar.
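
For reference, a minimal sketch of the InitVar approach agreed on above (field names are illustrative, not the exact definitions in vllm/config): the flag is consumed in __post_init__ and never stored, so SchedulerConfig does not end up duplicating ModelConfig.is_encoder_decoder.

```python
# Minimal sketch of the InitVar approach; names are illustrative, not the
# exact fields in vllm/config.
from dataclasses import InitVar, dataclass


@dataclass
class SchedulerConfig:
    max_num_batched_tokens: int = 8192
    enable_chunked_prefill: bool = True
    # Provided by the parent config at construction time, but not stored as a field.
    is_encoder_decoder: InitVar[bool] = False

    def __post_init__(self, is_encoder_decoder: bool) -> None:
        if is_encoder_decoder:
            # Encoder-decoder models do not support chunked prefill, so disable it
            # up front and skip the "Chunked prefill is enabled ..." log entirely.
            self.enable_chunked_prefill = False
```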

@simondanielsson simondanielsson force-pushed the feature/clean-up-prefill-logging branch from fefc7ab to 4a48dc5 Compare September 18, 2025 13:03
@simondanielsson simondanielsson force-pushed the feature/clean-up-prefill-logging branch 3 times, most recently from 2abc703 to b721f6c Compare September 29, 2025 18:32
@simondanielsson simondanielsson force-pushed the feature/clean-up-prefill-logging branch from b721f6c to 46594df Compare September 30, 2025 06:34
@simondanielsson (Contributor Author)

@russelb conflicts fixed now - should be good to go after CI. Thanks!

@hmellor hmellor enabled auto-merge (squash) September 30, 2025 07:30
@hmellor hmellor merged commit e23cacd into vllm-project:main Sep 30, 2025
45 checks passed
@simondanielsson simondanielsson deleted the feature/clean-up-prefill-logging branch September 30, 2025 08:36
pdasigi pushed a commit to pdasigi/vllm that referenced this pull request Oct 2, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
yewentao256 pushed a commit that referenced this pull request Oct 3, 2025
Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: yewentao256 <zhyanwentao@126.com>
tomeras91 pushed a commit to tomeras91/vllm that referenced this pull request Oct 6, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: Tomer Asida <57313761+tomeras91@users.noreply.github.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 10, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>
lywa1998 pushed a commit to lywa1998/vllm that referenced this pull request Oct 20, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
alhridoy pushed a commit to alhridoy/vllm that referenced this pull request Oct 24, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
wangxiyuan pushed a commit to vllm-project/vllm-ascend that referenced this pull request Oct 24, 2025
### What this PR does / why we need it?
This is the step 1 of refactoring code to adapt with vllm main, and this
pr aligned with
vllm-project/vllm@17c540a

1. refactor deepseek to the latest code arch as of
vllm-project/vllm@17c540a
 
2. bunches of fixes due to vllm changes
- Fix `AscendScheduler` `__post_init__`, caused by
vllm-project/vllm#25075
- Fix `AscendScheduler` init got an unexpected arg `block_size`, caused
by vllm-project/vllm#26296
- Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by
vllm-project/vllm#23485
- Fix `MLAAttention` import, caused by
vllm-project/vllm#25103
- Fix `SharedFusedMoE` import, caused by
vllm-project/vllm#26145
- Fix `LazyLoader` import, caused by
vllm-project/vllm#27022
- Fix `vllm.utils.swap_dict_values` import, caused by
vllm-project/vllm#26990
- Fix `Backend` enum import, caused by
vllm-project/vllm#25893
- Fix `CompilationLevel` renaming to `CompilationMode` issue introduced
by vllm-project/vllm#26355
- Fix fused_moe ops, caused by
vllm-project/vllm#24097
- Fix bert model because of `inputs_embeds`, caused by
vllm-project/vllm#25922
- Fix MRope because of `get_input_positions_tensor` to
`get_mrope_input_positions`, caused by
vllm-project/vllm#24172
- Fix `splitting_ops` changes introduced by
vllm-project/vllm#25845
- Fix multi-modality changes introduced by
vllm-project/vllm#16229
- Fix lora bias dropping issue introduced by
vllm-project/vllm#25807
- Fix structured output break introduced by
vllm-project/vllm#26737

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
CI passed with existing test.


- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

---------

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
xuebwang-amd pushed a commit to xuebwang-amd/vllm that referenced this pull request Oct 24, 2025
…roject#25075)

Signed-off-by: simondanielsson <simon.danielsson99@hotmail.com>
Signed-off-by: xuebwang-amd <xuebwang@amd.com>

Labels

ready (ONLY add when PR is ready to merge/full CI is needed), v1

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Bug]: Clean up chunked prefill logging when using whisper

3 participants