[Bugfix]: Clean up chunked prefill logging when using whisper #25075
Conversation
This pull request has merge conflicts that must be resolved before it can be merged.
vllm/config/scheduler.py (Outdated)

    is_encoder_decoder: bool = False
    """True if the model is an encoder-decoder model."""
If this already exists in ModelConfig, why duplicate it here?
True, we likely don't want to store it here as well.
Would an InitVar be sufficient here?
The InitVar solution works.
However, in other cases like this (where two sibling configs interact) I've tended to perform those interactions in the parent's __post_init__, VllmConfig in this case. Would that work in this case?
That's where I had it before this change, but we end up with a confusing log message about features being enabled coming from SchedulerConfig's __post_init__ before VllmConfig's __post_init__ fixes it and disables them.
Another option would be to emit the "Chunked prefill is enabled..." log from VllmConfig, but I'm not sure it makes sense to put it there.
Ah I see, thank you for explaining. Let's stick with the InitVar.
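For readers following along, here is a minimal sketch of the InitVar approach settled on above. It is not the actual vLLM code: the field names, defaults, and log wording are simplified assumptions.

```python
from dataclasses import InitVar, dataclass
import logging

logger = logging.getLogger(__name__)


@dataclass
class SchedulerConfig:
    # Simplified, hypothetical fields; the real vLLM SchedulerConfig has many more.
    max_num_batched_tokens: int = 2048
    enable_chunked_prefill: bool = True

    # InitVar: passed to __post_init__ for validation/logging only, never stored
    # on the instance, so the flag is not duplicated from ModelConfig.
    is_encoder_decoder: InitVar[bool] = False

    def __post_init__(self, is_encoder_decoder: bool) -> None:
        if is_encoder_decoder:
            # Encoder-decoder models (e.g. Whisper) do not support chunked
            # prefill, so disable it *before* any logging happens here.
            self.enable_chunked_prefill = False

        if self.enable_chunked_prefill:
            logger.info(
                "Chunked prefill is enabled with max_num_batched_tokens=%d.",
                self.max_num_batched_tokens,
            )
```

The parent config (VllmConfig in this discussion) would then forward the flag from ModelConfig when constructing the child, e.g. `SchedulerConfig(is_encoder_decoder=model_config.is_encoder_decoder)`, so the value is available during initialization without being stored twice.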
@russelb conflicts fixed now - should be good to go after CI. Thanks!
### What this PR does / why we need it?
This is step 1 of refactoring the code to adapt to vllm main; this PR is aligned with vllm-project/vllm@17c540a.
1. Refactor deepseek to the latest code arch as of vllm-project/vllm@17c540a.
2. Bunches of fixes due to vllm changes:
- Fix `AscendScheduler` `__post_init__`, caused by vllm-project/vllm#25075
- Fix `AscendScheduler` init got an unexpected arg `block_size`, caused by vllm-project/vllm#26296
- Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by vllm-project/vllm#23485
- Fix `MLAAttention` import, caused by vllm-project/vllm#25103
- Fix `SharedFusedMoE` import, caused by vllm-project/vllm#26145
- Fix `LazyLoader` import, caused by vllm-project/vllm#27022
- Fix `vllm.utils.swap_dict_values` import, caused by vllm-project/vllm#26990
- Fix `Backend` enum import, caused by vllm-project/vllm#25893
- Fix `CompilationLevel` renaming to `CompilationMode`, introduced by vllm-project/vllm#26355
- Fix fused_moe ops, caused by vllm-project/vllm#24097
- Fix bert model because of `inputs_embeds`, caused by vllm-project/vllm#25922
- Fix MRope because of `get_input_positions_tensor` becoming `get_mrope_input_positions`, caused by vllm-project/vllm#24172
- Fix `splitting_ops` changes introduced by vllm-project/vllm#25845
- Fix multi-modality changes introduced by vllm-project/vllm#16229
- Fix lora bias dropping issue introduced by vllm-project/vllm#25807
- Fix structured output break introduced by vllm-project/vllm#26737

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?
CI passed with existing tests.
- vLLM version: v0.11.0rc3
- vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: Icey <1790571317@qq.com>
Purpose
Closes #25071.
Test Plan
Logs should no longer mention "Chunked prefill is enabled with ...": expecting simply no such message from either SchedulerConfig or VllmConfig. Verify with new tests; see the test sketch below.
Test Result