-
-
Notifications
You must be signed in to change notification settings - Fork 11k
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): name change compilation level to compilation mode, deprecation compilation level #26355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This pull request has merge conflicts that must be resolved before it can be |
4753011 to
e347d4d
Compare
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
…ops and utils_/test_utils.py::test_dict_args Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
211fe4a to
4a53a64
Compare
| CompilationMode.STOCK_TORCH_COMPILE, | ||
| CompilationMode.DYNAMO_TRACE_ONCE, | ||
| CompilationMode.VLLM_COMPILE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are all of these 1:1?
If they are... DYNAMO_AS_IS didn't include Inductor previously, but I would expect STOCK_TORCH_COMPILE to include Inductor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly LGTM, I am just confused about if Mode.STOCK_TORCH_COMPILE implies torch.compile with inductor. I am hoping it is compositional and that it uses whatever "compilation_config.backend" is. A test would convince me that this is what happens (doesn't need to be in this PR)
I think we ended up making inductor the default for all modes (except NONE) in the previous PR (#26502). Agreed we should add a test, @morrison-turnansky can you open a follow-up for that when you get a chance? But yeah the goal of the previous PR was to respect |
|
Issue created: #26911 |
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: bbartels <benjamin@bartels.dev>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…m-project#27260) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
…m-project#27260) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: Alberto Perdomo <aperdomo@redhat.com>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
### What this PR does / why we need it? This is the step 1 of refactoring code to adapt with vllm main, and this pr aligned with vllm-project/vllm@17c540a 1. refactor deepseek to the latest code arch as of vllm-project/vllm@17c540a 2. bunches of fixes due to vllm changes - Fix `AscendScheduler` `__post_init__`, caused by vllm-project/vllm#25075 - Fix `AscendScheduler` init got an unexpected arg `block_size`, caused by vllm-project/vllm#26296 - Fix `KVCacheManager` `get_num_common_prefix_blocks` arg, caused by vllm-project/vllm#23485 - Fix `MLAAttention` import,caused by vllm-project/vllm#25103 - Fix `SharedFusedMoE` import, caused by vllm-project/vllm#26145 - Fix `LazyLoader` improt, caused by vllm-project/vllm#27022 - Fix `vllm.utils.swap_dict_values` improt, caused by vllm-project/vllm#26990 - Fix `Backend` enum import, caused by vllm-project/vllm#25893 - Fix `CompilationLevel` renaming to `CompilationMode` issue introduced by vllm-project/vllm#26355 - Fix fused_moe ops, caused by vllm-project/vllm#24097 - Fix bert model because of `inputs_embeds`, caused by vllm-project/vllm#25922 - Fix MRope because of `get_input_positions_tensor` to `get_mrope_input_positions`, caused by vllm-project/vllm#24172 - Fix `splitting_ops` changes introduced by vllm-project/vllm#25845 - Fix multi-modality changes introduced by vllm-project/vllm#16229 - Fix lora bias dropping issue introduced by vllm-project/vllm#25807 - Fix structured ouput break introduced by vllm-project/vllm#26737 ### Does this PR introduce _any_ user-facing change? ### How was this patch tested? CI passed with existing test. - vLLM version: v0.11.0rc3 - vLLM main: https://github.com/vllm-project/vllm/commit/v0.11.0 --------- Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: Icey <1790571317@qq.com> Co-authored-by: Icey <1790571317@qq.com>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: xuebwang-amd <xuebwang@amd.com>
…m-project#27260) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…m-project#27260) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
): name change compilation level to compilation mode, deprecation compilation level (vllm-project#26355) Signed-off-by: morrison-turnansky <mturnans@redhat.com> Signed-off-by: Morrison Turnansky <mturnans@redhat.com> Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
…m-project#27260) Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Signed-off-by: 0xrushi <6279035+0xrushi@users.noreply.github.com>
Purpose
See #20283 (comment)
The purpose of this PR is to perform variable name changes and deprecation warnings for compilation level to compilation mode. Enum values names are also changed. No true changes of behavior nor breaking changes have occurred.
Test Plan
Test Result
Essential Elements of an Effective PR Description Checklist
supported_models.mdandexamplesfor a new model.