Upgrade to new vllm commit #3719
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:
If CI fails, you can run linting and testing checks locally according to Contributing and Testing.
This pull request has conflicts; please resolve them before we can evaluate the pull request.
…lm main Signed-off-by: MengqingCao <cmq0113@163.com> Signed-off-by: Icey <1790571317@qq.com>
Thanks for this hard work! I only have one question: I think this PR will not work with Qwen3-Next, because I actually missed Qwen3-Next in the previous PR. Could you test Qwen3-Next with #3741 on top of this PR?
    if not model_config.is_multimodal_model and \
            structured_outputs_config.backend == "auto" and \
            not getattr(scheduler_config, "scheduler_delay_factor", 0) > 0 and \
            not scheduler_config.send_delta_data and \
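Since `SchedulerConfig.send_delta_data` was removed upstream, one way to keep a check like this working on both old and new vLLM is to read the attribute defensively. A minimal sketch reusing the names from the quoted snippet (the function name is made up for illustration, not the code merged here):

```python
# Sketch only: mirrors the quoted check, but reads send_delta_data defensively
# so it works whether or not SchedulerConfig still defines the attribute.
def should_use_auto_structured_outputs(model_config, structured_outputs_config,
                                       scheduler_config) -> bool:
    send_delta_data = getattr(scheduler_config, "send_delta_data", False)
    return (not model_config.is_multimodal_model
            and structured_outputs_config.backend == "auto"
            and not getattr(scheduler_config, "scheduler_delay_factor", 0) > 0
            and not send_delta_data)
```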
Maybe we need to guarantee compatibility with 0.11.0 instead of removing it directly?
++,
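If the goal is to keep 0.11.0 working rather than dropping the attribute outright, a version gate is one option. A rough sketch using `importlib.metadata` and `packaging`; the constant and helper names are assumptions, not code from this PR:

```python
# Sketch: only consult send_delta_data on the vLLM release that still defines it.
from importlib.metadata import version

from packaging.version import Version

VLLM_IS_0110 = Version(version("vllm")).base_version == "0.11.0"


def send_delta_data_enabled(scheduler_config) -> bool:
    # Newer vLLM removed SchedulerConfig.send_delta_data, so treat it as False there.
    if VLLM_IS_0110:
        return bool(getattr(scheduler_config, "send_delta_data", False))
    return False
```

In practice, an existing version-check helper in vllm-ascend, if there is one, would replace the manual `packaging` comparison.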
### What this PR does / why we need it?
Upgrade to new vllm commit: vllm-project/vllm@c9461e0
- Fix many imports, caused by vllm-project/vllm#26908
- Fix import ```sha256```, caused by vllm-project/vllm#27169
- Remove ```SchedulerConfig.send_delta_data```, caused by vllm-project/vllm#27142
- Fix ```FusedMoE``` because of dual stream execution, caused by vllm-project/vllm#26440

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with newly added/existing tests.

- vLLM version: v0.11.0rc3
- vLLM main: vllm-project/vllm@17c540a

---------
Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Co-authored-by: MengqingCao <cmq0113@163.com>
Signed-off-by: nwpu-zxr <zhouxuerong2@huawei.com>
### What this PR does / why we need it?
vllm-project/vllm@c9461e0
- Fix ```spec decode rejection sampler```, caused by vllm-project/vllm#26060
- Fix some ```import```s, caused by vllm-project/vllm#27374
- Fix ```scheduler_config.send_delta_data```, caused by #3719
- Fix ```init_with_cudagraph_sizes```, caused by vllm-project/vllm#26016
- Fix ```vl model``` for the replacement of PatchEmbed's conv3d with a linear layer, caused by vllm-project/vllm#27418

### Does this PR introduce _any_ user-facing change?
N/A

### How was this patch tested?
CI passed with newly added/existing tests.

- vLLM version: v0.11.0rc3
- vLLM main: vllm-project/vllm@c9461e0

---------
Signed-off-by: Icey <1790571317@qq.com>
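For the `sha256` import fix listed above, the usual approach is a try/except shim so both the pre- and post-refactor layouts import cleanly. The new module path below is a placeholder assumption, not necessarily where vLLM actually moved it:

```python
# Sketch: prefer a hypothetical post-refactor location, fall back to the old
# vllm.utils export. Check the installed vLLM for the real module path.
try:
    from vllm.utils.hashing import sha256  # placeholder new location
except ImportError:
    from vllm.utils import sha256  # location before the utils refactor
```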
What this PR does / why we need it?
Upgrade to new vllm commit: vllm-project/vllm@c9461e0
- Fix many imports of `vllm.utils`, caused by vllm#26908
- Fix import `sha256`, caused by [Misc] Move utils to avoid conflicts with stdlib, and move tests vllm#27169
- Remove `SchedulerConfig.send_delta_data`, caused by [V0 Deprecation] Remove V0 executors vllm#27142
- Fix `FusedMoE` because of dual stream execution, caused by [Performance] Dual stream execution of "shared_experts" and "selected_experts" inside FusedMoE vllm#26440

Does this PR introduce any user-facing change?
N/A
How was this patch tested?
CI passed with newly added/existing tests.
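For context on the `FusedMoE` item above: the upstream optimization overlaps the shared-experts branch with the routed-experts branch on separate device streams. The sketch below only illustrates that general dual-stream pattern with `torch.cuda` streams; it is not vLLM's implementation, and on Ascend the analogous `torch.npu` stream APIs would be used:

```python
import torch

# Illustration of overlapping two independent branches on separate CUDA streams.
# Real kernels also need to manage tensor lifetimes across streams (record_stream).
side_stream = torch.cuda.Stream()


def dual_stream_forward(x, shared_experts, routed_experts):
    # Let the side stream wait for work already queued on the default stream.
    side_stream.wait_stream(torch.cuda.current_stream())
    with torch.cuda.stream(side_stream):
        shared_out = shared_experts(x)  # runs on the side stream
    routed_out = routed_experts(x)  # runs concurrently on the default stream
    # Re-synchronize before combining the two results.
    torch.cuda.current_stream().wait_stream(side_stream)
    return routed_out + shared_out
```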