[main] Fix main to work with a newer vLLM commit #3544
Conversation
👋 Hi! Thank you for contributing to the vLLM Ascend project. The following points will speed up your PR merge:

If CI fails, you can run the linting and testing checks locally according to Contributing and Testing.
Code Review
This pull request aims to align the codebase with a newer version of vLLM. The changes are extensive and touch many parts of the system, introducing conditional logic to support multiple vLLM versions. This is a significant effort to maintain compatibility.

However, I have a few concerns. A number of tests have been skipped with a 'TODO' message; while this can be a temporary measure, it's crucial to track these and re-enable them promptly to avoid regressions, especially for correctness and end-to-end tests. I've also found a critical issue with duplicated code in vllm_ascend/worker/model_runner_v1.py, which needs to be fixed. A sketch of the version-gating pattern involved follows below.
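For context, here is a hedged sketch of the multi-version conditional pattern the review refers to. It assumes a `vllm_version_is` helper in `vllm_ascend.utils`; the helper name and the version string "0.11.0" are illustrative assumptions, not confirmed by this diff:

```python
# Hedged sketch of multi-version support, not the PR's actual code.
# ASSUMPTION: vllm_ascend.utils exposes a `vllm_version_is` helper and
# "0.11.0" is a plausible pinned version; both are illustrative.
from vllm_ascend.utils import vllm_version_is

if vllm_version_is("0.11.0"):
    # Pinned/older vllm: keep the previous code path.
    code_path = "legacy"
else:
    # Newer vllm commit: follow the refactored APIs.
    code_path = "main"
```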
The diff fragment under review:

```python
# In multi-DP scenarios, there may be situations where all DP groups are executing dummy runs.
# If sequence parallelism is enabled, it is essential to ensure that num_tokens is divisible by tp_size.
if self.use_aclgraph and enable_sp(self.vllm_config):
    tp_size = self.vllm_config.parallel_config.tensor_parallel_size
    num_tokens = math.ceil(num_tokens / tp_size) * tp_size
```
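As a quick sanity check of the rounding above — illustrative values only, independent of the surrounding runner code:

```python
import math

tp_size = 4  # illustrative tensor-parallel size
for num_tokens in (1, 4, 10):
    padded = math.ceil(num_tokens / tp_size) * tp_size
    print(f"{num_tokens} -> {padded}")  # 1 -> 4, 4 -> 4, 10 -> 12
```

Padding up to the next multiple of tp_size ensures the sequence-parallel split assigns every rank an equal share of tokens.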
This pull request has conflicts, please resolve those before we can evaluate the pull request.
* fix bert model
* fix guided decoding
* revert skipped e2e test
* fix lora vllm-project/vllm#25807
* fix vl

Signed-off-by: MengqingCao <cmq0113@163.com>
Signed-off-by: Icey <1790571317@qq.com>
Offline external launcher and sleep mode accuracy: https://github.com/vllm-project/vllm-ascend/actions/runs/18650189290/job/53166509381?pr=3544
This pull request has conflicts, please resolve those before we can evaluate the pull request.
What this PR does / why we need it?
[CI] Upgrade vllm to the newest commit and adapt to the `vllm.utils` changes (vllm#26908); a compatibility sketch follows below.
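As illustration only, a minimal sketch of the import-compatibility pattern such an upgrade typically needs. It assumes vllm#26908 moved helpers out of `vllm.utils` into submodules; the symbol `cdiv` and the path `vllm.utils.math_utils` are assumptions, not confirmed by this PR:

```python
# Hedged sketch: guard imports across vllm versions.
# ASSUMPTION: vllm#26908 relocated helpers such as `cdiv`; the exact
# destination module (`vllm.utils.math_utils`) is illustrative.
try:
    from vllm.utils.math_utils import cdiv  # newer vllm layout (assumed)
except ImportError:
    from vllm.utils import cdiv  # older vllm layout

# cdiv(a, b) == ceil(a / b) for positive ints, e.g. cdiv(10, 4) == 3
```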
Does this PR introduce any user-facing change?
N/A
How was this patch tested?
CI passed with newly added and existing tests.