[CI] make all multi-gpu weight loading tests run nightly #23792
Signed-off-by: Alex Yun <alexyun04@gmail.com>
Code Review
This pull request aims to move multi-GPU weight loading tests to a nightly schedule to reduce CI time. The current change marks the test step as optional in the Buildkite pipeline. However, this only disables the test from running on pull requests and does not configure it to run nightly. This creates a temporary but significant gap in test coverage until the nightly schedule is implemented. I have raised a high-severity concern about this and recommended including the nightly schedule configuration in this same PR to avoid introducing regressions.
    mirror_hardwares: [amdexperimental]
    working_dir: "/vllm-workspace/tests"
    num_gpus: 2
    optional: true
This change makes the 'Test Multi-GPU Weight Loading (AMD)' step optional, effectively disabling it from running on pull requests. While the goal is to move this to a nightly run, this pull request only implements the first part of that change.
Merging this PR as-is will create a gap in test coverage, as these multi-GPU tests will not run at all until the nightly schedule is configured to un-block this step. This could allow regressions to be introduced undetected.
It would be safer to include the changes for the nightly schedule in this same pull request to avoid any period where these important tests are disabled.
My understanding is that all optional tests are run automatically every night, so nothing else needs to be changed (#10465).
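The behavior described in this reply can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline generator: the `Step` class, the `steps_to_run` function, the `nightly` flag, and the step labels are all hypothetical names, and the only assumption taken from the thread is that steps marked `optional: true` are skipped on pull-request builds but included in the nightly run.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """Hypothetical stand-in for one pipeline step."""
    label: str
    optional: bool = False  # mirrors the `optional: true` key in the step config


def steps_to_run(steps, nightly):
    """Return the labels of steps that would run automatically.

    Assumption (from the reply above): optional steps are skipped on
    pull-request builds and included when the build is a nightly run.
    """
    return [s.label for s in steps if nightly or not s.optional]


# Hypothetical step labels, for illustration only.
steps = [
    Step("always-on test"),
    Step("multi-gpu weight loading test", optional=True),
]

print(steps_to_run(steps, nightly=False))  # pull-request build
print(steps_to_run(steps, nightly=True))   # nightly build
```

If the real pipeline follows this rule, marking a step optional is enough to defer it to nightly; if it does not, the coverage gap raised in the review would apply.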
…t#23792) Signed-off-by: Alex Yun <alexyun04@gmail.com>
cc @njhill
Purpose
Reduce CI time by deferring the multi-GPU weight loading tests to the nightly run. First part of #23669.
Test Plan
n/a