[tune][release] Upgrade tune_torch_benchmark to v2 #56804
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Code Review
This pull request successfully upgrades the tune_torch_benchmark to use the Ray Train V2 API. The changes correctly adapt the benchmark to the new V2 patterns, such as using a train_driver_fn for tuning and leveraging TuneReportCallback. The refactoring to move train_loop to the module level improves code clarity. Additionally, the test configuration in release_tests.yaml has been updated to run faster and enable the V2 API, which is a sensible adjustment for release testing.
I have one minor suggestion to improve the robustness of the tune_torch function against a potential TypeError if it's called with a None config, as allowed by its signature.
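The kind of guard the review asks for can be sketched as follows. This is only an illustration: the review does not show the actual code, so `DEFAULT_CONFIG`, `resolve_config`, and the merge-with-defaults step are hypothetical stand-ins for whatever `tune_torch` does with its `config` argument.

```python
from typing import Optional

# Hypothetical defaults, not taken from the benchmark script.
DEFAULT_CONFIG = {"lr": 0.1, "batch_size": 64}


def resolve_config(config: Optional[dict] = None) -> dict:
    """Return a usable config even when the caller passes None."""
    # Unpacking None with ** raises TypeError, so normalise it to {} first.
    overrides = config if config is not None else {}
    # Build a new dict so the shared defaults are never mutated.
    return {**DEFAULT_CONFIG, **overrides}


if __name__ == "__main__":
    print(resolve_config(None))
    print(resolve_config({"lr": 0.5}))
```

Checking `is not None` instead of relying on truthiness keeps the intent explicit, although `config or {}` would behave the same way for a dict argument.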
release/air_tests/air_benchmarks/workloads/tune_torch_benchmark.py
justinvyu left a comment
Thanks!
release/air_tests/air_benchmarks/workloads/tune_torch_benchmark.py
1. Update to the Train V2 Train+Tune integration API.
2. Pass the TuneReportCallback to the trainer that is used in Tune so that its results are reported.
3. Reduce the number of runs / trials to make the test run faster.

Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
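A rough sketch of the Train+Tune pattern this PR describes is given below. It is not the benchmark's code: the placeholder loss, `LEARNING_RATES`, `run_tuning`, and the single-worker CPU setup are assumptions made for illustration, and the sketch assumes a Ray version that ships `ray.tune.integration.ray_train.TuneReportCallback` and treats Train V2 as opt-in through the `RAY_TRAIN_V2_ENABLED` environment variable. Ray imports are deferred into the functions so the flag is set before `ray.train` is loaded.

```python
import os

# Assumption: Train V2 is opt-in and the flag must be set before ray.train is imported.
os.environ.setdefault("RAY_TRAIN_V2_ENABLED", "1")

LEARNING_RATES = [0.01, 0.1]  # hypothetical search values


def train_loop(config):
    """Module-level per-worker training function (placeholder, no real model)."""
    import ray.train

    for epoch in range(config.get("epochs", 2)):
        # Stand-in metric; a real loop would compute this from a model.
        loss = config["lr"] / (epoch + 1)
        ray.train.report({"loss": loss, "epoch": epoch})


def train_driver_fn(config):
    """Runs inside each Tune trial and launches one Train run."""
    import ray.train
    from ray.train.torch import TorchTrainer
    from ray.tune.integration.ray_train import TuneReportCallback

    trainer = TorchTrainer(
        train_loop,
        train_loop_config=config,
        scaling_config=ray.train.ScalingConfig(num_workers=1, use_gpu=False),
        # The callback forwards results reported by the Train run to Tune.
        run_config=ray.train.RunConfig(callbacks=[TuneReportCallback()]),
    )
    trainer.fit()


def run_tuning():
    """Launch one Tune trial per learning rate."""
    import ray.tune

    tuner = ray.tune.Tuner(
        train_driver_fn,
        param_space={"lr": ray.tune.grid_search(LEARNING_RATES)},
    )
    return tuner.fit()


if __name__ == "__main__":
    run_tuning()
```

Keeping `train_loop` at module level, as the review notes, means the driver function only has to build and run the trainer.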