
@liulehui liulehui commented Sep 22, 2025

Why are these changes needed?

  1. In Train V2, Tune runs trials of a train_driver_fn that is passed to the Tuner, instead of a Trainer instance as in V1; see the latest doc.
  2. Pass the TuneReportCallback to the trainer used inside Tune so that its reported results reach Tune.
  3. Reduce the number of runs / trials to make the test run faster.
  4. Example run: https://buildkite.com/ray-project/release/builds/59578#019973ab-dee6-407c-bc0e-702ee9247ced
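
For readers unfamiliar with the V2 pattern, here is a minimal sketch of points 1 and 2, not the benchmark's actual code. The training loop body, the metric, the search space, and the `main` helper are illustrative assumptions; it also assumes Train V2 is enabled (for example with `RAY_TRAIN_V2_ENABLED=1` in the environment) and that Ray with the Train, Tune, and Torch dependencies is installed.

```python
from typing import Any, Dict


def train_loop(config: Dict[str, Any]) -> None:
    """Per-worker training loop (hypothetical stand-in for the benchmark's loop)."""
    import ray.train

    for epoch in range(config["epochs"]):
        # Placeholder metric; a real loop would train a model here.
        loss = config["lr"] / (epoch + 1)
        ray.train.report({"loss": loss, "epoch": epoch})


def train_driver_fn(config: Dict[str, Any]) -> None:
    """Runs once per Tune trial and launches one Train run for that trial."""
    import ray.train
    from ray.train.torch import TorchTrainer
    from ray.tune.integration.ray_train import TuneReportCallback

    trainer = TorchTrainer(
        train_loop,
        train_loop_config=config["train_loop_config"],
        scaling_config=ray.train.ScalingConfig(num_workers=config["num_workers"]),
        # The callback forwards metrics reported by the Train run to Tune.
        run_config=ray.train.RunConfig(callbacks=[TuneReportCallback()]),
    )
    trainer.fit()


def main(num_trials: int = 2) -> None:
    """Hand the driver function (not a Trainer instance) to the Tuner."""
    import ray.tune

    # Illustrative search space; the values are assumptions.
    param_space = {
        "num_workers": 2,
        "train_loop_config": {"lr": ray.tune.loguniform(1e-4, 1e-1), "epochs": 2},
    }
    tuner = ray.tune.Tuner(
        train_driver_fn,
        param_space=param_space,
        tune_config=ray.tune.TuneConfig(num_samples=num_trials),
    )
    results = tuner.fit()
    print(results.get_best_result(metric="loss", mode="min").config)
```

Calling `main()` on a machine with Ray available starts the trials. The Ray imports sit inside the functions so that the module can be imported without Ray being initialized.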

Related issue number

Checks

  • I've signed off every commit (by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Lehui Liu <lehui@anyscale.com>

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request successfully upgrades the tune_torch_benchmark to use the Ray Train V2 API. The changes correctly adapt the benchmark to the new V2 patterns, such as using a train_driver_fn for tuning and leveraging TuneReportCallback. The refactoring to move train_loop to the module level improves code clarity. Additionally, the test configuration in release_tests.yaml has been updated to run faster and enable the V2 API, which is a sensible adjustment for release testing.

I have one minor suggestion to improve the robustness of the tune_torch function against a potential TypeError if it's called with a None config, as allowed by its signature.
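
As an illustration of the kind of guard this suggestion describes, here is a minimal sketch. The real signature and defaults of tune_torch are not shown in this conversation, so `resolve_config` and `DEFAULT_CONFIG` are hypothetical names and values used only to demonstrate the pattern.

```python
from typing import Any, Dict, Optional

# Hypothetical defaults; the benchmark's real defaults are not shown here.
DEFAULT_CONFIG: Dict[str, Any] = {"lr": 0.1, "epochs": 2}


def resolve_config(config: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Merge user overrides into the defaults, treating None as 'no overrides'."""
    merged = dict(DEFAULT_CONFIG)
    # `config or {}` avoids the TypeError that dict.update(None) would raise.
    merged.update(config or {})
    return merged
```

With this guard, `resolve_config(None)` returns a copy of the defaults, and `resolve_config({"lr": 0.01})` overrides only the learning rate.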

Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Lehui Liu <lehui@anyscale.com>
@ray-gardener ray-gardener bot added the tune (Tune-related issues) and release-test (release test) labels Sep 23, 2025
@liulehui liulehui requested a review from a team September 23, 2025 16:34

@justinvyu justinvyu left a comment

Thanks!

Signed-off-by: Lehui Liu <lehui@anyscale.com>
@liulehui liulehui added the go (add ONLY when ready to merge, run all tests) label Sep 23, 2025
@justinvyu justinvyu merged commit ba9fbf9 into ray-project:master Sep 24, 2025
7 checks passed
elliot-barn pushed a commit that referenced this pull request Sep 27, 2025
1. Update to Train V2 Train+Tune integration API.
2. Pass in the TuneReportCallback for the trainer that is used in Tune to
report results.
3. Reduced the number of runs / trials to make the test run faster

---------

Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
dstrodtman pushed a commit that referenced this pull request Oct 6, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Douglas Strodtman <douglas@anyscale.com>
justinyeh1995 pushed a commit to justinyeh1995/ray that referenced this pull request Oct 20, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
landscapepainter pushed a commit to landscapepainter/ray that referenced this pull request Nov 17, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Future-Outlier pushed a commit to Future-Outlier/ray that referenced this pull request Dec 7, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
Signed-off-by: Future-Outlier <eric901201@gmail.com>
