@liulehui liulehui commented Oct 2, 2025

  1. This PR re-introduces TuneReportCheckpointCallback, importable via: from ray.tune.integration.keras import TuneReportCheckpointCallback
  2. It is mainly intended for Tune-only usage and was tested with this script: https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9

@liulehui liulehui changed the title from "[train][tune] refactor and reintroduce keras tune callback" to "[train][tune] Refactor and Reintroduce keras tune callback" Oct 2, 2025
@liulehui liulehui marked this pull request as ready for review October 2, 2025 17:35
@liulehui liulehui requested review from a team as code owners October 2, 2025 17:35
@ray-gardener ray-gardener bot added the "tune" label (Tune-related issues) Oct 2, 2025
@justinvyu justinvyu left a comment

Thanks!

@liulehui liulehui added the "go" label (add ONLY when ready to merge, run all tests) Oct 6, 2025
Signed-off-by: Lehui Liu <lehui@anyscale.com>
@liulehui liulehui force-pushed the fix-keras-callback branch from 9dff146 to 834726e Compare October 6, 2025 16:04
def _save_and_report_checkpoint(
    self, metrics: Dict, checkpoint: TensorflowCheckpoint
):
    ray.tune.report(metrics, checkpoint=checkpoint)

Bug: Tune Callback Passes Invalid Argument

The TuneReportCheckpointCallback passes a checkpoint argument to ray.tune.report(). This causes a TypeError because ray.tune.report() does not accept a checkpoint parameter, unlike ray.train.report(). Tune's API handles checkpoints through a different mechanism.
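
A claim like the one above can be checked without running a full trial by inspecting the callee's signature. The sketch below is only an illustration: `report_metrics_only` and `report_with_checkpoint` are hypothetical stand-ins (neither is part of Ray), and the helper name `accepts_keyword` is an assumption made for this example.

```python
import inspect


def accepts_keyword(func, name: str) -> bool:
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params:
        # Positional-only parameters cannot be passed by keyword.
        return params[name].kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    # A **kwargs catch-all accepts arbitrary keyword arguments.
    return any(
        p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()
    )


# Hypothetical stand-ins for illustration only; these are NOT Ray functions.
def report_metrics_only(metrics):
    return metrics


def report_with_checkpoint(metrics, *, checkpoint=None):
    return metrics, checkpoint


if __name__ == "__main__":
    for fn in (report_metrics_only, report_with_checkpoint):
        print(fn.__name__, accepts_keyword(fn, "checkpoint"))
```

With Ray installed, calling `accepts_keyword(ray.tune.report, "checkpoint")` would answer the question for that installed version, which is a more reliable basis than an automated review comment.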

@justinvyu justinvyu left a comment

Thanks!

@justinvyu justinvyu merged commit f6a83c0 into ray-project:master Oct 6, 2025
6 checks passed
liulehui added a commit to liulehui/ray that referenced this pull request Oct 9, 2025
…ct#57121)

1. This PR re-introduces `TuneReportCheckpointCallback`, importable via
`from ray.tune.integration.keras import TuneReportCheckpointCallback`.
2. It is mainly intended for Tune-only usage and was tested with this script:
https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9

---------

Signed-off-by: Lehui Liu <lehui@anyscale.com>
justinvyu added a commit that referenced this pull request Oct 16, 2025
Ports over the remaining unit tests that were marked as TODOs from this
series of PRs: #57534, #57256, #56868, #56820, #56816.

Notably:
* `test_new_dataset_config -> test_data_integration`
* `test_backend -> test_torch_trainer, test_worker_group`
* `test_gpu -> test_torch_gpu`

This PR also finishes migrating the Tune LightGBM/Keras examples which
were unblocked by #57042 and
#57121.

---------

Signed-off-by: Justin Yu <justinvyu@anyscale.com>