[train][tune] Refactor and Reintroduce keras tune callback #57121
justinvyu
left a comment
Thanks!
Signed-off-by: Lehui Liu <lehui@anyscale.com>
9dff146 to 834726e
Signed-off-by: Lehui Liu <lehui@anyscale.com>
    def _save_and_report_checkpoint(
        self, metrics: Dict, checkpoint: TensorflowCheckpoint
    ):
        ray.tune.report(metrics, checkpoint=checkpoint)
Bug: Tune Callback Passes Invalid Argument
The TuneReportCheckpointCallback passes a checkpoint argument to ray.tune.report(). This causes a TypeError because ray.tune.report() does not accept a checkpoint parameter, unlike ray.train.report(). Tune's API handles checkpoints through a different mechanism.
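A claim like this can be checked against the installed version instead of being argued from memory. The following is a minimal sketch using the standard library's `inspect.signature`. The two `report_*` functions are hypothetical stand-ins written for this example, not Ray's real functions, and `accepts_keyword` is a helper name invented here.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with `name=` as a keyword argument."""
    params = inspect.signature(func).parameters
    param = params.get(name)
    if param is not None and param.kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts the keyword, though it may ignore it.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


# Hypothetical stand-ins for illustration only (assumptions, not Ray's API).
def report_metrics_only(metrics):
    return metrics


def report_with_checkpoint(metrics, *, checkpoint=None):
    return metrics, checkpoint


print(accepts_keyword(report_metrics_only, "checkpoint"))     # False
print(accepts_keyword(report_with_checkpoint, "checkpoint"))  # True
```

In an environment with Ray installed, the same helper could be called as `accepts_keyword(ray.tune.report, "checkpoint")`. The result depends on the Ray version in use, so it is not stated here.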
justinvyu
left a comment
Thanks!
…ct#57121) 1. This PR re-introduces `from ray.tune.integration.keras import TuneReportCheckpointCallback`. 2. This is mainly used for Tune-only usage and was tested with the script here: https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9 --------- Signed-off-by: Lehui Liu <lehui@anyscale.com>
…ct#57121) 1. This PR re-introduces `from ray.tune.integration.keras import TuneReportCheckpointCallback`. 2. This is mainly used for Tune-only usage and was tested with the script here: https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9 --------- Signed-off-by: Lehui Liu <lehui@anyscale.com> Signed-off-by: Josh Kodi <joshkodi@gmail.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: #57534, #57256, #56868, #56820, #56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by #57042 and #57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: ray-project#57534, ray-project#57256, ray-project#56868, ray-project#56820, ray-project#56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by ray-project#57042 and ray-project#57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: ray-project#57534, ray-project#57256, ray-project#56868, ray-project#56820, ray-project#56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by ray-project#57042 and ray-project#57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: xgui <xgui@anyscale.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: #57534, #57256, #56868, #56820, #56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by #57042 and #57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: elliot-barn <elliot.barnwell@anyscale.com>
…ct#57121) 1. This PR re-introduces `from ray.tune.integration.keras import TuneReportCheckpointCallback`. 2. This is mainly used for Tune-only usage and was tested with the script here: https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9 --------- Signed-off-by: Lehui Liu <lehui@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: ray-project#57534, ray-project#57256, ray-project#56868, ray-project#56820, ray-project#56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by ray-project#57042 and ray-project#57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: Aydin Abiar <aydin@anyscale.com>
…ct#57121) 1. This PR re-introduces `from ray.tune.integration.keras import TuneReportCheckpointCallback`. 2. This is mainly used for Tune-only usage and was tested with the script here: https://gist.github.com/liulehui/99a5560031c10c274c1d7d99247033d9 --------- Signed-off-by: Lehui Liu <lehui@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
Ports over the remaining unit tests that were marked as TODOs from this series of PRs: ray-project#57534, ray-project#57256, ray-project#56868, ray-project#56820, ray-project#56816. Notably: * `test_new_dataset_config -> test_data_integration` * `test_backend -> test_torch_trainer, test_worker_group` * `test_gpu -> test_torch_gpu` This PR also finishes migrating the Tune LightGBM/Keras examples which were unblocked by ray-project#57042 and ray-project#57121. --------- Signed-off-by: Justin Yu <justinvyu@anyscale.com> Signed-off-by: Future-Outlier <eric901201@gmail.com>
ray.tune.integration.keras import TuneReportCheckpointCallback
Why are these changes needed?
Related issue number
Checks
git commit -s) in this PR.
method in Tune, I've added it in doc/source/tune/api/ under the corresponding .rst file.