-
Notifications
You must be signed in to change notification settings - Fork 3.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow logging in on_train_end
#15143
Comments
@lminer You can accumulate all the metrics from |
I want to have some way to run the general |
@carmocca I don't remember the exact criterions for when we allow to log in a hook or not. Perhaps there is a good reason, probably related to reduction as on_epoch/on_step is not meaningful post the epoch loops, but I'm not sure. @lminer You should be able to work around this issue for now by calling the logger directly (this is always possible): self.logger.experiment.add_scalar("best", value) # e.g. for tensorboard logger |
That's correct @awaelchli. |
Yes, makes sense. Then I suppose #15143 (comment) is becoming more like our official recommendation for such cases. Do you think it is meaningful to add a hint to our error messages that say "it is not allowed to self.log in x", where we would recommend "you can alternatively log with logger.experiment..."? |
Sure |
Hi @awaelchli, I would like to work on this issue. |
Sorry @Al3xDo, I was away. Yes, if you still want, you can take this up. |
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Fixes #15143
Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Fixes #15143
* update chlog * CI: Add remote fetch (#16001) Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local> (cherry picked from commit 37fe3f6) * Set the logger explicitly in tests (#15815) (cherry picked from commit 9ed43c6) * [App] Fix `AutoScaler` trying to replicate multiple works in a single machine (#15991) * dont try to replicate new works in the existing machine * update chglog * Update comment * Update src/lightning_app/components/auto_scaler.py * add test (cherry picked from commit c1d0156) * Fix typo in PR titles generated by github-actions bot (#16003) (cherry picked from commit 2dcebc2) * Update docker requirement from <=5.0.3,>=5.0.0 to >=5.0.0,<6.0.2 in /requirements (#16007) Update docker requirement in /requirements Updates the requirements on [docker](https://github.com/docker/docker-py) to permit the latest version. - [Release notes](https://github.com/docker/docker-py/releases) - [Commits](docker/docker-py@5.0.0...6.0.1) --- updated-dependencies: - dependency-name: docker dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 4083b20) * Update deepdiff requirement from <=5.8.1,>=5.7.0 to >=5.7.0,<6.2.3 in /requirements (#16006) Update deepdiff requirement in /requirements Updates the requirements on [deepdiff](https://github.com/seperman/deepdiff) to permit the latest version. - [Release notes](https://github.com/seperman/deepdiff/releases) - [Changelog](https://github.com/seperman/deepdiff/blob/master/docs/changelog.rst) - [Commits](seperman/deepdiff@5.7.0...6.2.2) --- updated-dependencies: - dependency-name: deepdiff dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 5e705fa) * app: update doctest_skip (#15997) simple Co-authored-by: hhsecond <sherin@grid.ai> (cherry picked from commit 4fea6bf) * CI: clean install & share pkg build (#15986) * abstract pkg build * share ci * syntax * Checkgroup * folders * whl 1st * doctest Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> (cherry picked from commit 18a4638) * Adding hint to the logger's error messages (#16034) Co-authored-by: Jirka Borovec <6035284+Borda@users.noreply.github.com> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Fixes #15143 (cherry picked from commit 7ce3825) * fix publish * Introduce `{Work,Flow}.lightningignore` (#15818) (cherry picked from commit edd2b42) * [App] Support running on multiple clusters (#16016) (cherry picked from commit d3a7226) * [App] Improve lightning connect experience (#16035) (cherry picked from commit e522a12) * Cleanup cluster waiting (#16054) (cherry picked from commit 6458a5a) * feature(cli): login flow fixes and improvements (#16052) Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> (cherry picked from commit ebe7848) * Add guards to cluster deletion from cli (#16053) Adds guards to cluster deletion. - If cluster has running apps -> throw an error - If cluster has stopped apps -> confirm w/ user that apps and logs will be deleted (cherry picked from commit 64d0ebb) * Load app before setting LIGHTNING_DISPATCHED (#16057) (cherry picked from commit 8d3339a) * [App] Hot fix: Resolve detection of python debugger (#16068) Co-authored-by: thomas <thomas@thomass-MacBook-Pro.local> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> (cherry picked from commit eae56ee) * fix(cloud): detect and ignore venv (#16056) Co-authored-by: Ethan Harris <ethanwharris@gmail.com> (cherry picked from commit 3b323c8) * version 1.8.5 * update chlog Co-authored-by: thomas chaton <thomas@grid.ai> Co-authored-by: Carlos Mocholí <carlossmocholi@gmail.com> Co-authored-by: Akihiro Nitta <nitta@akihironitta.com> Co-authored-by: Adrian Wälchli <aedu.waelchli@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Huy Đỗ <56794124+Al3xDo@users.noreply.github.com> Co-authored-by: Ethan Harris <ethanwharris@gmail.com> Co-authored-by: Luca Furst <rlfurst@gmail.com> Co-authored-by: Yurij Mikhalevich <yurij@grid.ai> Co-authored-by: Luca Antiga <luca.antiga@gmail.com>
🚀 Feature
I would like to be able to log the metrics of the best performing epoch at the end of training. This is especially useful for hyperparameter sweeps.
on_train_end
seems like a natural place for this.The text was updated successfully, but these errors were encountered: