
Added missing metrics when logging on tensorboard (#1298) #1299

Draft · wants to merge 1 commit into base: master

Conversation

@rogierz commented Jan 26, 2023

Co-authored-by: Riccardo Sepe pastarick@users.noreply.github.com
Co-authored-by: Francesco Scalera francescoscalera99@users.noreply.github.com

Added missing metrics when logging on tensorboard (#1298)

Description

With this change, both the hparam_dict and the metric_dict are stored in TensorBoard.
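
For context, here is a minimal sketch of how hyperparameters and metrics are typically passed to the SB3 logger via HParam in a custom callback, closely following the example in the SB3 TensorBoard documentation; the hyperparameter names and the zero metric values are illustrative:

```python
from stable_baselines3 import A2C
from stable_baselines3.common.callbacks import BaseCallback
from stable_baselines3.common.logger import HParam


class HParamCallback(BaseCallback):
    """Logs hyperparameters and metrics at the start of training."""

    def _on_training_start(self) -> None:
        hparam_dict = {
            "algorithm": self.model.__class__.__name__,
            "learning rate": self.model.learning_rate,
            "gamma": self.model.gamma,
        }
        # Metrics referenced here define which tags appear in the HPARAMS tab;
        # what happens to these values once they reach TensorBoard is the
        # subject of this PR.
        metric_dict = {
            "rollout/ep_len_mean": 0,
            "train/value_loss": 0.0,
        }
        self.logger.record(
            "hparams",
            HParam(hparam_dict, metric_dict),
            exclude=("stdout", "log", "json", "csv"),
        )

    def _on_step(self) -> bool:
        return True


model = A2C("MlpPolicy", "CartPole-v1", tensorboard_log="./a2c_tensorboard/")
model.learn(total_timesteps=10_000, callback=HParamCallback())
```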

Motivation and Context

The TensorBoard library allows storing both hparam_dict and metric_dict; this PR changes the SB3 implementation so that it is fully compatible with TensorBoard.
The problem is described in issue #1298.
Closes #1298.

  • I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Documentation (update in the documentation)

Checklist

  • I've read the CONTRIBUTION guide (required)
  • I have updated the changelog accordingly (required).
  • My change requires a change to the documentation.
  • I have updated the tests accordingly (required for a bug fix or a new feature).
  • I have updated the documentation accordingly.
  • I have opened an associated PR on the SB3-Contrib repository (if necessary)
  • I have opened an associated PR on the RL-Zoo3 repository (if necessary)
  • I have reformatted the code using make format (required)
  • I have checked the codestyle using make check-codestyle and make lint (required)
  • I have ensured make pytest and make type both pass. (required)
  • I have checked that the documentation builds using make doc (required)

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line

@araffin (Member) commented Feb 15, 2023

@timothe-chaumont could you review/test this one?

@timothe-chaumont (Contributor) replied:

> @timothe-chaumont could you review/test this one?

Yep, I'll have a look at it by the end of the week :)

@timothe-chaumont (Contributor) commented Feb 20, 2023

Thank you @rogierz.

The changes proposed in this PR follow the implementation in the PyTorch TensorBoard writer, and if added to SB3 they would have the following impact (a short standalone sketch of the underlying writer call follows the list):

  • When custom metrics are passed through metric_dict and are not logged anywhere else, they will also be displayed in the TIME SERIES and SCALARS tabs (as a graph with a single point, see screenshot below).

    [Screenshot: the custom metric appears in the SCALARS tab as a graph with a single point]
  • For metrics that are logged separately (e.g. rollout/ep_len_mean) and also mentioned in HParam's metric_dict, the value given will be added to the metric data (i.e. adding a fake data point).

    For example, in the screenshot below, the value 0 in dark blue has been logged when calling logger.record("hparams", HParam(...)) and does not correspond to a real value.

    [Screenshot: a fake data point (value 0, in dark blue) added to an existing metric curve]
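
For reference, a small standalone sketch (plain PyTorch, not SB3 code) of the SummaryWriter.add_hparams behavior that this PR would mirror; the log directory and the values are illustrative:

```python
from torch.utils.tensorboard import SummaryWriter

# Illustrative log directory
writer = SummaryWriter("runs/hparams_demo")

# add_hparams stores the hyperparameters for the HPARAMS tab *and* also
# writes each metric_dict entry as a scalar. That is why a custom metric
# shows up as a single-point graph, and why an extra (fake) data point is
# appended to metrics that are already logged elsewhere.
writer.add_hparams(
    hparam_dict={"algorithm": "A2C", "learning_rate": 3e-4, "gamma": 0.99},
    metric_dict={"custom_metric": 1.0, "rollout/ep_len_mean": 0.0},
)
writer.close()
```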

Given these two side-effects, we can either:

  1. Use @rogierz's proposal and add a warning in the documentation,
  2. Update the documentation to make it clearer, but don't modify the logger code.
    e.g. Add a warning saying that custom metrics must be logged, and add self.logger.record("custom_metric", 1.0) in the example (_on_training_start or _on_step).
  3. Modify the HParam class to ask for metric names only (without values):
     HParam(hparam_dict, metric_list=['train/value_loss', 'custom_metric'])

hence forcing users to log custom metrics separately with:

logger.record("custom_metric", 1.0)

@araffin (Member) commented Mar 2, 2023

@timothe-chaumont thanks for reviewing.

> Modify the HParam class to ask for metric names only (without values):

Yes, that looks like a better fix, but hparams asks for a non-empty metric dict (according to the SB3 docstring), so would you assign zero values to all of them?
(I'm also wondering how a metric can be non-zero/have a meaningful value at the very start of training.)

@timothe-chaumont (Contributor) replied:

> Yes, that looks like a better fix, but hparams asks for a non-empty metric dict (according to the SB3 docstring), so would you assign zero values to all of them?

Yes, I think the value given here has no impact on the logged data, so we can use 0.

> (I'm also wondering how a metric can be non-zero/have a meaningful value at the very start of training.)

I guess that some metrics (e.g. ep_len_mean) wouldn't be exactly zero at the very start of training, but I agree that a meaningful use of this value must be rare. As it might be confusing, we might as well remove it :)

@timothe-chaumont (Contributor) commented:

@rogierz do you want to add this change:

> Modify the HParam class to ask for metric names only (without values)
> HParam(hparam_dict, metric_list=['train/value_loss', 'custom_metric'])

Or should I do it?

@rogierz (Author) commented Mar 27, 2023 via email

Successfully merging this pull request may close these issues:

  • [Bug]: Missing metrics when logging hyperparameters on tensorboard (#1298)