This repository has been archived by the owner on Nov 3, 2023. It is now read-only.

Full Tensorboard metric titles #3534

Merged
merged 7 commits into from
Mar 18, 2021

Conversation

spencerp
Contributor

Patch description
The abbreviated metric titles are hard to get used to, but using longer metric identifiers makes the stdout print hard to parse.
This PR separates the metric identifier from the metric's display name, and adds a description for each metric. The title and description are shown in TensorBoard, while the abbreviated identifiers are still used in the logs.
It moves the source of truth for metric titles and descriptions from the docs to the code, and generates the metrics table from this.
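
A minimal sketch of the idea described above, not the code merged in this PR. The names `MetricDisplayData`, `METRICS_DISPLAY_DATA`, and `get_metric_display_data` appear in the review below; the table entries, the fallback behavior, and the `render_metric_table` helper are assumptions made for illustration.

```python
from typing import Dict, NamedTuple


class MetricDisplayData(NamedTuple):
    title: str
    description: str


# Illustrative entries only; the real table lives in parlai/core/metrics.py.
METRICS_DISPLAY_DATA: Dict[str, MetricDisplayData] = {
    "ppl": MetricDisplayData("Perplexity", "Illustrative description of perplexity."),
    "f1": MetricDisplayData("F1", "Illustrative description of F1."),
}


def get_metric_display_data(metric: str) -> MetricDisplayData:
    # Assumed fallback: unknown identifiers are shown under their own name.
    return METRICS_DISPLAY_DATA.get(
        metric,
        MetricDisplayData(title=metric, description="No description provided."),
    )


def render_metric_table(metrics: Dict[str, MetricDisplayData]) -> str:
    # Hypothetical helper: builds a docs table from the same dict, so the
    # code is the single source of truth for titles and descriptions.
    lines = ["| Metric | Title | Description |", "| --- | --- | --- |"]
    for name in sorted(metrics):
        data = metrics[name]
        lines.append(f"| `{name}` | {data.title} | {data.description} |")
    return "\n".join(lines)
```

The short identifier stays the dictionary key (and the log column name), while the title and description are only looked up where there is room to display them.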

There are a few things I don't have time to tackle right now that would be nice for a later PR:

  • Representing families of metrics more automatically in the source of truth dict (collapse rouge-* metrics and use appropriate name and description if any metrics matching that format are used).
  • Collapsing multiple metrics of the same family in the metrics table
  • Preserving the monospace formatting of the metrics in the docs table
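
One way the first follow-up item could look, as a rough sketch under stated assumptions: families are described by a regular expression plus title and description templates. `METRIC_FAMILIES`, `get_family_display_data`, and the template strings are hypothetical and not part of this PR.

```python
import re
from typing import List, NamedTuple, Optional, Pattern, Tuple


class MetricDisplayData(NamedTuple):
    title: str
    description: str


# Hypothetical family table: (pattern, title template, description template).
METRIC_FAMILIES: List[Tuple[Pattern[str], str, str]] = [
    (
        re.compile(r"rouge-(?P<suffix>.+)"),
        "ROUGE-{suffix}",
        "Illustrative description of the ROUGE-{suffix} metric.",
    ),
]


def get_family_display_data(metric: str) -> Optional[MetricDisplayData]:
    # Return display data for the first family whose pattern matches the
    # whole identifier, or None when the metric belongs to no known family.
    for pattern, title, description in METRIC_FAMILIES:
        match = pattern.fullmatch(metric)
        if match:
            suffix = match.group("suffix")
            return MetricDisplayData(
                title.format(suffix=suffix),
                description.format(suffix=suffix),
            )
    return None
```

A lookup function could try the exact-match dict first and fall back to the family table, so a single entry covers every `rouge-*` identifier.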

Testing steps
Ran a basic test run locally:

parlai train_model --task babi:task10k:1 --model-file ~/tmp/babi_memnn --batchsize 1 --num-epochs 5 --model memnn --no-cuda -tblog True

Verified the metrics were short and fit on the screen:
[Image: Screen Shot 2021-03-17 at 4 28 03 PM]

Verified that they were long and had descriptions on Tensorboard:
[Image: Screen Shot 2021-03-17 at 4 27 16 PM]

Built website:

cd docs; make html

Verified the metrics list was rendered:
[Image: Screen Shot 2021-03-17 at 4 50 19 PM]

Contributor

@stephenroller stephenroller left a comment

v cool

docs/source/generate_metric_list.py (outdated, resolved)
@@ -34,6 +34,100 @@
}
ALL_METRICS = DEFAULT_METRICS | ROUGE_METRICS | BLEU_METRICS | DISTINCT_METRICS

MetricDisplayData = namedtuple('MetricDisplayData', ('title', 'description'))
Contributor

how about a data class instead

Contributor Author

Hm, what's the advantage you see? I was using a namedtuple because this isn't really data that should be mutable.

If it's the pretty syntax you're looking for, though, I just found out there's a nicer syntax for namedtuples:

from typing import NamedTuple

class MetricDisplayData(NamedTuple):
    title: str
    description: str

Contributor

We used NamedTuples before and it gave me headaches, and now I stay away from them. That syntax is nicer and fine with me.

Contributor Author

I'm curious what headaches you encountered! I haven't used them a ton (just for small things like this here and there) so maybe there're headaches incoming I'm ignorant of.
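
For readers weighing the two options in this thread, here is a small sketch comparing a typed `NamedTuple` with a frozen dataclass. The class names `MetricTuple` and `MetricFrozen` are made up for the comparison. It only shows standard Python behavior and is not a guess at the specific headaches mentioned above.

```python
from dataclasses import dataclass
from typing import NamedTuple


class MetricTuple(NamedTuple):
    title: str
    description: str


@dataclass(frozen=True)
class MetricFrozen:
    title: str
    description: str


def rejects_assignment(obj) -> bool:
    # Both flavors raise AttributeError (FrozenInstanceError is a subclass)
    # when a field is reassigned.
    try:
        obj.title = "changed"
    except AttributeError:
        return True
    return False


as_tuple = MetricTuple("Perplexity", "Illustrative description.")
as_frozen = MetricFrozen("Perplexity", "Illustrative description.")

# A NamedTuple is still a tuple: it unpacks positionally and compares
# equal to a plain tuple with the same contents.
title, description = as_tuple
tuple_equals_plain = as_tuple == ("Perplexity", "Illustrative description.")

# A frozen dataclass is immutable too, but it is not a sequence and does
# not compare equal to a tuple.
frozen_equals_plain = as_frozen == ("Perplexity", "Illustrative description.")
```

Both give immutability; the main difference is that the `NamedTuple` also carries tuple semantics (indexing, unpacking, tuple equality), which may or may not be wanted.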

}


def get_metric_display_data(metric: str) -> MetricDisplayData:
Contributor

maybe as a utility of MetricDisplayData

Contributor Author

It's kind of nice to keep this functional, though, since there isn't any state we should be keeping around. Also, it needs access to METRICS_DISPLAY_DATA which I think makes more sense scoped to the namespace than a class. What's the advantage you see from putting it in MetricDisplayData?

Contributor

I was thinking of a classmethod (and maybe the global too), just to keep everything in a tight namespace.

Contributor

@stephenroller stephenroller Mar 18, 2021

lol prolly the global can't be in there so long as it's self-typed...

anyway saul goodman

Contributor Author

Yeah I think if we went that path, the metrics/titles/descriptions would live in a separate json/yaml file. And we'd have a separate function that loads them up as MetricDisplayDatas into a global. But then there'd be a disconnect between the source of truth and the global which is a little weird. I guess we could make MetricDisplayData a singleton and load them up on instantiation, but then we have to instantiate an object just to get this static list of strings which feels heavy.

Another way to keep them in a tight namespace would be to just create a metrics_list.py module.

Idk let me know if any of those options sound better, I see plenty of advantages and disadvantages to each so not super opinionated lol
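
For comparison, the classmethod variant floated in this thread might look roughly like the sketch below. The method name `for_metric`, the `_REGISTRY` global, and its entries are hypothetical. The registry has to be defined after the class body, because the class name is not yet bound while its own body is executing, which is the "self-typed" snag noted above.

```python
from typing import Dict, NamedTuple


class MetricDisplayData(NamedTuple):
    title: str
    description: str

    @classmethod
    def for_metric(cls, metric: str) -> "MetricDisplayData":
        # _REGISTRY is resolved at call time, so it can be defined below.
        return _REGISTRY.get(metric, cls(metric, "No description provided."))


# Must come after the class: the values are instances of the class itself.
_REGISTRY: Dict[str, MetricDisplayData] = {
    "ppl": MetricDisplayData("Perplexity", "Illustrative description."),
}
```

This keeps the lookup under the class's namespace, but the data still lives at module scope, so it is mostly a naming preference over the plain function.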

parlai/core/metrics.py (outdated, resolved)
@spencerp spencerp merged commit d016e58 into master Mar 18, 2021
@spencerp spencerp deleted the trun-len-2 branch March 18, 2021 21:17