Design a public Python API for the hparams plugin #1998
Thanks for this awesome plugin; it's a really useful addition to TensorBoard. After using it in my project, I have some comments:
Just some small comments; amazing work overall 🥇
Hi @omoindrot—thanks for writing in, and glad to hear that you like it!
I was wondering the same thing. The parallel coordinates view and the […]

This should already be supported (see `tensorboard/plugins/hparams/api.proto`, lines 90 to 96, at 25dc3e8), though our tutorials don’t use it and it’s only mentioned in a proto […]

Great point; I’ve opened #2014 to track this.
@wchargin, can we add an int64 dtype to `DataType` (in api.proto), as requested by @omoindrot?
Summary: The existing demo in `hparams_demo.py` properly exercises the hparams functionality, but isn’t actually related to machine learning at all. This commit introduces a demo that trains a family of MNIST models. Some hyperparameters are critically important, while others end up having effectively no impact. The experiment includes categorical, discrete, and real-valued hyperparameters.

The resulting parallel coordinates view looks something like this:

![Screenshot of the parallel coordinates view][parcoords]

It’s immediately obvious that the `optimizer` parameter is in fact a perfect separator for both accuracy and loss, whereas the influence of the other hyperparameters is less clear. Filtering to the Adam-optimized sessions only, we can look at the scatter plot matrix:

![Screenshot of the scatter plot matrix for `optimizer="adam"`][matrix]

Here, it’s easier to see that `dropout` and `dense_layers` appear to have negligible impact, while `conv_layers` and `conv_kernel_size` are each significant.

[parcoords]: https://user-images.githubusercontent.com/4317806/56250030-cf26d180-6062-11e9-9b46-daf29d8c0229.png
[matrix]: https://user-images.githubusercontent.com/4317806/56250052-e49bfb80-6062-11e9-911c-9bf4c868ef58.png

This demo uses only the existing hparams APIs, even when they’re a bit awkward. We still need to manually manage file writers, construct protos (and `ListValue`s in particular…), and duplicate domain information across the experiment summary and our ad hoc tuner. Also, we can’t specify integer-valued hparams over ranges, because the `Interval` type applies only to real-valued hparams. As we improve these APIs (#1998), we can improve this demo! :-)

Test Plan: Tested with `tf-nightly-2.0-preview==2.0.0.dev20190416`, Python 2 and 3.

wchargin-branch: hparams-ml-demo
Summary: This change introduces `HParam`, `Metric`, and `Experiment` classes, which represent their proto counterparts in a more Python-friendly way. It similarly includes a `Domain` class hierarchy, which does not correspond to a specific proto message, but rather unifies the domain variants defined on the `HParamInfo` proto. The design is roughly as in the original sketch of #1998.

The primary benefit of this change is that having first-class domains enables clients to reuse the domain information for both the experiment summary and the underlying tuning algorithm. We don’t provide a method to do this out of the box, because we don’t actually provide any tuners at this time, but it’s easy to write (e.g.) a `sample_uniform` function like the one included in this commit. Then, sampling is as easy as

```python
hparams = {h: sample_uniform(h.domain, rng) for h in HPARAMS}
```

It is also now more convenient to reference hparam values such that static analysis can detect potential typos, because the `HParam` objects themselves can be declared as constants and used as keys in a dict. Writing `hparams["dropuot"]` fails at runtime, but `hparams[HP_DROPUOT]` fails at lint time.

As a pleasant bonus, hparam definitions are now more compact, fitting on one line instead of several. The demo code has net fewer lines.

Manual summary writer management is still required. A future change will introduce a Keras callback to reduce this overhead.

Test Plan: Some unit tests included, and the demo still works.

wchargin-branch: hparams-structured-api
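For illustration, a minimal `sample_uniform` along the lines described might look like the sketch below. It assumes the `Discrete`, `IntInterval`, and `RealInterval` domain classes from the `hp` module and a `random.Random` instance; it is a sketch, not necessarily the exact helper from the commit.

```python
import random

from tensorboard.plugins.hparams import api as hp


def sample_uniform(domain, rng):
    """Sample a value uniformly at random from an hparam domain (sketch)."""
    if isinstance(domain, hp.Discrete):
        return rng.choice(domain.values)
    if isinstance(domain, hp.IntInterval):
        return rng.randint(domain.min_value, domain.max_value)
    if isinstance(domain, hp.RealInterval):
        return rng.uniform(domain.min_value, domain.max_value)
    raise TypeError("unknown domain type: %r" % (domain,))


# Usage, mirroring the one-liner above:
HPARAMS = [
    hp.HParam("optimizer", hp.Discrete(["adam", "sgd"])),
    hp.HParam("dropout", hp.RealInterval(0.1, 0.4)),
]
rng = random.Random(0)
hparams = {h: sample_uniform(h.domain, rng) for h in HPARAMS}
```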
Summary: A new `hparams.api.KerasCallback` class simplifies client APIs by writing session start and end summaries automatically, with a dict of hparams provided by the client. Cf. #1998.

This only works in TensorFlow eager mode. The stock `TensorBoard` Keras callback works in both eager and graph modes, but to do so it must use TensorFlow-internal symbols (`eager_mode` and `executing_eagerly` on the context object, which we do not have access to).

Test Plan: Unit tests included. The demo still works, generating valid data.

wchargin-branch: hparams-keras-callback
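As a rough sketch of the usage this enables (not an excerpt from the commit; the `logdir` value, the toy model, and the MNIST data are placeholders, and eager mode is assumed per the note above):

```python
import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

logdir = "logs/session-1"  # placeholder run directory
hparams = {"num_units": 32, "dropout": 0.2, "optimizer": "adam"}

(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train / 255.0

model = tf.keras.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(hparams["num_units"], activation="relu"),
    tf.keras.layers.Dropout(hparams["dropout"]),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(
    optimizer=hparams["optimizer"],
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

model.fit(
    x_train,
    y_train,
    epochs=1,
    callbacks=[
        tf.keras.callbacks.TensorBoard(logdir),  # writes metric summaries
        hp.KerasCallback(logdir, hparams),       # writes session start/end summaries
    ],
)
```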
Summary: The `Experiment` object bundled `HParam`s and `Metric`s with some metadata that’s not actually used in the current UI. We don’t think that it pulls its conceptual weight, so this commit replaces it with a direct summary-writing operation.

This function will soon be extracted into a `summary_pb2` module, as part of a [larger plan to refactor the `api` module][1]. Making this change first minimizes churn in the demo code.

[1]: #2139 (comment)

Cf. #1998.

Test Plan: Unit tests modified appropriately, and the demo still works.

wchargin-branch: hparams-experiment-writing
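In the API as eventually published, the direct experiment-summary-writing operation surfaces as `hp.hparams_config` (the name at the time of this particular commit may have differed); a minimal sketch:

```python
import tensorflow as tf
from tensorboard.plugins.hparams import api as hp

HP_DROPOUT = hp.HParam("dropout", hp.RealInterval(0.1, 0.4))
HP_OPTIMIZER = hp.HParam("optimizer", hp.Discrete(["adam", "sgd"]))

# Write the experiment summary (hparam and metric definitions) once, at the
# top of the log directory, instead of constructing an Experiment object.
with tf.summary.create_file_writer("logs/hparam_tuning").as_default():
    hp.hparams_config(
        hparams=[HP_DROPOUT, HP_OPTIMIZER],
        metrics=[hp.Metric("accuracy", display_name="Accuracy")],
    )
```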
A few people have expressed confusion about the term “session” as used […]. I propose omitting “session” from new API symbols where feasible, and […]

[…] and, I think, also suggests the correct meaning. “Session groups” would most literally become “trial groups”, though this […]. Even better, though, I think that we can avoid asking for session group […]
Some thoughts: even having a separate “trial”/“trial group” concept seems a little unwieldy to me. What if we just called them “runs”, to use the terminology from the rest of TensorBoard? The mapping is not exact, but I think in the common cases the concepts do align, and it seems better to me to share terminology in the common cases than to introduce new terms just to be slightly more exact in the less common cases.

In the long run, it would make sense for “runs” to be more conceptually defined anyway: the “subdirectory with event files in it” definition doesn’t apply to database-first summaries, for example.

Off the top of my head, the cases where trials don’t exactly match up with runs are: […]
If I may throw in my opinion: I have been trying out the API, and I totally agree with @wchargin about the naming. The “sessions” term is very confusing. “Trial groups” might not be perfect, but it’s more intuitive.

On another note, I will be happy to start contributing to this plugin in the near future, since I am planning to make use of it once 1.14 gets released :)
Hey @moritzmeister: thanks a ton for trying out the new APIs. This […] We can certainly add a […]
Excellent! Looking forward to it. :-)
Summary: Resolves #2440. See #1998 for discussion.

Test Plan: The hparams demo still does not specify trial IDs (intentionally, as this is the usual path). But apply the following patch—

```diff
diff --git a/tensorboard/plugins/hparams/hparams_demo.py b/tensorboard/plugins/hparams/hparams_demo.py
index ac4e762b..d0279f27 100644
--- a/tensorboard/plugins/hparams/hparams_demo.py
+++ b/tensorboard/plugins/hparams/hparams_demo.py
@@ -63,7 +63,7 @@ flags.DEFINE_integer(
 )
 
 flags.DEFINE_integer(
     "num_epochs",
-    5,
+    1,
     "Number of epochs per trial.",
 )
@@ -160,7 +160,7 @@ def model_fn(hparams, seed):
     return model
 
 
-def run(data, base_logdir, session_id, hparams):
+def run(data, base_logdir, session_id, trial_id, hparams):
     """Run a training/validation session.
 
     Flags must have been parsed for this function to behave.
@@ -179,7 +179,7 @@ def run(data, base_logdir, session_id, hparams):
         update_freq=flags.FLAGS.summary_freq,
         profile_batch=0,  # workaround for issue #2084
     )
-    hparams_callback = hp.KerasCallback(logdir, hparams)
+    hparams_callback = hp.KerasCallback(logdir, hparams, trial_id=trial_id)
     ((x_train, y_train), (x_test, y_test)) = data
     result = model.fit(
         x=x_train,
@@ -235,6 +235,7 @@ def run_all(logdir, verbose=False):
             data=data,
             base_logdir=logdir,
             session_id=session_id,
+            trial_id="trial-%d" % group_index,
             hparams=hparams,
         )
```

—and then run `//tensorboard/plugins/hparams:hparams_demo`, and observe that the HParams dashboard renders a “Trial ID” column with the specified IDs:

![Screenshot of new version of HParams dashboard][1]

[1]: https://user-images.githubusercontent.com/4317806/61491024-1fb01280-a963-11e9-8a47-35e0a01f3691.png

wchargin-branch: hparams-trial-id
@moritzmeister: Long delay, I know, but: I’ve just added a […]. I’m going to close this issue, since its original purpose has been fulfilled.
@wchargin Could we also include the run name (the folder name) in the summary table? That way you could easily link the data in the Scalars tab with the HParams tab.
@joshlk: There’s not a one-to-one correspondence; a single trial can contain multiple runs.

Do note that you can check the “Show Metrics” box in the table view to see each trial’s metric values directly in the table.
@wchargin Do you have an example of multiple runs per trial? How would that work? Do you take the average of the metrics in the display?
@joshlk: Yes, the default is to take the average of each metric across the runs in a trial.
The hparams demo script emits two runs per trial. For a trivial standalone example:

```python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from tensorboard.plugins.hparams import api as hp
import tensorflow.compat.v2 as tf

__import__("tensorflow").compat.v1.enable_eager_execution()

with tf.summary.create_file_writer("a").as_default():
    hp.hparams({"learning_rate": 0.2})
    tf.summary.scalar("loss", 0.1, step=0)
    tf.summary.scalar("accuracy", 0.8, step=0)

with tf.summary.create_file_writer("b").as_default():
    hp.hparams({"learning_rate": 0.2})
    tf.summary.scalar("loss", 0.3, step=0)
    tf.summary.scalar("accuracy", 0.9, step=0)
```

Then, launch TensorBoard against the parent directory: the HParams dashboard shows a single trial containing both runs, with the metrics averaged across them.
@wchargin Thanks, that’s a really useful feature! 👍

I’ve been using the HParams feature for a couple of days now, and it would still be really great to link the Trial ID to the runs, as otherwise the HParams tab is totally disconnected from Scalars. The HParams tab doesn’t have the same overview of all runs that the Scalars tab does. When I look at the results, I start from the Scalars tab and identify the runs that are doing well, but then it’s very difficult for me to find those runs in the HParams tab. Maybe it could show the run IDs when you select “Show Metrics”, or you could filter by run ID on the right side.
@joshlk: Yep; we generally agree. The original vision was that the […]

Also filed #2464 to track aggregation support.
@wchargin Great, thanks for the work!
Summary: Active development of these APIs has completed, and they were published in the 1.14 release notes. Closes #1998.

wchargin-branch: hparams-public-api
To use the hparams dashboard, users currently have to manually construct
hparams-specific protocol buffers and send them to file
writers (see the tutorial notebook for an example; it takes a few
dozen lines of Python code). The protobuf bindings are not particularly
idiomatic Python, and are less than pleasant to use. We should
investigate possible simplifications to this API.
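To make the pain point concrete, the proto-based flow looks roughly like the sketch below. It is reconstructed from memory rather than copied from the tutorial notebook; treat the exact signature of the `hparams_summary.experiment_pb` helper as approximate.

```python
from google.protobuf import struct_pb2

from tensorboard.plugins.hparams import api_pb2
from tensorboard.plugins.hparams import summary as hparams_summary

# Discrete domains must be spelled out by hand as protobuf ListValues.
optimizers = struct_pb2.ListValue()
optimizers.extend(["adam", "sgd"])

experiment_summary = hparams_summary.experiment_pb(
    hparam_infos=[
        api_pb2.HParamInfo(
            name="optimizer",
            type=api_pb2.DATA_TYPE_STRING,
            domain_discrete=optimizers,
        ),
        api_pb2.HParamInfo(
            name="dropout",
            type=api_pb2.DATA_TYPE_FLOAT64,
            domain_interval=api_pb2.Interval(min_value=0.1, max_value=0.4),
        ),
    ],
    metric_infos=[
        api_pb2.MetricInfo(
            name=api_pb2.MetricName(tag="accuracy"),
            display_name="Accuracy",
        ),
    ],
)
```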
For example, we could streamline the construction of the `ListValue`s for the discrete domains by allowing the user to pass Python lists, and we can also infer the data types from the types of the elements of the domain* (which also lets us require that the list is homogeneously typed):
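(The original code sketch from the issue is not reproduced above; as a rough stand-in, using the `HParam`/`Discrete`/`RealInterval`/`Metric` names that later commits on this issue introduce, the simplified definitions read along these lines:)

```python
from tensorboard.plugins.hparams import api as hp

# Discrete domains are plain Python lists; dtypes are inferred from the
# (homogeneously typed) elements. Real-valued domains are intervals.
HP_OPTIMIZER = hp.HParam("optimizer", hp.Discrete(["adam", "sgd"]))
HP_NUM_UNITS = hp.HParam("num_units", hp.Discrete([16, 32, 64]))
HP_DROPOUT = hp.HParam("dropout", hp.RealInterval(0.1, 0.4))

METRICS = [hp.Metric("accuracy", display_name="Accuracy")]
```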
This is just a sketch, but it’s already three times shorter than the
current demo without (imho) any loss of utility.
* It’s fine to prohibit empty domains here. If a hyperparameter has an empty domain, then the whole hyperparameter space is empty, so there can be no runs; thus, allowing empty domains is not actually useful.