-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
hparams: write hparams_config
summary directly
#2139
Conversation
Summary: This change introduces `HParam`, `Metric`, and `Experiment` classes, which represent their proto counterparts in a more Python-friendly way. It similarly includes a `Domain` class hierarchy, which does not correspond to a specific proto message, but rather unifies the domain variants defined on the `HParamInfo` proto. The design is roughly as in the original sketch of #1998. The primary benefit of this change is that having first-class domains enables clients to reuse the domain information for both the experiment summary and the underlying tuning algorithm. We don’t provide a method to do this out of the box, because we don’t actually provide any tuners at this time, but it’s easy to write (e.g.) a `sample_uniform` function like the one included in this commit. Then, sampling is as easy as ```python hparams = {h: sample_uniform(h.domain, rng) for h in HPARAMS} ``` It is also now more convenient to reference hparam values such that static analysis can detect potential typos, because the `HParam` objects themselves can be declared as constants and used as keys in a dict. Writing `hparams["dropuot"]` fails at runtime, but `hparams[HP_DROPUOT]` fails at lint time. As a pleasant bonus, hparam definitions are now more compact, fitting on one line instead of several. The demo code has net fewer lines. Manual summary writer management is still required. A future change will introduce a Keras callback to reduce this overhead. Test Plan: Some unit tests included, and the demo still works. wchargin-branch: hparams-structured-api
Summary: A new `hparams.api.KerasCallback` class simplifies client APIs by writing session start and end summaries automatically, with a dict of hparams provided by the client. This only works in TensorFlow eager mode. The stock `TensorBoard` Keras callback works in both eager and graph modes, but to do so it must use TensorFlow-internal symbols (`eager_mode` and `executing_eagerly` on the context object, which we do not have access to). Test Plan: Unit tests included. The demo still works, generating valid data. wchargin-branch: hparams-keras-callback
Summary: This abstracts over the last `tf.summary.experimental.write_raw_pb` call in the demo. The new `hp.experiment(experiment)` function works in both eager and graph modes. Test Plan: Unit tests included; tested on 1.x and 2.0 nightlies. The demo continues to work properly on the latest 2.0 nightly. wchargin-branch: hparams-experiment-writing
tf.summary.experimental.write_raw_pb(experiment_string, step=0) | ||
base_writer.flush() | ||
base_writer.close() | ||
hp.experiment(experiment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: The awkwardness of hp.experiment(experiment)
, or the equivalent
but worse hp.experiment(hp.Experiment(…))
, is why I’d originally
proposed having experiment.write()
. I understand why you don’t want
that. If you have suggestions for improvement here, I’m all ears.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern that would be most consonant with what we do for other plugins would be for hparams to have a summary_v2.py containing a function like
def hparams(experiment):
if not isinstance(experiment, Experiment):
raise TypeError("type(experiment) is %r" % (type(experiment),))
raw_pb = experiment.summary_pb().SerializeToString()
summary_scope = (
getattr(tf.summary.experimental, 'summary_scope', None) or
tf.summary.summary_scope)
with summary_scope('hparams_summary'):
return tf.summary.experimental.write_raw_pb(raw_pb)
To be like the others, that would be exposed under tb.summary.v2. We could also expose it under hparams.api
I suppose, though perhaps with the name summary()
or write()
or something instead.
Then the callsite would look like either tb.summary.hparams(experiment)
or hp.summary(experiment)
- something like that.
Down the road in the writer-method style of summary writing, this would become writer.hparams(experiment)
.
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hesitate to just call the endpoints .hparams
, because hparams really
has three kinds of summaries, and of the three kinds of summaries, the
name .hparams
sounds more like session_start_pb
(logging your actual
hparam values) than experiment_pb
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I hesitate to just call the endpoints .hparams
, because hparams really
has three kinds of summaries, and of the three kinds of summaries, the
name .hparams
sounds more like session_start_pb
(logging your actual
hparam values) than experiment_pb
.
I guess we could provide summary_v2
with hparams
, session_start
,
and session_end
(the last two for non-Keras users), but (a) these
would look more out of place as writer
methods (“session start? what
session?”), and (b) that’s not great if we’re trying to get rid of the
confusing “session” terminology.
I’ll try to think about this a bit more and get back to it tomorrow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair, but IMO the pretty strong reason to use hparams
would be consistency with all the other plugins, which use their plugin name as (at least a prefix to) the summary op name when they have a dedicated summary op. And I assume we would consider writer.hparams_experiment()
to be too unwieldy :)
In fairness, I think part of the difficulty here is that hparams sort of invented summary formats for concepts that are more properly broader than just the hparams plugin; "experiment" and "session" are constructs that go beyond just setting hparams. I don't expect us to resolve all of that in the short term, but what I'd maybe suggest is that we can figure out a subset of the full end-state API that we're pretty comfortable with now (and which still improves on the original API). We don't necessarily need to expose a full Experiment construct with a name and user when there is nothing in TensorBoard today to actually visualize that data.
For example, a subset of the API could be something like:
hparams_config(hparams, metrics)
to log the hparam and metric definitions (optional and if not logged, TB will infer domains and use all scalar values as metrics)hparams(hparams)
to log the actual values
To me that seems like it already covers virtually all of the functionality that's actually usable today in the hparams dashboard; the main exception is the session end status value, which doesn't seem like a big deal to me (and there's still the "plumbing" API for now if folks who aren't using the keras callback really want that). Even the session group name doesn't add much value since session group aggregation isn't implemented on the frontend.
This would buy us a little more time to fine tune how we'd like to capture experiment and session information in the generalized summary context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, I can be convinced that Experiment
isn’t worth its conceptual
weight, and that we should remove it from the public API in favor of
directly writing hparams_config(hparams, metrics)
.
Instead of complicating the module and dependency structure more by
introducing both api
and summary_v2
, I would like to rename api
to summary_v2
, containing:
hparams(hparams)
to log a session start summary;hparams_config(hparams, metrics)
to log an experiment summary;HParam
,Metric
, and domain descriptors, as currently;KerasCallback
, as currently.
But KerasCallback
has a hard (module-time) dependency on tf.keras
,
so this would preclude summary_v2.hparams
in notf builds, which is not
great. The two options that I see right now are:
-
omit the callback from
summary_v2
and instead put it inapi
(which would need to exist after all), and also letapi
forward
the othersummary_v2
symbols; or -
move the callback into a separate
callback
module.
Option (1) has the downside that I can’t rename api
to __init__
as
I’d hoped, because you again wouldn’t be able to import summary_v2
without depending on tf.keras
. Option (2) has the downside of being
ugly.
I guess option (1) is less bad. At least in that case, the vast majority
of users could just use api
and not worry about summary_v2
unless
they need a notf build. I’ll implement it and see if I can come up with
anything better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed offline. We’re comfortable with this. Explicitly:
hparams.api
:- is the one-stop user-facing shop, to be imported
as hp
- re-exports
*
fromhparams.summary_v2
- re-exports
*
fromhparams.callback
- transitively depends on
tf.keras
- is the one-stop user-facing shop, to be imported
hparams.summary_v2
:- contains
hparams(concrete_hparam_values)
- contains
hparams_config(hparams, metrics)
- contains
HParam
,Metric
, and domain descriptors - depends only on
api_pb2
,summary
, etc. - is the only public API for notf users
- contains
hparams.callback
:- contains
KerasCallback
- depends on
summary_v2
- depends on
tf.keras
- contains
- nothing is added to
tb.summary.v2.*
- nothing is currently added to
tf.summary
; in the future, either
tf.summary
or a unified TensorBoardwriter
may depend on
summary_v2
but notcallback
to avoid atf.keras
dependency
with tf.compat.v1.Session() as sess: | ||
with tf.compat.v2.summary.create_file_writer(logdir).as_default() as w: | ||
with tf.compat.v2.summary.record_if(True): | ||
tf.contrib.summary.initialize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: I couldn’t find a way to get the V1 test case to work without
calling into tf.contrib
. If there is one, I’d love to hear it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sess.run(w.init())
should do the trick, if the problem was the writer not being initialized?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, that does it. Thanks!
tf.summary.experimental.write_raw_pb(experiment_string, step=0) | ||
base_writer.flush() | ||
base_writer.close() | ||
hp.experiment(experiment) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pattern that would be most consonant with what we do for other plugins would be for hparams to have a summary_v2.py containing a function like
def hparams(experiment):
if not isinstance(experiment, Experiment):
raise TypeError("type(experiment) is %r" % (type(experiment),))
raw_pb = experiment.summary_pb().SerializeToString()
summary_scope = (
getattr(tf.summary.experimental, 'summary_scope', None) or
tf.summary.summary_scope)
with summary_scope('hparams_summary'):
return tf.summary.experimental.write_raw_pb(raw_pb)
To be like the others, that would be exposed under tb.summary.v2. We could also expose it under hparams.api
I suppose, though perhaps with the name summary()
or write()
or something instead.
Then the callsite would look like either tb.summary.hparams(experiment)
or hp.summary(experiment)
- something like that.
Down the road in the writer-method style of summary writing, this would become writer.hparams(experiment)
.
What do you think?
base_writer.flush() | ||
base_writer.close() | ||
hp.experiment(experiment) | ||
base_writer.close() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a little weird IMO to .close()
a context-manager-returned object within the context. I might have expected this to fail actually since the context is supposed to .flush()
the writer at the context exit, but I guess .flush()
is a no-op on a writer that's already closed.
I think you can just leave out the .close()
. The important part in terms of functionality is .flush()
which is already implicit in the context exit (or could be made explicit). There isn't a real need to explicitly close the writer here since it will be closed when it goes out of scope anyway, and I believe that should actually happen at context exit anyway if you drop the as
clause.
Ditto in the tests that also call .close()
like this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got it; I wasn’t sure what was idiomatic. Thanks.
else: | ||
self.fail("No summary data found") | ||
|
||
@test_util.run_v2_only("Requires eager summary writing semantics.") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment on #2130 re: how you could run this in v1 also.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do.
with tf.compat.v1.Session() as sess: | ||
with tf.compat.v2.summary.create_file_writer(logdir).as_default() as w: | ||
with tf.compat.v2.summary.record_if(True): | ||
tf.contrib.summary.initialize() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sess.run(w.init())
should do the trick, if the problem was the writer not being initialized?
logdir = os.path.join(self.get_temp_dir(), "logs") | ||
with tf.compat.v1.Session() as sess: | ||
with tf.compat.v2.summary.create_file_writer(logdir).as_default() as w: | ||
with tf.compat.v2.summary.record_if(True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line shouldn't be necessary with the V2 summary API.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, interesting. I guess always_record_summaries
was required in the
V1 contrib
APIs, and so I assumed that record_if
was necessary here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's a feature of the TF 2.0 APIs that you no longer have to do that :)
if event.WhichOneof("what") != "summary": | ||
continue | ||
self.assertEqual(event.summary, summary_pb) | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit, but you can simplify this to just return
here and then skip the else
clause and self.fail()
unconditionally. It's one line shorter and avoids introducing a slightly more novel language feature when there's a more standard alternative.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(No longer needed.)
wchargin-branch: hparams-experiment-writing # Conflicts: # tensorboard/plugins/hparams/BUILD # tensorboard/plugins/hparams/api.py # tensorboard/plugins/hparams/api_test.py # tensorboard/plugins/hparams/hparams_demo.py
wchargin-branch: hparams-experiment-writing
wchargin-branch: hparams-experiment-writing
hparams_config
summary directly
Summary: Fixes an unintentional omission from #2130: `api.py` (not just its test) actually does currently depend on TensorFlow. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly reinstate `api` as a wrapper over `summary_v2` (in a separate commit to help Git track the moves). Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly extract the Keras callback out of `summary_v2` such that `summary_v2` does not need to depend on TensorFlow. It will continue to be exposed from `api`. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: The `summary_v2` module no longer has any fundamental dependency on TensorFlow. Its implementation still depends on `summary`, which does directly reference the TensorFlow proto copies. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Ready for re-review when you get a chance. |
tensorboard/plugins/hparams/api.py
Outdated
class Experiment(object): | ||
"""A top-level experiment description. | ||
def hparams_config(hparams, metrics, time_created_secs=None): | ||
"""Write a top-level experiment description. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's maybe worth tweaking this language a little now to reflect the new naming?
wchargin-branch: hparams-experiment-writing
wchargin-branch: hparams-experiment-writing
Summary: Fixes an unintentional omission from #2130: `api.py` (not just its test) actually does currently depend on TensorFlow. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly reinstate `api` as a wrapper over `summary_v2` (in a separate commit to help Git track the moves). Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly extract the Keras callback out of `summary_v2` such that `summary_v2` does not need to depend on TensorFlow. It will continue to be exposed from `api`. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: The `summary_v2` module no longer has any fundamental dependency on TensorFlow. Its implementation still depends on `summary`, which does directly reference the TensorFlow proto copies. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: Fixes an unintentional omission from #2130: `api.py` (not just its test) actually does currently depend on TensorFlow. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly reinstate `api` as a wrapper over `summary_v2` (in a separate commit to help Git track the moves). Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: We’ll shortly extract the Keras callback out of `summary_v2` such that `summary_v2` does not need to depend on TensorFlow. It will continue to be exposed from `api`. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary: The `summary_v2` module no longer has any fundamental dependency on TensorFlow. Its implementation still depends on `summary`, which does directly reference the TensorFlow proto copies. Part of a reorganization to match the structure described here: <#2139 (comment)> wchargin-branch: hparams-api-refactor
Summary:
The
Experiment
object bundledHParam
s andMetric
s with somemetadata that’s not actually used in the current UI. We don’t think that
it pulls its conceptual weight, so this commit replaces it with a direct
summary-writing operation.
This function will soon be extracted into a
summary_pb2
module, aspart of a larger plan to refactor the
api
module. Making thischange first minimizes churn in the demo code.
Cf. #1998.
Test Plan:
Unit tests modified appropriately, and the demo still works.
wchargin-branch: hparams-experiment-writing