data: plumb experiment ID through runs and scalars #2580

wchargin · 2019-08-19T23:40:22Z

Summary:
This implements a rudimentary experiment selection mechanism for the
runs selector and scalars dashboard only, for the purposes of testing
data provider implementations. The experiment ID is stored as a query
parameter, which isn’t perfect because it precludes caching the big
TensorBoard HTML blob across experiments. In the long term, we can
consider refactoring to serve the big HTML blob from a static path
that’s fetched by the page.

The TensorBoard codebase still has some fragments of a previous
experimental data selector, which was partially removed in #2290, and so
parts of this diff look like changes but are functionally additions.

Test Plan:
Instrument the multiplexer data provider to log experiment IDs:

diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index ef602320..4c72d096 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider):
       return None

   def list_runs(self, experiment_id):
-    del experiment_id  # ignored for now
+    logger.warn("Listing runs for experiment %r", experiment_id)
     return [
         provider.Run(
             run_id=run,  # use names as IDs
@@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider):
     ]

   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
-    del experiment_id  # ignored for now
+    logger.warn("Listing scalars for experiment %r", experiment_id)
     run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name)
     result = {}
     if run_tag_filter is None:
@@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider):
   def read_scalars(
       self, experiment_id, plugin_name, downsample=None, run_tag_filter=None
   ):
+    logger.warn("Reading scalars for experiment %r", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
     # is already downsampled. We could downsample on top of the existing
     # sampling, which would be nice for testing.

Then launch TensorBoard with --generic_data=true and navigate to
http://localhost:6006/?experiment=foo; verify that all three varieties
of server logs are exclusively for the correct experiment ID.

Then, remove the query parameter, and ensure that TensorBoard still
works normally (with experiment ID the empty string).

wchargin-branch: data-experiment

@wchargin

Summary: This implements a rudimentary experiment selection mechanism for the runs selector and scalars dashboard only, for the purposes of testing data provider implementations. The experiment ID is stored as a query parameter, which isn’t perfect because it precludes caching the big TensorBoard HTML blob across experiments. In the long term, we can consider refactoring to serve the big HTML blob from a static path that’s fetched by the page. The TensorBoard codebase still has some fragments of a previous experimental data selector, which was partially removed in #2290, and so parts of this diff look like changes but are functionally additions. Test Plan: Instrument the multiplexer data provider to log experiment IDs: ```diff diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py index ef60232..4c72d096 100644 --- a/tensorboard/backend/event_processing/data_provider.py +++ b/tensorboard/backend/event_processing/data_provider.py @@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider): return None def list_runs(self, experiment_id): - del experiment_id # ignored for now + logger.warn("Listing runs for experiment %r", experiment_id) return [ provider.Run( run_id=run, # use names as IDs @@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider): ] def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None): - del experiment_id # ignored for now + logger.warn("Listing scalars for experiment %r", experiment_id) run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name) result = {} if run_tag_filter is None: @@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider): def read_scalars( self, experiment_id, plugin_name, downsample=None, run_tag_filter=None ): + logger.warn("Reading scalars for experiment %r", experiment_id) # TODO(@wchargin): Downsampling not implemented, as the multiplexer # is already downsampled. We could downsample on top of the existing # sampling, which would be nice for testing. ``` Then launch TensorBoard with `--generic_data=true` and navigate to <http://localhost:6006/?experiment=foo>; verify that all three varieties of server logs are exclusively for the correct experiment ID. Then, remove the query parameter, and ensure that TensorBoard still works normally (with experiment ID the empty string). wchargin-branch: data-experiment

wchargin · 2019-08-19T23:42:04Z

tensorboard/components/tf_backend/router.ts

    pluginsListing: () => string;
-    runs: () => string;
+    runs: (experiment?: string) => string;
    runsForExperiment: (id: tf_backend.ExperimentId) => string;


This machinery is unrelated; I’ll remove it in a future change.

I didn’t want to reuse it because it assumes some things about both
inputs (e.g., type ExperimentId = number) and outputs (the output of
the /experiment_runs route is not just a list of string names) that
may not be what we want.

wchargin-source: 51e9f37203db775b7ca7d0c8b80dc5de44947e32 wchargin-branch: data-experiment

nfelt · 2019-08-21T20:31:11Z

tensorboard/plugins/scalar/scalars_plugin.py

    tag = request.args.get('tag')
    run = request.args.get('run')
-    experiment = request.args.get('experiment')
+    experiment = request.args.get('experiment', '')


Any particular reason to have this default to empty instead of None?

The experiment ID is required for any calls to the data provider API
(even though our current implementation doesn’t strictly validate
this—after this PR, maybe it should), so it would have to be converted
from empty string to None at some point. It seemed cleanest to me to
reduce the space of input types as early as possible. No strong opinion.

nfelt · 2019-08-21T20:35:26Z

tensorboard/components/tf_backend/runsStore.ts


+    _getExperimentId() {
+      return (
+        new URLSearchParams(window.location.search).get('experiment') || ''


Could we expose this somewhere so that the scalars code can re-use it rather than replicating it? It could potentially be here although experimentsStore would perhaps be the best fit, even though this isn't strictly speaking the same as the old experiment data being served by that API.

That way it's easier to refactor later if we migrate away from a query parameter to e.g. embedding the experiment ID in the path.

Exposed tf_backend.getExperimentId(). It’s still a bit of a kludge,
but it’s one kludge instead of two. :-)

wchargin-source: 1387baa2ffc19f8733507e836a986fb73b4b7c4e wchargin-branch: data-experiment

@wchargin

Summary: This commit changes the frontend experiment ID plumbing from using the query string to using a path component: experiment URLs are now of the form `http://localhost:6006/experiment/123/`. This supersedes the short-term changes from #2580, #2613, and #2627. Test Plan: Apply the below patch to the multiplexer data provider, then launch TensorBoard with `--generic_data=true` and verify that the correct experiment ID is printed for each call, at `/` and at `/experiment/123`. Verify also that no request has an `experiment` query parameter. Patch: ```diff diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py index 0b354aa..1c00b930 100644 wchargin-source: c5b1a893a67cb82286f9de0f5b2a0638dc372a54 --- a/tensorboard/backend/event_processing/data_provider.py +++ b/tensorboard/backend/event_processing/data_provider.py @@ -57,10 +57,12 @@ class MultiplexerDataProvider(provider.DataProvider): return None def data_location(self, experiment_id): + logger.warn("data_location(%r)", experiment_id) del experiment_id # ignored return str(self._logdir) def list_runs(self, experiment_id): + logger.warn("list_runs(%r)", experiment_id) del experiment_id # ignored for now return [ provider.Run( @@ -72,6 +74,7 @@ class MultiplexerDataProvider(provider.DataProvider): ] def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None): + logger.warn("list_scalars(%r, ...)", experiment_id) del experiment_id # ignored for now run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name) result = {} @@ -103,6 +106,7 @@ class MultiplexerDataProvider(provider.DataProvider): def read_scalars( self, experiment_id, plugin_name, downsample=None, run_tag_filter=None ): + logger.warn("read_scalars(%r, ...)", experiment_id) # TODO(@wchargin): Downsampling not implemented, as the multiplexer # is already downsampled. We could downsample on top of the existing # sampling, which would be nice for testing. ``` wchargin-branch: eid-frontend

@wchargin

Summary: This commit changes the frontend experiment ID plumbing from using the query string to using a path component: experiment URLs are now of the form `http://localhost:6006/experiment/123/`. This supersedes the short-term changes from #2580, #2613, and #2627. Test Plan: Apply the below patch to the multiplexer data provider, then launch TensorBoard with `--generic_data=true` and verify that the correct experiment ID is printed for each call, at `/` and at `/experiment/123`. Verify also that no request has an `experiment` query parameter. Patch: ```diff diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py index 0b354aa..1c00b930 100644 --- a/tensorboard/backend/event_processing/data_provider.py +++ b/tensorboard/backend/event_processing/data_provider.py @@ -59,2 +59,3 @@ class MultiplexerDataProvider(provider.DataProvider): def data_location(self, experiment_id): + logger.warn("data_location(%r)", experiment_id) del experiment_id # ignored @@ -63,2 +64,3 @@ class MultiplexerDataProvider(provider.DataProvider): def list_runs(self, experiment_id): + logger.warn("list_runs(%r)", experiment_id) del experiment_id # ignored for now @@ -74,2 +76,3 @@ class MultiplexerDataProvider(provider.DataProvider): def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None): + logger.warn("list_scalars(%r, ...)", experiment_id) del experiment_id # ignored for now @@ -105,2 +108,3 @@ class MultiplexerDataProvider(provider.DataProvider): ): + logger.warn("read_scalars(%r, ...)", experiment_id) # TODO(@wchargin): Downsampling not implemented, as the multiplexer ``` wchargin-branch: eid-frontend

wchargin added type:feature core:frontend labels Aug 19, 2019

wchargin requested a review from nfelt August 19, 2019 23:40

googlebot added the cla: yes label Aug 19, 2019

wchargin commented Aug 19, 2019

View reviewed changes

[update patch]

38f22dd

wchargin-source: 51e9f37203db775b7ca7d0c8b80dc5de44947e32 wchargin-branch: data-experiment

nfelt approved these changes Aug 21, 2019

View reviewed changes

[update patch]

8929ef8

wchargin-source: 1387baa2ffc19f8733507e836a986fb73b4b7c4e wchargin-branch: data-experiment

wchargin merged commit acd46f5 into master Aug 22, 2019

wchargin deleted the wchargin-data-experiment branch August 22, 2019 16:58

wchargin mentioned this pull request Oct 6, 2019

frontend: use experiment ID from path, not query #2736

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

data: plumb experiment ID through runs and scalars #2580

data: plumb experiment ID through runs and scalars #2580

Uh oh!

wchargin commented Aug 19, 2019

Uh oh!

wchargin Aug 19, 2019

Uh oh!

nfelt Aug 21, 2019

Uh oh!

wchargin Aug 22, 2019

Uh oh!

nfelt Aug 21, 2019

Uh oh!

wchargin Aug 22, 2019

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

data: plumb experiment ID through runs and scalars #2580

data: plumb experiment ID through runs and scalars #2580

Uh oh!

Conversation

wchargin commented Aug 19, 2019

Uh oh!

wchargin Aug 19, 2019

Choose a reason for hiding this comment

Uh oh!

nfelt Aug 21, 2019

Choose a reason for hiding this comment

Uh oh!

wchargin Aug 22, 2019

Choose a reason for hiding this comment

Uh oh!

nfelt Aug 21, 2019

Choose a reason for hiding this comment

Uh oh!

wchargin Aug 22, 2019

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants