Conversation

@wchargin
Contributor

Summary:
This implements a rudimentary experiment selection mechanism for the
runs selector and scalars dashboard only, for the purposes of testing
data provider implementations. The experiment ID is stored as a query
parameter, which isn’t perfect because it precludes caching the big
TensorBoard HTML blob across experiments. In the long term, we can
consider refactoring to serve the big HTML blob from a static path
that’s fetched by the page.

The TensorBoard codebase still has some fragments of a previous
experimental data selector, which was partially removed in #2290, and so
parts of this diff look like changes but are functionally additions.

Test Plan:
Instrument the multiplexer data provider to log experiment IDs:

```diff
diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index ef60232..4c72d096 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -54,7 +54,7 @@ class MultiplexerDataProvider(provider.DataProvider):
       return None

   def list_runs(self, experiment_id):
-    del experiment_id  # ignored for now
+    logger.warn("Listing runs for experiment %r", experiment_id)
     return [
         provider.Run(
             run_id=run,  # use names as IDs
@@ -65,7 +65,7 @@ class MultiplexerDataProvider(provider.DataProvider):
     ]

   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
-    del experiment_id  # ignored for now
+    logger.warn("Listing scalars for experiment %r", experiment_id)
     run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name)
     result = {}
     if run_tag_filter is None:
@@ -96,6 +96,7 @@ class MultiplexerDataProvider(provider.DataProvider):
   def read_scalars(
       self, experiment_id, plugin_name, downsample=None, run_tag_filter=None
   ):
+    logger.warn("Reading scalars for experiment %r", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
     # is already downsampled. We could downsample on top of the existing
     # sampling, which would be nice for testing.
```

Then launch TensorBoard with `--generic_data=true` and navigate to
<http://localhost:6006/?experiment=foo>; verify that all three varieties
of server logs are exclusively for the correct experiment ID.
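
With this instrumentation, the expected log messages look roughly like the
following (logging prefix omitted; exact formatting depends on the logging
configuration):

```
Listing runs for experiment 'foo'
Listing scalars for experiment 'foo'
Reading scalars for experiment 'foo'
```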

Then, remove the query parameter, and ensure that TensorBoard still
works normally (with experiment ID the empty string).

wchargin-branch: data-experiment

```diff
 pluginsListing: () => string;
-runs: () => string;
+runs: (experiment?: string) => string;
 runsForExperiment: (id: tf_backend.ExperimentId) => string;
```

Contributor Author

This machinery is unrelated; I’ll remove it in a future change.

I didn’t want to reuse it because it assumes some things about both
inputs (e.g., `type ExperimentId = number`) and outputs (the output of
the `/experiment_runs` route is not just a list of string names) that
may not be what we want.

wchargin-source: 51e9f37203db775b7ca7d0c8b80dc5de44947e32
wchargin-branch: data-experiment

```diff
 tag = request.args.get('tag')
 run = request.args.get('run')
-experiment = request.args.get('experiment')
+experiment = request.args.get('experiment', '')
```

Contributor

Any particular reason to have this default to the empty string instead of `None`?

Contributor Author

The experiment ID is required for any calls to the data provider API
(even though our current implementation doesn’t strictly validate this;
after this PR, maybe it should), so a `None` would have to be converted
to the empty string at some point anyway. It seemed cleanest to me to
reduce the space of input types as early as possible. No strong opinion.
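
To make the trade-off concrete, here is a minimal sketch contrasting the
two defaults; the helper names are hypothetical, and a werkzeug-style
`request` (as in the handler code quoted above) is assumed:

```python
from werkzeug.wrappers import Request


def experiment_id_normalized(request: Request) -> str:
    # Default to the empty string at the edge: every downstream data
    # provider call then receives a plain str, with '' standing for the
    # default experiment.
    return request.args.get('experiment', '')


def experiment_id_optional(request: Request):
    # Default to None instead: each downstream call site would have to
    # coerce the None itself (e.g. `eid or ''`) before reaching the data
    # provider API, which expects an experiment ID.
    return request.args.get('experiment')
```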


```ts
_getExperimentId() {
  return (
    new URLSearchParams(window.location.search).get('experiment') || ''
  );
}
```

Contributor

Could we expose this somewhere so that the scalars code can re-use it rather than replicating it? It could potentially live here, although `experimentsStore` would perhaps be the best fit, even though this isn't strictly speaking the same as the old experiment data served by that API.

That way it's easier to refactor later if we migrate away from a query parameter to, e.g., embedding the experiment ID in the path.

Contributor Author

Exposed `tf_backend.getExperimentId()`. It’s still a bit of a kludge,
but it’s one kludge instead of two. :-)

wchargin-source: 1387baa2ffc19f8733507e836a986fb73b4b7c4e
wchargin-branch: data-experiment
wchargin merged commit acd46f5 into master on Aug 22, 2019
wchargin deleted the `wchargin-data-experiment` branch on August 22, 2019 at 16:58
wchargin added a commit that referenced this pull request Oct 6, 2019
Summary:
This commit changes the frontend experiment ID plumbing from using the
query string to using a path component: experiment URLs are now of the
form `http://localhost:6006/experiment/123/`. This supersedes the
short-term changes from #2580, #2613, and #2627.

Test Plan:
Apply the below patch to the multiplexer data provider, then launch
TensorBoard with `--generic_data=true` and verify that the correct
experiment ID is printed for each call, at `/` and at `/experiment/123`.
Verify also that no request has an `experiment` query parameter. Patch:

```diff
diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index 0b354aa..1c00b930 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -57,10 +57,12 @@ class MultiplexerDataProvider(provider.DataProvider):
       return None

   def data_location(self, experiment_id):
+    logger.warn("data_location(%r)", experiment_id)
     del experiment_id   # ignored
     return str(self._logdir)

   def list_runs(self, experiment_id):
+    logger.warn("list_runs(%r)", experiment_id)
     del experiment_id  # ignored for now
     return [
         provider.Run(
@@ -72,6 +74,7 @@ class MultiplexerDataProvider(provider.DataProvider):
     ]

   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
+    logger.warn("list_scalars(%r, ...)", experiment_id)
     del experiment_id  # ignored for now
     run_tag_content = self._multiplexer.PluginRunToTagToContent(plugin_name)
     result = {}
@@ -103,6 +106,7 @@ class MultiplexerDataProvider(provider.DataProvider):
   def read_scalars(
       self, experiment_id, plugin_name, downsample=None, run_tag_filter=None
   ):
+    logger.warn("read_scalars(%r, ...)", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
     # is already downsampled. We could downsample on top of the existing
     # sampling, which would be nice for testing.
```

wchargin-source: c5b1a893a67cb82286f9de0f5b2a0638dc372a54
wchargin-branch: eid-frontend
wchargin added a commit that referenced this pull request Oct 8, 2019
Summary:
This commit changes the frontend experiment ID plumbing from using the
query string to using a path component: experiment URLs are now of the
form `http://localhost:6006/experiment/123/`. This supersedes the
short-term changes from #2580, #2613, and #2627.

Test Plan:
Apply the below patch to the multiplexer data provider, then launch
TensorBoard with `--generic_data=true` and verify that the correct
experiment ID is printed for each call, at `/` and at `/experiment/123`.
Verify also that no request has an `experiment` query parameter. Patch:

```diff
diff --git a/tensorboard/backend/event_processing/data_provider.py b/tensorboard/backend/event_processing/data_provider.py
index 0b354aa..1c00b930 100644
--- a/tensorboard/backend/event_processing/data_provider.py
+++ b/tensorboard/backend/event_processing/data_provider.py
@@ -59,2 +59,3 @@ class MultiplexerDataProvider(provider.DataProvider):
   def data_location(self, experiment_id):
+    logger.warn("data_location(%r)", experiment_id)
     del experiment_id   # ignored
@@ -63,2 +64,3 @@ class MultiplexerDataProvider(provider.DataProvider):
   def list_runs(self, experiment_id):
+    logger.warn("list_runs(%r)", experiment_id)
     del experiment_id  # ignored for now
@@ -74,2 +76,3 @@ class MultiplexerDataProvider(provider.DataProvider):
   def list_scalars(self, experiment_id, plugin_name, run_tag_filter=None):
+    logger.warn("list_scalars(%r, ...)", experiment_id)
     del experiment_id  # ignored for now
@@ -105,2 +108,3 @@ class MultiplexerDataProvider(provider.DataProvider):
   ):
+    logger.warn("read_scalars(%r, ...)", experiment_id)
     # TODO(@wchargin): Downsampling not implemented, as the multiplexer
```

wchargin-branch: eid-frontend
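
For illustration only (the actual routing code in these commits may
differ), a minimal sketch of how a path-component experiment ID of the
form `/experiment/123/...` could be split off from the request path; the
helper name is hypothetical:

```python
import re

# Strip an "/experiment/<id>" prefix from a WSGI-style path, returning
# (experiment_id, remaining_path). The empty string denotes the default
# experiment, matching the convention used elsewhere in this PR.
_EXPERIMENT_PREFIX = re.compile(r"^/experiment/([^/]+)(/.*)?$")


def split_experiment_path(path):
    match = _EXPERIMENT_PREFIX.match(path)
    if match is None:
        return ("", path)
    return (match.group(1), match.group(2) or "/")


# Example:
#   split_experiment_path("/experiment/123/data/plugin/scalars/tags")
#   -> ("123", "/data/plugin/scalars/tags")
#   split_experiment_path("/data/runs") -> ("", "/data/runs")
```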