Conversation

Member

@davidsoergel davidsoergel commented Dec 3, 2019

This is one of those PRs where pulling on one thread unraveled the sweater. I could break it apart into a chain of 3 PRs if you want, but if you're OK with this as is, there'll be less friction with CI and whatnot.

The original goal was to teach the graph plugin to read from a data provider, mirroring the approach used for scalars. However it turned out that the tests for that use the logdir-based data provider (including multiplexer and accumulator stuff), so I implemented the new data provider blob API there too, largely mirroring code you've already seen internally. Some duplication results (which I can resolve after this is in).

Contributor

@wchargin wchargin left a comment


Everything looks good at a high level/design level; thanks! Just a bunch
of local questions.

    return graph
  raise ValueError('There is no graph in this EventAccumulator')

def SerializedGraph(self):
Contributor

FWIW, the non-plugin event_accumulator and event_multiplexer are
retained only for compatibility with certain internal clients; it’s
perfectly fine to add new functionality to plugin_event_multiplexer
without backporting it.

Member Author

OK thanks for the clarification. I was wondering about the duplication but didn't investigate. Will leave this alone for now, anyway.

Contributor

+1 to avoid changing the non-plugin event accumulator, since it's basically only maintained for legacy purposes, and we should avoid suggesting that it's still getting new features.

Member Author

OK, reverted this file and its test.

def run_metadata_route(self, request):
  """Given a tag and a run, return the session.run() metadata."""
  if self._data_provider:
    return None
Contributor

This will 500 if it’s ever hit:

>>> wrappers.Request.application(lambda req: None)({"REQUEST_METHOD": "GET"}, print)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/google/home/wchargin/virtualenv/tf-nightly-20191202-py3.7/lib/python3.7/site-packages/werkzeug/wrappers/base_request.py", line 240, in application
    return resp(*args[-2:])
TypeError: 'NoneType' object is not callable

What’s the right thing to do here? If this isn’t intended to be
reachable from the normal client-side app flows, maybe 400/404 instead?

Member Author

Hmm. Let's send a 404 for now, but it really looks like we should discuss plumbing the run metadata through the data provider. It looks like it might be a step-indexed blob, so I hope we can hook it up using the existing APIs. (That would also resolve the weirdness where is_active() checks for metadata in the bare-multiplexer case but not in the data-provider case.)
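For illustration, here is a minimal sketch of that fix as a bare WSGI callable (the real handler is wrapped with Werkzeug, and the function name is just taken from the snippet above): answer with an explicit 404 instead of returning None, which Werkzeug would otherwise surface as the 500 shown in the traceback.

```python
def run_metadata_route(environ, start_response):
    # Sketch only: when serving from a data provider, run metadata is not
    # available yet, so send an explicit 404 rather than returning None
    # (which Werkzeug turns into a TypeError and hence a 500).
    body = b"run metadata is not available via the data provider"
    start_response(
        "404 Not Found",
        [
            ("Content-Type", "text/plain; charset=utf-8"),
            ("Content-Length", str(len(body))),
        ],
    )
    return [body]
```

This can be exercised directly by calling it with a minimal environ dict and a fake `start_response`, without spinning up a server.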

Comment on lines 235 to 237
1) Tuple: (123, "graphs", "train", "graph_def", 2, 0)
2) JSON: "[123,"graphs","train","graph_def",2,0]"
3) base64: WzEyMywiZ3JhcGhzIiwidHJhaW4iLCJncmFwaF9kZWYiLDIsMF0
Contributor

Experiment IDs are like "123", not 123, no?

      1)  Tuple: ("123", "graphs", "train", "graph_def", 2, 0)
      2)   JSON: "["123","graphs","train","graph_def",2,0]"
      3) base64: WyIxMjMiLCJncmFwaHMiLCJ0cmFpbiIsImdyYXBoX2RlZiIsMiwwXQ

(Thanks for including this example—it’s quite helpful.)

Member Author

Yep; now using "some_id" to make it really clear that it's a string.
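The three-stage scheme discussed above can be sketched as follows (the function name is illustrative, not the one in the PR): serialize the identifying fields as compact JSON, then base64-encode with the URL-safe alphabet and strip the padding.

```python
import base64
import json


def encode_blob_key(experiment_id, plugin_name, run, tag, step, index):
    # Illustrative sketch of the scheme: tuple -> JSON -> base64.
    # Note that experiment_id is a string (e.g. "some_id"), not an integer.
    stringified = json.dumps(
        [experiment_id, plugin_name, run, tag, step, index],
        separators=(",", ":"),  # compact JSON, no spaces
    )
    encoded = base64.urlsafe_b64encode(stringified.encode("ascii"))
    return encoded.decode("ascii").rstrip("=")  # drop '=' padding
```

For example, `encode_blob_key("some_id", "graphs", "train", "graph_def", 2, 0)` yields a padding-free, URL-safe key that decodes back to the original six fields.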

@davidsoergel
Member Author

Thanks for the careful review!

Also: I resolved my confusion about the tag=None case (it's fine), and all tests pass locally now.

@davidsoergel davidsoergel requested a review from wchargin December 3, 2019 17:23
Contributor

@wchargin wchargin left a comment

CI still fast-failing; missed a commit, maybe?

Contributor

@wchargin wchargin left a comment

Great; thank you! :D

@davidsoergel davidsoergel merged commit 9d9b34b into master Dec 4, 2019
@wchargin
Contributor

wchargin commented Dec 5, 2019

This doesn’t actually appear to work for me. When I run

$ bazel run //tensorboard -- --generic_data=true --logdir ~/tensorboard_data/mnist

and navigate to the graphs dashboard, I get JS console errors:

Uncaught TypeError: Cannot destructure property 'run' of 'd' as it is null.
    at HTMLElement._load (tf-tensorboard.html.js:16576)
    at HTMLElement.<anonymous> (tf-tensorboard.html.js:16576)
    at tf-tensorboard.html.js:7036
    at MutationObserver.microtaskFlush (tf-tensorboard.html.js:1687)

The result from /data/plugin/graphs/info (formatted) used to be:

{
  "lr_1E-03,conv=1,fc=2": {
    "run": "lr_1E-03,conv=1,fc=2",
    "tags": {},
    "run_graph": true
  },
  "lr_1E-04,conv=2,fc=2": {
    "run": "lr_1E-04,conv=2,fc=2",
    "tags": {},
    "run_graph": true
  },
  "lr_1E-04,conv=1,fc=2": {
    "run": "lr_1E-04,conv=1,fc=2",
    "tags": {},
    "run_graph": true
  },
  "lr_1E-03,conv=2,fc=2": {
    "run": "lr_1E-03,conv=2,fc=2",
    "tags": {},
    "run_graph": true
  }
}

but is now:

{
  "lr_1E-03,conv=1,fc=2": {
    "run": "lr_1E-03,conv=1,fc=2",
    "tags": {},
    "run_graph": false,
    "op_graph": true
  },
  "lr_1E-04,conv=2,fc=2": {
    "run": "lr_1E-04,conv=2,fc=2",
    "tags": {},
    "run_graph": false,
    "op_graph": true
  },
  "lr_1E-04,conv=1,fc=2": {
    "run": "lr_1E-04,conv=1,fc=2",
    "tags": {},
    "run_graph": false,
    "op_graph": true
  },
  "lr_1E-03,conv=2,fc=2": {
    "run": "lr_1E-03,conv=2,fc=2",
    "tags": {},
    "run_graph": false,
    "op_graph": true
  }
}

The failure is new as of this commit, and also disappears if I run with
--generic_data=false or --generic_data=auto.

Could you please investigate? (Maybe consider rolling back in the
meantime, too.)

@wchargin
Contributor

wchargin commented Dec 5, 2019

…and in addition to the JS console errors, the graph display is blank.

wchargin added a commit that referenced this pull request Dec 5, 2019
Summary:
Experiment IDs must be passed to the data provider, but the multiplexer
provider doesn’t actually need them, so code that forgets to pass the ID
will be silently broken on other providers. This commit updates the
multiplexer provider to explicitly fail when the ID is omitted.

This would have caught bugs in earlier drafts of both #2981 and #2991,
which were written by different people, so clearly the mistake is easy
to make. :-)

Test Plan:
Running with `--generic_data true`, the scalars, histograms, and
distributions dashboards all still work. The graphs dashboard has the
same error as prior to this change (see comments on #2991). Unit tests
have been updated to always pass experiment IDs.

wchargin-branch: muxprovider-safety-eid
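The guard described in this summary could look roughly like the following sketch (class and method names are illustrative; the real provider methods take more parameters):

```python
def _validate_experiment_id(experiment_id):
    # Illustrative guard: the multiplexer-backed provider never actually
    # uses the experiment ID, so check it eagerly to catch callers that
    # omit it before they break against a provider that does need it.
    if not isinstance(experiment_id, str):
        raise ValueError(
            "experiment_id must be a string, but got: %r" % (experiment_id,)
        )


class MultiplexerDataProvider(object):
    def list_runs(self, experiment_id):
        _validate_experiment_id(experiment_id)  # fail fast, even if unused
        return []  # stub body for the sketch
```

With this in place, a call site that forgets the ID fails loudly on the multiplexer provider instead of only misbehaving on providers that need the ID.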
@davidsoergel
Member Author

Yikes. Fixing presently.

wchargin added a commit that referenced this pull request Dec 6, 2019
wchargin added a commit that referenced this pull request Mar 26, 2020
Summary:
Follow-up to #2991. Fixes #3434.

Test Plan:
Tests pass as written.

wchargin-branch: data-blob-sequence-tests
wchargin-source: fbd3302933cb0c50609df970edf137202723c769
wchargin added a commit that referenced this pull request Mar 26, 2020
bileschi pushed a commit to bileschi/tensorboard that referenced this pull request Apr 15, 2020
Summary:
Follow-up to tensorflow#2991. Fixes tensorflow#3434.

Test Plan:
Tests pass as written.

wchargin-branch: data-blob-sequence-tests
bileschi pushed a commit that referenced this pull request Apr 15, 2020