-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Graph plugin: Read from Data Provider #2991
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
wchargin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Everything looks good at a high level/design level; thanks! Just a bunch
of local questions.
| return graph | ||
| raise ValueError('There is no graph in this EventAccumulator') | ||
|
|
||
| def SerializedGraph(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, the non-plugin event_accumulator and event_multiplexer are
retained only for compatibility with certain internal clients; it’s
perfectly fine to add new functionality to plugin_event_multiplexer
without backporting it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK thanks for the clarification. I was wondering about the duplication but didn't investigate. Will leave this alone for now, anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to avoid changing the non-plugin event accumulator, since it's basically only maintained for legacy purposes, and we should avoid suggesting that it's still getting new features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, reverted this file and its test.
| def run_metadata_route(self, request): | ||
| """Given a tag and a run, return the session.run() metadata.""" | ||
| if self._data_provider: | ||
| return None |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will 500 if it’s ever hit:
>>> wrappers.Request.application(lambda req: None)({"REQUEST_METHOD": "GET"}, print)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/usr/local/google/home/wchargin/virtualenv/tf-nightly-20191202-py3.7/lib/python3.7/site-packages/werkzeug/wrappers/base_request.py", line 240, in application
return resp(*args[-2:])
TypeError: 'NoneType' object is not callableWhat’s the right thing to do here? If this isn’t intended to be
reachable from the normal client-side app flows, maybe 400/404 instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Let's send a 404 for now, but really it looks like we should discuss plumbing the run metadata through the data provider. Looks like it might be a step-indexed blob, so I hope we can hook it up using the existing APIs. (That would also resolve the weirdness that is_active() checks for metadata in the bare multiplexer case, but not in the data provider case).
| 1) Tuple: (123, "graphs", "train", "graph_def", 2, 0) | ||
| 2) JSON: "[123,"graphs","train","graph_def",2,0]" | ||
| 3) base64: WzEyMywiZ3JhcGhzIiwidHJhaW4iLCJncmFwaF9kZWYiLDIsMF0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Experiment IDs are like "123", not 123, no?
1) Tuple: ("123", "graphs", "train", "graph_def", 2, 0)
2) JSON: "["123","graphs","train","graph_def",2,0]"
3) base64: IjEyMyIsImdyYXBocyIsInRyYWluIiwiZ3JhcGhfZGVmIiwyLDA
(Thanks for including this example—it’s quite helpful.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep; now using "some_id" to make really clear that it's a string.
|
Thanks for the careful review! Also: I resolved my confusion about the |
wchargin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CI still fast-failing; missed a commit, maybe?
wchargin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great; thank you! :D
|
This doesn’t actually appear to work for me. When I run and navigate to the graphs dashboard, I get JS console errors: The result from but is now: The failure is new as of this commit, and also disappears if I run with Could you please investigate? (Maybe consider rolling back in the |
|
…and in addition to the JS console errors, the graph display is blank. |
Summary: Experiment IDs must be passed to the data provider, but the multiplexer provider doesn’t actually need them, so code that forgets to pass the ID will be silently broken on other providers. This commit updates the multiplexer provider to explicitly fail when the ID is omitted. This would have caught bugs in earlier drafts of both #2981 and #2991, which were written by different people, so clearly the mistake is easy to make. :-) Test Plan: Running with `--generic_data true`, the scalars, histograms, and distributions dashboards all still work. The graphs dashboard has the same error as prior to this change (see comments on #2991). Unit tests have been updated to always pass experiment IDs. wchargin-branch: muxprovider-safety-eid
|
Yikes. Fixing presently. |
Summary: Experiment IDs must be passed to the data provider, but the multiplexer provider doesn’t actually need them, so code that forgets to pass the ID will be silently broken on other providers. This commit updates the multiplexer provider to explicitly fail when the ID is omitted. This would have caught bugs in earlier drafts of both #2981 and #2991, which were written by different people, so clearly the mistake is easy to make. :-) Test Plan: Running with `--generic_data true`, the scalars, histograms, and distributions dashboards all still work. The graphs dashboard has the same error as prior to this change (see comments on #2991). Unit tests have been updated to always pass experiment IDs. wchargin-branch: muxprovider-safety-eid
Summary: Follow-up to tensorflow#2991. Fixes tensorflow#3434. Test Plan: Tests pass as written. wchargin-branch: data-blob-sequence-tests
This is one of those PRs where pulling on one thread unraveled the sweater. I could break it apart into a chain of 3 PRs if you want, but if you're OK with this as is, there'll be less friction with CI and whatnot.
The original goal was to teach the graph plugin to read from a data provider, mirroring the approach used for scalars. However it turned out that the tests for that use the logdir-based data provider (including multiplexer and accumulator stuff), so I implemented the new data provider blob API there too, largely mirroring code you've already seen internally. Some duplication results (which I can resolve after this is in).