-
Notifications
You must be signed in to change notification settings - Fork 1.7k
application: add middleware for experiment IDs #2735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: Currently, we apply the `_handle_errors` middleware as a decorator directly on `__call__`. We plan to add middleware that is parameterized on instance variables, so this pattern will no longer suffice. This commit simply refactors the existing code to apply middleware at initialization time. Test Plan: It suffices to run `bazel test //tensorboard/backend:application_test`. wchargin-branch: app-middleware wchargin-source: 77c4f793de8b64d88bd10212db5a0191995b29ca
Summary: This way, we can reason about and test it in isolation as we add more middleware to the application. Note that when using `--path_prefix`, loading the main TensorBoard page without a trailing slash on the URL has long been broken (#1117). This commit does not fix that, but changes the exact failure mode: rather than 404ing, we now load a broken TensorBoard (relative URLs don’t resolve). A follow-up commit will fix this properly. Test Plan: Verify that TensorBoard still works fully, with `--path_prefix` and without it. wchargin-branch: path-prefix-middleware wchargin-source: 3e07d2b681f12c4bab310bef2afc066590ae5cb7
Summary: When using a `--path_prefix`, TensorBoard has historically required a trailing slash after that prefix: e.g., one visits `localhost:6006/foo/` rather than `localhost:6006/foo`. (See, for instance, #1176.) Different versions of TensorBoard have different failure modes when the non-slash path is loaded; currently, the TensorBoard shell loads, but the frontend computes relative URLs incorrectly. This commit adds a redirect from the empty path to `/` to avoid the problem. This logic could be inlined into the `PathPrefixMiddleware`, but we’ll soon have other middlewares with similar needs, so providing this as its own middleware avoids duplicating the functionality. Test Plan: Launch TensorBoard with `--path_prefix /foo` (or `--path_prefix /foo/`; the two are equivalent) and navigate to `/foo/` and `/foo` in a browser. Note that they now both work and resolve to `/foo/`, while prior to this commit navigating to `/foo` yielded a broken TensorBoard. wchargin-branch: empty-path-redirect wchargin-source: 300682590a9a44f1b37d16168b457b4f5898463c
Summary:
This commit teaches TensorBoard to recognize URLs like `/experiment/123`
as specifying experiment IDs. If these two path components are omitted,
the experiment ID is the empty string, for backward compatibility. The
experiment ID is intercepted by middleware and stripped from the URL
before the application handles it, so no existing logic needs to change.
The ID is added to the WSGI environment and may be extracted via a new
public function, `plugin_util.experiment_id`.
This commit does not yet replace the existing query parameter wiring, so
this is not a user-facing change (except that more URLs are now valid).
This functionality is compatible with `--path_prefix`.
Note that the `/data/plugins_listing` implementation does not need to
change: it correctly specifies experiment IDs in iframe loading
mechanisms because we already use the general-purpose `SCRIPT_NAME` key
specified by WSGI for this purpose.
Test Plan:
Unit tests and a small integration test in `:application_test` included.
For manual testing, modify `application.py`’s `_route_request` method to
print out the application-level path and experiment ID:
```python
# at top of `_route_request`:
from tensorboard import plugin_util
eid = plugin_util.experiment_id(environ)
logger.warn("%r [%r]" % (environ.get("PATH_INFO"), eid))
```
Then, check that the scalars plugin and the dynamically loaded projector
plugin both work, and that the paths and experiment IDs are as expected,
in the following configurations:
- with `--path_prefix=/foo`:
- at `/foo` (should redirect to add slash)
- at `/foo/`
- at `/foo/experiment/123` (should redirect to add slash)
- at `/foo/experiment/123/`
- with no path prefix:
- at `/`
- at `/experiment/123` (should redirect to add slash)
- at `/experiment/123/`
wchargin-branch: eid-middleware
wchargin-source: 21f505c418f05a90ed434a9ca86b41959a39040f
wchargin-branch: eid-middleware wchargin-source: e956be9e14a712e424594550d58ef337710a0a2b
wchargin-branch: empty-path-redirect wchargin-source: 5cb309fa62fd9efac68ac52b8a57fcef67a2fcb5 # Conflicts: # tensorboard/backend/BUILD # tensorboard/backend/application.py
wchargin-branch: empty-path-redirect wchargin-source: 5cb309fa62fd9efac68ac52b8a57fcef67a2fcb5
wchargin-branch: empty-path-redirect wchargin-source: fa2c0afa5ecc3882d03ea695c6370edaa77399a2
tensorboard/backend/experiment_id.py
Outdated
|
|
||
| Any request whose path matches `/experiment/SOME_EID[/...]` will have | ||
| its first two path components stripped, and its experiment ID stored | ||
| onto the WSGI environment with key `WSGI_ENVIRON_KEY`. All other requests |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe clarify that it's the key stored in the constant WSGI_ENVIRON_KEY not literally that key? Or just provide the literal key here in the docstring.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure; done.
| r"/%s/([^/]*)" % re.escape(_EXPERIMENT_PATH_COMPONENT) | ||
| ) | ||
|
|
||
| def __call__(self, environ, start_response): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, for more Hanoification this could be implemented using PathPrefixMiddleware if we just detect the experiment ID prefix first, aka something like
if m:
eid = m.group(1)
app = PathPrefixMiddleware(self._application, m.group(0))
else:
eid = ""
app = self._application
environ[WSGI_ENVIRON_KEY] = eid
return self._application(environ, start_response)(The real ambition, as commented on #2733, is to convert all of these into specializations of a path-remapping middleware.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I considered it, but it adds a bunch of unnecessary validation
overhead (both for the dynamic construction of the middleware and for
re-checking that the path actually starts with the prefix), and I felt
that this way was readable enough, so didn’t see a compelling reason to
add the extra layer of indirection.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But hanoi!!!
|
|
||
| # Value of the first path component that signals that the second path | ||
| # component represents an experiment ID. | ||
| _EXPERIMENT_PATH_COMPONENT = "experiment" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we might want something shorter as a prefix? I guess being more explicit for now is good, but at some point IMO it might make sense to just use something like x for brevity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed; I wondered the same thing and have no strong preference. I went
with the long form to be explicit to readers of the URL and because,
unlike (e.g.) Reddit /r/foo and /u/bar where the identifiers are
generally human-readable, the experiment IDs are at the discretion of
the data provider and in many cases will probably be encodings of binary
blobs, like UUIDs. So I didn’t think that it was too important that
users be able to quickly type out such a URL by hand. It’s still true
that it’s nice for shortlinks to be short: e.g., Stack Overflow has
opaque numeric IDs but still uses https://stackoverflow.com/a/1732454.
Of course, this would be easy to change in the future, and it would also
be easy to support both. Happy to change now or later.
wchargin-branch: eid-middleware wchargin-source: f581a5c84d81b331ffcae4c0ef7dfbc32da9986e
wchargin-branch: eid-middleware wchargin-source: f581a5c84d81b331ffcae4c0ef7dfbc32da9986e
wchargin-branch: eid-middleware wchargin-source: 1883d4c6fa7cf726cd9fc159b325232080e856db # Conflicts: # tensorboard/backend/BUILD # tensorboard/backend/application.py
wchargin-branch: eid-middleware wchargin-source: 1883d4c6fa7cf726cd9fc159b325232080e856db
nfelt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fine as is on both counts.
| r"/%s/([^/]*)" % re.escape(_EXPERIMENT_PATH_COMPONENT) | ||
| ) | ||
|
|
||
| def __call__(self, environ, start_response): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But hanoi!!!
Summary:
This commit teaches TensorBoard to recognize URLs like
/experiment/123as specifying experiment IDs. If these two path components are omitted,
the experiment ID is the empty string, for backward compatibility. The
experiment ID is intercepted by middleware and stripped from the URL
before the application handles it, so no existing logic needs to change.
The ID is added to the WSGI environment and may be extracted via a new
public function,
plugin_util.experiment_id.This commit does not yet replace the existing query parameter wiring, so
this is not a user-facing change (except that more URLs are now valid).
This functionality is compatible with
--path_prefix.Note that the
/data/plugins_listingimplementation does not need tochange: it correctly specifies experiment IDs in iframe loading
mechanisms because we already use the general-purpose
SCRIPT_NAMEkeyspecified by WSGI for this purpose.
Test Plan:
Unit tests and a small integration test in
:application_testincluded.For manual testing, modify
application.py’s_route_requestmethod toprint out the application-level path and experiment ID:
Then, check that the scalars plugin and the dynamically loaded projector
plugin both work, and that the paths and experiment IDs are as expected,
in the following configurations:
--path_prefix=/foo:/foo(should redirect to add slash)/foo//foo/experiment/123(should redirect to add slash)/foo/experiment/123///experiment/123(should redirect to add slash)/experiment/123/wchargin-branch: eid-middleware