Skip to content

Conversation

@wchargin
Copy link
Contributor

@wchargin wchargin commented Oct 6, 2019

Summary:
This commit teaches TensorBoard to recognize URLs like /experiment/123
as specifying experiment IDs. If these two path components are omitted,
the experiment ID is the empty string, for backward compatibility. The
experiment ID is intercepted by middleware and stripped from the URL
before the application handles it, so no existing logic needs to change.
The ID is added to the WSGI environment and may be extracted via a new
public function, plugin_util.experiment_id.

This commit does not yet replace the existing query parameter wiring, so
this is not a user-facing change (except that more URLs are now valid).

This functionality is compatible with --path_prefix.

Note that the /data/plugins_listing implementation does not need to
change: it correctly specifies experiment IDs in iframe loading
mechanisms because we already use the general-purpose SCRIPT_NAME key
specified by WSGI for this purpose.

Test Plan:
Unit tests and a small integration test in :application_test included.
For manual testing, modify application.py’s _route_request method to
print out the application-level path and experiment ID:

    # at top of `_route_request`:
    from tensorboard import plugin_util
    eid = plugin_util.experiment_id(environ)
    logger.warn("%r [%r]" % (environ.get("PATH_INFO"), eid))

Then, check that the scalars plugin and the dynamically loaded projector
plugin both work, and that the paths and experiment IDs are as expected,
in the following configurations:

  • with --path_prefix=/foo:
    • at /foo (should redirect to add slash)
    • at /foo/
    • at /foo/experiment/123 (should redirect to add slash)
    • at /foo/experiment/123/
  • with no path prefix:
    • at /
    • at /experiment/123 (should redirect to add slash)
    • at /experiment/123/

wchargin-branch: eid-middleware

Summary:
Currently, we apply the `_handle_errors` middleware as a decorator
directly on `__call__`. We plan to add middleware that is parameterized
on instance variables, so this pattern will no longer suffice. This
commit simply refactors the existing code to apply middleware at
initialization time.

Test Plan:
It suffices to run `bazel test //tensorboard/backend:application_test`.

wchargin-branch: app-middleware
wchargin-source: 77c4f793de8b64d88bd10212db5a0191995b29ca
Summary:
This way, we can reason about and test it in isolation as we add more
middleware to the application.

Note that when using `--path_prefix`, loading the main TensorBoard page
without a trailing slash on the URL has long been broken (#1117). This
commit does not fix that, but changes the exact failure mode: rather
than 404ing, we now load a broken TensorBoard (relative URLs don’t
resolve). A follow-up commit will fix this properly.

Test Plan:
Verify that TensorBoard still works fully, with `--path_prefix` and
without it.

wchargin-branch: path-prefix-middleware
wchargin-source: 3e07d2b681f12c4bab310bef2afc066590ae5cb7
Summary:
When using a `--path_prefix`, TensorBoard has historically required a
trailing slash after that prefix: e.g., one visits `localhost:6006/foo/`
rather than `localhost:6006/foo`. (See, for instance, #1176.) Different
versions of TensorBoard have different failure modes when the non-slash
path is loaded; currently, the TensorBoard shell loads, but the frontend
computes relative URLs incorrectly. This commit adds a redirect from the
empty path to `/` to avoid the problem.

This logic could be inlined into the `PathPrefixMiddleware`, but we’ll
soon have other middlewares with similar needs, so providing this as its
own middleware avoids duplicating the functionality.

Test Plan:
Launch TensorBoard with `--path_prefix /foo` (or `--path_prefix /foo/`;
the two are equivalent) and navigate to `/foo/` and `/foo` in a browser.
Note that they now both work and resolve to `/foo/`, while prior to this
commit navigating to `/foo` yielded a broken TensorBoard.

wchargin-branch: empty-path-redirect
wchargin-source: 300682590a9a44f1b37d16168b457b4f5898463c
Summary:
This commit teaches TensorBoard to recognize URLs like `/experiment/123`
as specifying experiment IDs. If these two path components are omitted,
the experiment ID is the empty string, for backward compatibility. The
experiment ID is intercepted by middleware and stripped from the URL
before the application handles it, so no existing logic needs to change.
The ID is added to the WSGI environment and may be extracted via a new
public function, `plugin_util.experiment_id`.

This commit does not yet replace the existing query parameter wiring, so
this is not a user-facing change (except that more URLs are now valid).

This functionality is compatible with `--path_prefix`.

Note that the `/data/plugins_listing` implementation does not need to
change: it correctly specifies experiment IDs in iframe loading
mechanisms because we already use the general-purpose `SCRIPT_NAME` key
specified by WSGI for this purpose.

Test Plan:
Unit tests and a small integration test in `:application_test` included.
For manual testing, modify `application.py`’s `_route_request` method to
print out the application-level path and experiment ID:

```python
    # at top of `_route_request`:
    from tensorboard import plugin_util
    eid = plugin_util.experiment_id(environ)
    logger.warn("%r [%r]" % (environ.get("PATH_INFO"), eid))
```

Then, check that the scalars plugin and the dynamically loaded projector
plugin both work, and that the paths and experiment IDs are as expected,
in the following configurations:

  - with `--path_prefix=/foo`:
      - at `/foo` (should redirect to add slash)
      - at `/foo/`
      - at `/foo/experiment/123` (should redirect to add slash)
      - at `/foo/experiment/123/`
  - with no path prefix:
      - at `/`
      - at `/experiment/123` (should redirect to add slash)
      - at `/experiment/123/`

wchargin-branch: eid-middleware
wchargin-source: 21f505c418f05a90ed434a9ca86b41959a39040f
wchargin-branch: eid-middleware
wchargin-source: e956be9e14a712e424594550d58ef337710a0a2b
wchargin-branch: empty-path-redirect
wchargin-source: 5cb309fa62fd9efac68ac52b8a57fcef67a2fcb5

# Conflicts:
#	tensorboard/backend/BUILD
#	tensorboard/backend/application.py
wchargin-branch: empty-path-redirect
wchargin-source: 5cb309fa62fd9efac68ac52b8a57fcef67a2fcb5
wchargin-branch: empty-path-redirect
wchargin-source: fa2c0afa5ecc3882d03ea695c6370edaa77399a2

Any request whose path matches `/experiment/SOME_EID[/...]` will have
its first two path components stripped, and its experiment ID stored
onto the WSGI environment with key `WSGI_ENVIRON_KEY`. All other requests
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe clarify that it's the key stored in the constant WSGI_ENVIRON_KEY not literally that key? Or just provide the literal key here in the docstring.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure; done.

r"/%s/([^/]*)" % re.escape(_EXPERIMENT_PATH_COMPONENT)
)

def __call__(self, environ, start_response):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, for more Hanoification this could be implemented using PathPrefixMiddleware if we just detect the experiment ID prefix first, aka something like

if m:
  eid = m.group(1)
  app = PathPrefixMiddleware(self._application, m.group(0))
else:
  eid = ""
  app = self._application
environ[WSGI_ENVIRON_KEY] = eid
return self._application(environ, start_response)

(The real ambition, as commented on #2733, is to convert all of these into specializations of a path-remapping middleware.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I considered it, but it adds a bunch of unnecessary validation
overhead (both for the dynamic construction of the middleware and for
re-checking that the path actually starts with the prefix), and I felt
that this way was readable enough, so didn’t see a compelling reason to
add the extra layer of indirection.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But hanoi!!!


# Value of the first path component that signals that the second path
# component represents an experiment ID.
_EXPERIMENT_PATH_COMPONENT = "experiment"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we might want something shorter as a prefix? I guess being more explicit for now is good, but at some point IMO it might make sense to just use something like x for brevity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; I wondered the same thing and have no strong preference. I went
with the long form to be explicit to readers of the URL and because,
unlike (e.g.) Reddit /r/foo and /u/bar where the identifiers are
generally human-readable, the experiment IDs are at the discretion of
the data provider and in many cases will probably be encodings of binary
blobs, like UUIDs. So I didn’t think that it was too important that
users be able to quickly type out such a URL by hand. It’s still true
that it’s nice for shortlinks to be short: e.g., Stack Overflow has
opaque numeric IDs but still uses https://stackoverflow.com/a/1732454.

Of course, this would be easy to change in the future, and it would also
be easy to support both. Happy to change now or later.

wchargin-branch: eid-middleware
wchargin-source: f581a5c84d81b331ffcae4c0ef7dfbc32da9986e
wchargin-branch: eid-middleware
wchargin-source: f581a5c84d81b331ffcae4c0ef7dfbc32da9986e
@wchargin wchargin changed the base branch from wchargin-empty-path-redirect to master October 7, 2019 22:12
wchargin-branch: eid-middleware
wchargin-source: 1883d4c6fa7cf726cd9fc159b325232080e856db

# Conflicts:
#	tensorboard/backend/BUILD
#	tensorboard/backend/application.py
wchargin-branch: eid-middleware
wchargin-source: 1883d4c6fa7cf726cd9fc159b325232080e856db
Copy link
Contributor

@nfelt nfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine as is on both counts.

r"/%s/([^/]*)" % re.escape(_EXPERIMENT_PATH_COMPONENT)
)

def __call__(self, environ, start_response):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But hanoi!!!

@wchargin wchargin merged commit 7fdf117 into master Oct 8, 2019
@wchargin wchargin deleted the wchargin-eid-middleware branch October 8, 2019 01:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants