-
Notifications
You must be signed in to change notification settings - Fork 1.7k
TensorBoard 2.0.1 #2859
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
TensorBoard 2.0.1 #2859
Conversation
Summary:
This commit adds a subcommand interface to support invocations in the
manner of `apt-get install` or `git commit`. All existing invocations
remain valid, and are now also available under the `tensorboard serve`
subcommand. Other subcommands may be injected programmatically. For
instance, we could migrate the existing `tensorboard --inspect` to
`tensorboard inspect`, which makes more sense semantically (as it does
something quite different from starting a web server).
To support both `tensorboard --flags` and `tensorboard serve --flags`,
we require some hacks first to get off the ground at all in Python 2,
and then to have sane error messages without massive flag duplication in
all Python versions. Even recent Python versions don’t have a notion of
a _default_ subcommand, so we need to manually inspect `argv` to
determine what kind of argument parser to construct.
Test Plan:
Unit tests added, and the smoke test for the dynamic plugin (plus the
Pip package test script) serve as integration tests for existing
behavior. It’s important to test this in both Python 2 and Python 3. For
manual testing:
- Check that the output of `tensorboard --help` still includes all the
normal flags, and now has a list of subcommands (one item long).
- Check that the output of `tensorboard serve --help` is just like the
output of `tensorboard --help`, but without the list of subcommands.
- Patch a `demo` subcommand into `main.py`, and check that the output
of `tensorboard demo --help` is just that subcommand’s help (without
flags to `serve`) and that `tensorboard --help` shows the `demo`
subcommand.
Finally, note that changing the implementation of the monkey patch to
just `yield; return` causes both `argparse_util_test` and `program_test`
to fail on Python 2 (but not Python 3).
wchargin-branch: subcommands
Summary:
This commit teaches `tb_proto_library` a new `has_services` flag. When
set, the proto library will be compiled with gRPC service support, and
an additional `*_py_pb2_grpc` `py_library` target will be created.
The `py_proto_library` macro provided by the protobuf build rules no
longer suits our purpose, as it can only create one `*_py_pb2` build
target that includes both the `*_pb2` and `*_pb2_grpc` Python modules.
This is inconsistent with TensorFlow and some Google internal rules. We
now directly use `proto_gen` rather than using `py_proto_library`. This
is fine: that macro is marked as unstable because its interface may
change, and our goal is precisely to change the interface.
Test Plan:
Run
```
bazel query --output=build \
'filter(".*protos_all.*", //tensorboard/...:all)'
```
and note that the output is identical before and after this change, so
existing protos will be unchanged.
Then, add a simple service to `tensorboard/plugins/hparams/api.proto`:
```proto
service TestService {
rpc ListSesssionGroups(ListSessionGroupsRequest)
returns (ListSessionGroupsResponse);
}
```
Build the Pip package and install it into a new virtualenv. Try to
import `tensorboard.plugins.hparams.api_pb2_grpc`, but note that
this fails. Then, add `has_services = True` to the hparams `BUILD` file,
and rebuild and reinstall the Pip package. The import should still fail.
Finally, add a `:protos_all_py_pb2_grpc` dep to `:hparams_plugin`, and
try once more. The import should now work, and the imported module
should have a `TestServiceStub` attribute.
wchargin-branch: proto-grpc
Summary:
With old versions of `argparse`, any `set_defaults` on the base parser
will take precedence over any `set_defaults` on the subparsers. This is
Python [bug #9351][bpo-9351], fixed in [af26c15110b7][fix]. Even though
the versions of argparse bundled with Python 2.7 and 3.5 each include
the fix, we can still avoid this pattern for compatibility.
To see if your Python is affected, run:
```python
import argparse
parser = argparse.ArgumentParser()
parser.set_defaults(test="AFFECTED")
parser.add_subparsers().add_parser("sub").set_defaults(test="OK")
print(parser.parse_args(["sub"]).test)
```
If it prints `AFFECTED`, then this patch affects your Python.
Googlers, see <http://b/143087149> for context.
[bpo-9351]: https://bugs.python.org/issue9351
[fix]: http://github.com/python/cpython/commit/af26c15110b76195e62a06d17e39176d42c0511c
Test Plan:
Running under an old version of `argparse`, existing tests fail before
this change and pass after.
wchargin-branch: program-defaults
Summary: By happenstance, none of our existing proto definitions need build-level dependencies (dependencies on proto files from other build targets). But this is no fundamental restriction. We teach `tb_proto_library` a `deps` argument, pointing to other `tb_proto_library` macro invocations. Test Plan: Unit tests added. After building the Pip package and installing it into a new virtualenv, we can still use `tensorboard.compat.proto` modules. wchargin-branch: proto-deps
Summary: This commit adds `tensorboard.util.grpc_util`, which provides methods for consumers of gRPC services. Original implementation due to @nfelt. Test Plan: Unit tests included. They require that the `grpcio` Pip distribution (which provides the `grpc` Python package) be installed in the enclosing virtualenv. This is already a dependency of `tensorboard`, so on typical setups no additional action should be required. Co-authored-by: Nick Felt <nfelt@users.noreply.github.com> wchargin-branch: grpc-util
Summary: This commit includes a gRPC client for a hosted TensorBoard service, which is accessible as a new subcommand under the main `tensorboard` binary. Test Plan: Modify `dev_creds.py` to include credentials, then build the Pip package (`bazel run //tensorboard/pip_package:extract_pip_package`) and install it into a new virtualenv. Run ``` tensorboard dev --endpoint ENDPOINT upload --logdir ~/tensorboard_data/mnist ``` where `ENDPOINT` is a running API server (like `localhost:12345`), and verify that the upload succeeds. Then verify that normal `tensorboard` still works to serve locally. Co-authored-by: ericdnielsen <ericdnielsen@gmail.com> Co-authored-by: Karthik Ramasamy <karthikv2k@gmail.com> Co-authored-by: Nick Felt <nfelt@users.noreply.github.com> Co-authored-by: Stanley Bileschi <bileschi@google.com> Co-authored-by: William Chargin <wchargin@gmail.com>
Summary: We depend on `grpc.local_server_credentials`, which was added in 1.24.0. Installing the Pip package into a fresh virtualenv generally picks this up, because the `grpcio >= 1.6.3` will install the latest version, but that won’t suffice for virtualenvs that already have an old version installed. We now explicitly depend on `1.24.3`. Travis wasn’t seeing this problem because the `grpcio-testing` dep pulls in a matching version of `grpcio`. Test Plan: Create a new virtualenv and install `grpcio==1.16.0` (e.g.) into it. Then, install the TensorBoard Pip package. Run ``` python -c 'import grpc; print(grpc.local_server_credentials)' ``` and note that this now passes instead of failing with `AttributeError`. wchargin-branch: grpcio-1.24.3
Summary: The code in question was always Google-owned and licensed as Apache 2.0; this patch declares that explicitly at the build level. Test Plan: Note that this is the same as in (e.g.) `tensorboard/plugins/BUILD`. wchargin-branch: uploader-license
Summary: Some of the `//tensorboard/uploader/...` tests are currently disabled in Python 2 because our backport of `mock` doesn’t support the required assertions. We can’t easily upgrade the dependency (see tensorflow#2132), but we can backport those assertions. The assertions backported are: - `assert_not_called` - `assert_called` - `assert_called_once` This patch is the combination of (restrictions of) the following upstream Git commits: - [`c7d6c7dcafd4482626194528b24f174dd15f5f5e`][1] (2014-04-17) - [`8882c2fccb9f7d3b8ba2b5534c0dd9d52343725f`][2] (2014-04-17) - [`7ca5d3afe293d5e4d769c1cb2dfccb0688ba1292`][3] (2016-03-11) - [`07532a75003ec8e871e494f9f1ab8f28fa246f08`][4] (2018-01-19) I identified those commits by diffing our pinned version of `mock.py` (the version at upstream commits `b14e284` through `dc1458a`) against the version of `mock/mock.py` in the current repo, tracing the blame on the implementation of the relevant assertions, then pruning commits that relied on other changes not yet patched in (and were [only cosmetic, anyway][5]). [1]: testing-cabal/mock@c7d6c7d [2]: testing-cabal/mock@8882c2f [3]: testing-cabal/mock@7ca5d3a [4]: testing-cabal/mock@07532a7 [5]: testing-cabal/mock@57ef3bb Test Plan: Unit tests pass in Python 2, and are now actually run, too. Changing `uploader_test.py`’s `test_upload_swallows_rpc_failure` to comment out the `_upload_once` call causes the subsequent `assert_called_once` to fail with the correct error message, showing that the assertions are wired up properly. wchargin-branch: mock-backport
…2836) * Remove --auth_type argument and make use of OAuth creds unconditional * Add OAuth client configuration
Summary: We now emit a nice message so that users don’t think that the uploader has hung, and we avoid crashing when the user does interrupt it. Test Plan: Ran and then canceled an upload, and saw the following logs: ``` Upload started and will continue reading any new data as it's added to the logdir. To stop uploading, press Ctrl-C. View your TensorBoard live at: http://localhost:1234/experiment/123 ^C Upload stopped. View your TensorBoard at http://localhost:1234/experiment/123 ``` …after which the process exited with 0. wchargin-branch: uploader-forever
Summary: This reverts commit e4a7049. Test Plan: CI suffices. wchargin-branch: unpin-estimator-20190917-nightly
Summary: In versions of Python 3 prior to Python 3.6, `json.dumps` must take a Unicode string as argument. (This was relaxed in [bpo-17909].) [bpo-17909]: https://bugs.python.org/issue17909 Test Plan: In a Python 3.5 virtualenv, run `tensorboard dev auth revoke`, then run `tensorboard dev upload --logdir /tmp/whatever`. Note that the OAuth flow now works properly rather than failing with a `TypeError`. Note that the only other use of `json.loads` under `tensorboard/uploader/` is in `exporter_test.py`, and that test passes on Python 3.5. wchargin-branch: uploader-oauth-unicode
Summary: The correct invocation is `auth revoke`, not `--auth_revoke`, which was a vestige of a previous formulation. wchargin-branch: uploader-auth-revoke-text
Summary: This will enable servers to detect when clients are using an old version and send instructions to show the user how to upgrade, especially in the case of known bugs. We could implement this with gRPC’s client-side interceptor API, but that API is loudly disclaimed as experimental, so I’ve gone with the manual plumbing for now. Test Plan: Ran the new uploader with an existing server and verified that it still works (extra metadata is safely ignored). Verified that a server can be modified to read this metadata for all three flows (upload, delete, export). Tested in both Python 2 and Python 3. wchargin-branch: uploader-version
Summary:
Given `export --outdir "${path}"`, we run `makedirs(dirname(path))`. But
this fails when `dirname(path)` is empty, which occurs when `path` has
no directory separator.
Test Plan:
Unit test added: it fails before this change and passes after it.
wchargin-branch: export-no-slash
Summary: Typical requests now include ~4700 scalars instead of ~37000 scalars (and are correspondingly processed faster by the server). This is now reliably hitting the client-side rate limiting, so we’re sending one request per 5 seconds. Test Plan: The following patch is useful for testing: ```diff diff --git a/tensorboard/uploader/uploader.py b/tensorboard/uploader/uploader.py index 1a6e795..c3191746 100644 --- a/tensorboard/uploader/uploader.py +++ b/tensorboard/uploader/uploader.py @@ -130,6 +130,10 @@ class TensorBoardUploader(object): request_bytes = request.ByteSize() - logger.info("Trying request of %d bytes", request_bytes) + total_scalars = sum( + len(tag.points) for run in request.runs for tag in run.tags) + logger.warning( + "Trying request of %d bytes (%d scalars)", + request_bytes, total_scalars) self._upload(request) upload_duration_secs = time.time() - upload_start_time - logger.info( + logger.warning( "Upload for %d runs (%d bytes) took %.3f seconds", ``` wchargin-branch: uploader-batch-128kib
Summary: This includes the Terms of Service link in the consent screen, as well as the explicit disclaimer that all uploaded data is publicly visible. It also patches the help text (`tensorboard -h`, `tensorboard dev -h`). Test Plan: Running `git grep "a hosted"` (…service) no longer returns any results. wchargin-branch: tensorboard.dev
Test Plan: Build the Pip package and install it into a new virtualenv. Using the new package, revoke auth (`tensorboard dev auth revoke`), then upload an experiment. Note that the correct Terms of Service and Privacy Policy documents are clearly displayed in the consent prompt. wchargin-branch: uploader-prod
Fixes a description of a service response, given a slight change in semantics.
Improves error handling for exporting large experiments using the uploader. Tested: Adds unit tests for grpc.CANCELLED case.
|
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly. In order to pass this check, please resolve this problem and then comment ℹ️ Googlers: Go here for more info. |
|
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) ℹ️ Googlers: Go here for more info. |
|
CLAs are fine because all commit authors are Googlers. |
Summary: We’ll have a redirect in place, but we’d prefer that this be the canonical path. wchargin-branch: uploader-terms-slash
eb231b1 to
9278fdf
Compare
|
We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly. In order to pass this check, please resolve this problem and then comment ℹ️ Googlers: Go here for more info. |
|
A Googler has manually verified that the CLAs look good. (Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.) ℹ️ Googlers: Go here for more info. |
|
CLAs are fine because all commit authors are Googlers. |
nfelt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 I approve this PR.
Cherrypicks:
/policy/terms/, notterms.htmlauth revokehelp suggestion (uploader: fixauth revokehelp suggestion #2842)mockassertions for Python 2 tests (Backportmockassertions for Python 2 tests #2837)grpcioversion to 1.24.3 (Bumpgrpcioversion to 1.24.3 #2829)grpc_utilwith backoff helper (util: addgrpc_utilwith backoff helper #2810)Generated with:
and then manually adding the
/policy/terms/change onto the releasebranch.
Requires Git 2.19 or higher for:
git/git@6b3351e