TensorBoard 2.0.1 #2859

wchargin · 2019-10-29T17:54:14Z

Cherrypicks:

c243b6d uploader: link to /policy/terms/, not terms.html
69ec245 Improve error handling for exporting large experiments (Improve error handling for exporting large experiments #2852)
df5dc32 Fix comment in export.proto (Fix comment in export.proto #2853)
0a7dd7b uploader: set default endpoint to production (uploader: set default endpoint to production #2856)
0ec6928 uploader: add TensorBoard.dev branding (uploader: add TensorBoard.dev branding #2851)
15cbf47 uploader: change request batch size to 128 KiB (uploader: change request batch size to 128 KiB #2849)
6b0c015 uploader: fix export when outdir has no slash (uploader: fix export when outdir has no slash #2847)
a00be9c uploader: include TensorBoard version in RPCs (uploader: include TensorBoard version in RPCs #2843)
ca946bc uploader: fix auth revoke help suggestion (uploader: fix auth revoke help suggestion #2842)
5d88ea8 uploader: parse JSON from Unicode for Python 3.5 (uploader: parse JSON from Unicode for Python 3.5 #2845)
fe1e8e4 ci: unpin 2019-09-17 estimator nightlies (ci: unpin 2019-09-17 estimator nightlies #2835)
865b139 uploader: clarify that uploads run forever (uploader: clarify that uploads run forever #2841)
34c53dc Set default --grpc_creds_type to ssl, not ssl_dev (Set default --grpc_creds_type to ssl, not ssl_dev #2838)
82081bd Configure oauth for uploader client and make it required (Configure oauth for uploader client and make it required #2836)
c7c6f7a Backport mock assertions for Python 2 tests (Backport mock assertions for Python 2 tests #2837)
7b036fe uploader: add license directive to build file (uploader: add license directive to build file #2828)
088762d Bump grpcio version to 1.24.3 (Bump grpcio version to 1.24.3 #2829)
3dcc4b2 Add uploader gRPC client (Add uploader gRPC client #2822)
41458d2 util: add grpc_util with backoff helper (util: add grpc_util with backoff helper #2810)
4d31a4b proto: support build deps (proto: support build deps #2809)
50ab70c program: don’t set default flags on base parser (program: don’t set default flags on base parser #2807)
a4fb578 proto: enable compiling gRPC services (proto: enable compiling gRPC services #2800)
7ea9258 program: add subcommand CLI support (program: add subcommand CLI support #2793)

Generated with:

printf 'git checkout 5946093fc60ef17b1763f627bf731232a11050e1 &&  # origin/2.0\n' \
&& git log --reverse \
    --author=bileschi --author=nfelt --author=wchargin \
    --format='git cherry-pick %H &&  # %s' \
    20e80a2ad00bc8b950abc57289a873a5f4a261f3^..bb30af7ed5ac7b97e39313c73698ebd9c6a67b91 \
    | grep -v 9b188c1bd4b9747f7c6f130f45dd18901a9831d7 \
&& printf '%s &&\n' \
    "sed -i -e 's#/x///#/x/#' tensorboard/program_test.py" \
    "git add tensorboard/program_test.py" \
    "git commit --fixup ':/add subcommand CLI support'" \
    'GIT_EDITOR=true git rebase -i --autosquash 5946093fc60ef17b1763f627bf731232a11050e1' \
&& printf ':\n'

and then manually adding the /policy/terms/ change onto the release
branch.

Requires Git 2.19 or higher for:
git/git@6b3351e

Summary: This commit adds a subcommand interface to support invocations in the manner of `apt-get install` or `git commit`. All existing invocations remain valid, and are now also available under the `tensorboard serve` subcommand. Other subcommands may be injected programmatically. For instance, we could migrate the existing `tensorboard --inspect` to `tensorboard inspect`, which makes more sense semantically (as it does something quite different from starting a web server). To support both `tensorboard --flags` and `tensorboard serve --flags`, we require some hacks first to get off the ground at all in Python 2, and then to have sane error messages without massive flag duplication in all Python versions. Even recent Python versions don’t have a notion of a _default_ subcommand, so we need to manually inspect `argv` to determine what kind of argument parser to construct. Test Plan: Unit tests added, and the smoke test for the dynamic plugin (plus the Pip package test script) serve as integration tests for existing behavior. It’s important to test this in both Python 2 and Python 3. For manual testing: - Check that the output of `tensorboard --help` still includes all the normal flags, and now has a list of subcommands (one item long). - Check that the output of `tensorboard serve --help` is just like the output of `tensorboard --help`, but without the list of subcommands. - Patch a `demo` subcommand into `main.py`, and check that the output of `tensorboard demo --help` is just that subcommand’s help (without flags to `serve`) and that `tensorboard --help` shows the `demo` subcommand. Finally, note that changing the implementation of the monkey patch to just `yield; return` causes both `argparse_util_test` and `program_test` to fail on Python 2 (but not Python 3). wchargin-branch: subcommands

Summary: This commit teaches `tb_proto_library` a new `has_services` flag. When set, the proto library will be compiled with gRPC service support, and an additional `*_py_pb2_grpc` `py_library` target will be created. The `py_proto_library` macro provided by the protobuf build rules no longer suits our purpose, as it can only create one `*_py_pb2` build target that includes both the `*_pb2` and `*_pb2_grpc` Python modules. This is inconsistent with TensorFlow and some Google internal rules. We now directly use `proto_gen` rather than using `py_proto_library`. This is fine: that macro is marked as unstable because its interface may change, and our goal is precisely to change the interface. Test Plan: Run ``` bazel query --output=build \ 'filter(".*protos_all.*", //tensorboard/...:all)' ``` and note that the output is identical before and after this change, so existing protos will be unchanged. Then, add a simple service to `tensorboard/plugins/hparams/api.proto`: ```proto service TestService { rpc ListSesssionGroups(ListSessionGroupsRequest) returns (ListSessionGroupsResponse); } ``` Build the Pip package and install it into a new virtualenv. Try to import `tensorboard.plugins.hparams.api_pb2_grpc`, but note that this fails. Then, add `has_services = True` to the hparams `BUILD` file, and rebuild and reinstall the Pip package. The import should still fail. Finally, add a `:protos_all_py_pb2_grpc` dep to `:hparams_plugin`, and try once more. The import should now work, and the imported module should have a `TestServiceStub` attribute. wchargin-branch: proto-grpc

Summary: With old versions of `argparse`, any `set_defaults` on the base parser will take precedence over any `set_defaults` on the subparsers. This is Python [bug #9351][bpo-9351], fixed in [af26c15110b7][fix]. Even though the versions of argparse bundled with Python 2.7 and 3.5 each include the fix, we can still avoid this pattern for compatibility. To see if your Python is affected, run: ```python import argparse parser = argparse.ArgumentParser() parser.set_defaults(test="AFFECTED") parser.add_subparsers().add_parser("sub").set_defaults(test="OK") print(parser.parse_args(["sub"]).test) ``` If it prints `AFFECTED`, then this patch affects your Python. Googlers, see <http://b/143087149> for context. [bpo-9351]: https://bugs.python.org/issue9351 [fix]: http://github.com/python/cpython/commit/af26c15110b76195e62a06d17e39176d42c0511c Test Plan: Running under an old version of `argparse`, existing tests fail before this change and pass after. wchargin-branch: program-defaults

Summary: By happenstance, none of our existing proto definitions need build-level dependencies (dependencies on proto files from other build targets). But this is no fundamental restriction. We teach `tb_proto_library` a `deps` argument, pointing to other `tb_proto_library` macro invocations. Test Plan: Unit tests added. After building the Pip package and installing it into a new virtualenv, we can still use `tensorboard.compat.proto` modules. wchargin-branch: proto-deps

@nfelt

Summary: This commit adds `tensorboard.util.grpc_util`, which provides methods for consumers of gRPC services. Original implementation due to @nfelt. Test Plan: Unit tests included. They require that the `grpcio` Pip distribution (which provides the `grpc` Python package) be installed in the enclosing virtualenv. This is already a dependency of `tensorboard`, so on typical setups no additional action should be required. Co-authored-by: Nick Felt <nfelt@users.noreply.github.com> wchargin-branch: grpc-util

Summary: This commit includes a gRPC client for a hosted TensorBoard service, which is accessible as a new subcommand under the main `tensorboard` binary. Test Plan: Modify `dev_creds.py` to include credentials, then build the Pip package (`bazel run //tensorboard/pip_package:extract_pip_package`) and install it into a new virtualenv. Run ``` tensorboard dev --endpoint ENDPOINT upload --logdir ~/tensorboard_data/mnist ``` where `ENDPOINT` is a running API server (like `localhost:12345`), and verify that the upload succeeds. Then verify that normal `tensorboard` still works to serve locally. Co-authored-by: ericdnielsen <ericdnielsen@gmail.com> Co-authored-by: Karthik Ramasamy <karthikv2k@gmail.com> Co-authored-by: Nick Felt <nfelt@users.noreply.github.com> Co-authored-by: Stanley Bileschi <bileschi@google.com> Co-authored-by: William Chargin <wchargin@gmail.com>

Summary: We depend on `grpc.local_server_credentials`, which was added in 1.24.0. Installing the Pip package into a fresh virtualenv generally picks this up, because the `grpcio >= 1.6.3` will install the latest version, but that won’t suffice for virtualenvs that already have an old version installed. We now explicitly depend on `1.24.3`. Travis wasn’t seeing this problem because the `grpcio-testing` dep pulls in a matching version of `grpcio`. Test Plan: Create a new virtualenv and install `grpcio==1.16.0` (e.g.) into it. Then, install the TensorBoard Pip package. Run ``` python -c 'import grpc; print(grpc.local_server_credentials)' ``` and note that this now passes instead of failing with `AttributeError`. wchargin-branch: grpcio-1.24.3

Summary: The code in question was always Google-owned and licensed as Apache 2.0; this patch declares that explicitly at the build level. Test Plan: Note that this is the same as in (e.g.) `tensorboard/plugins/BUILD`. wchargin-branch: uploader-license

Summary: Some of the `//tensorboard/uploader/...` tests are currently disabled in Python 2 because our backport of `mock` doesn’t support the required assertions. We can’t easily upgrade the dependency (see tensorflow#2132), but we can backport those assertions. The assertions backported are: - `assert_not_called` - `assert_called` - `assert_called_once` This patch is the combination of (restrictions of) the following upstream Git commits: - [`c7d6c7dcafd4482626194528b24f174dd15f5f5e`][1] (2014-04-17) - [`8882c2fccb9f7d3b8ba2b5534c0dd9d52343725f`][2] (2014-04-17) - [`7ca5d3afe293d5e4d769c1cb2dfccb0688ba1292`][3] (2016-03-11) - [`07532a75003ec8e871e494f9f1ab8f28fa246f08`][4] (2018-01-19) I identified those commits by diffing our pinned version of `mock.py` (the version at upstream commits `b14e284` through `dc1458a`) against the version of `mock/mock.py` in the current repo, tracing the blame on the implementation of the relevant assertions, then pruning commits that relied on other changes not yet patched in (and were [only cosmetic, anyway][5]). [1]: testing-cabal/mock@c7d6c7d [2]: testing-cabal/mock@8882c2f [3]: testing-cabal/mock@7ca5d3a [4]: testing-cabal/mock@07532a7 [5]: testing-cabal/mock@57ef3bb Test Plan: Unit tests pass in Python 2, and are now actually run, too. Changing `uploader_test.py`’s `test_upload_swallows_rpc_failure` to comment out the `_upload_once` call causes the subsequent `assert_called_once` to fail with the correct error message, showing that the assertions are wired up properly. wchargin-branch: mock-backport

…2836) * Remove --auth_type argument and make use of OAuth creds unconditional * Add OAuth client configuration

Summary: We now emit a nice message so that users don’t think that the uploader has hung, and we avoid crashing when the user does interrupt it. Test Plan: Ran and then canceled an upload, and saw the following logs: ``` Upload started and will continue reading any new data as it's added to the logdir. To stop uploading, press Ctrl-C. View your TensorBoard live at: http://localhost:1234/experiment/123 ^C Upload stopped. View your TensorBoard at http://localhost:1234/experiment/123 ``` …after which the process exited with 0. wchargin-branch: uploader-forever

Summary: This reverts commit e4a7049. Test Plan: CI suffices. wchargin-branch: unpin-estimator-20190917-nightly

Summary: In versions of Python 3 prior to Python 3.6, `json.dumps` must take a Unicode string as argument. (This was relaxed in [bpo-17909].) [bpo-17909]: https://bugs.python.org/issue17909 Test Plan: In a Python 3.5 virtualenv, run `tensorboard dev auth revoke`, then run `tensorboard dev upload --logdir /tmp/whatever`. Note that the OAuth flow now works properly rather than failing with a `TypeError`. Note that the only other use of `json.loads` under `tensorboard/uploader/` is in `exporter_test.py`, and that test passes on Python 3.5. wchargin-branch: uploader-oauth-unicode

Summary: The correct invocation is `auth revoke`, not `--auth_revoke`, which was a vestige of a previous formulation. wchargin-branch: uploader-auth-revoke-text

Summary: This will enable servers to detect when clients are using an old version and send instructions to show the user how to upgrade, especially in the case of known bugs. We could implement this with gRPC’s client-side interceptor API, but that API is loudly disclaimed as experimental, so I’ve gone with the manual plumbing for now. Test Plan: Ran the new uploader with an existing server and verified that it still works (extra metadata is safely ignored). Verified that a server can be modified to read this metadata for all three flows (upload, delete, export). Tested in both Python 2 and Python 3. wchargin-branch: uploader-version

Summary: Given `export --outdir "${path}"`, we run `makedirs(dirname(path))`. But this fails when `dirname(path)` is empty, which occurs when `path` has no directory separator. Test Plan: Unit test added: it fails before this change and passes after it. wchargin-branch: export-no-slash

Summary: Typical requests now include ~4700 scalars instead of ~37000 scalars (and are correspondingly processed faster by the server). This is now reliably hitting the client-side rate limiting, so we’re sending one request per 5 seconds. Test Plan: The following patch is useful for testing: ```diff diff --git a/tensorboard/uploader/uploader.py b/tensorboard/uploader/uploader.py index 1a6e795..c3191746 100644 --- a/tensorboard/uploader/uploader.py +++ b/tensorboard/uploader/uploader.py @@ -130,6 +130,10 @@ class TensorBoardUploader(object): request_bytes = request.ByteSize() - logger.info("Trying request of %d bytes", request_bytes) + total_scalars = sum( + len(tag.points) for run in request.runs for tag in run.tags) + logger.warning( + "Trying request of %d bytes (%d scalars)", + request_bytes, total_scalars) self._upload(request) upload_duration_secs = time.time() - upload_start_time - logger.info( + logger.warning( "Upload for %d runs (%d bytes) took %.3f seconds", ``` wchargin-branch: uploader-batch-128kib

Summary: This includes the Terms of Service link in the consent screen, as well as the explicit disclaimer that all uploaded data is publicly visible. It also patches the help text (`tensorboard -h`, `tensorboard dev -h`). Test Plan: Running `git grep "a hosted"` (…service) no longer returns any results. wchargin-branch: tensorboard.dev

Test Plan: Build the Pip package and install it into a new virtualenv. Using the new package, revoke auth (`tensorboard dev auth revoke`), then upload an experiment. Note that the correct Terms of Service and Privacy Policy documents are clearly displayed in the consent prompt. wchargin-branch: uploader-prod

Fixes a description of a service response, given a slight change in semantics.

Improves error handling for exporting large experiments using the uploader. Tested: Adds unit tests for grpc.CANCELLED case.

googlebot · 2019-10-29T17:54:22Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly.

In order to pass this check, please resolve this problem and then comment@googlebot I fixed it... If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

googlebot · 2019-10-29T17:55:19Z

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

wchargin · 2019-10-29T17:55:21Z

CLAs are fine because all commit authors are Googlers.

Summary: We’ll have a redirect in place, but we’d prefer that this be the canonical path. wchargin-branch: uploader-terms-slash

googlebot · 2019-10-29T22:24:51Z

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly.

In order to pass this check, please resolve this problem and then comment@googlebot I fixed it... If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

googlebot · 2019-10-29T22:25:17Z

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

wchargin · 2019-10-29T22:25:19Z

CLAs are fine because all commit authors are Googlers.

nfelt

🚀 I approve this PR.

wchargin and others added 22 commits October 29, 2019 10:37

Configure oauth for uploader client and make it required (tensorflow#…

82081bd

…2836) * Remove --auth_type argument and make use of OAuth creds unconditional * Add OAuth client configuration

Set default --grpc_creds_type to ssl, not ssl_dev (tensorflow#2838)

34c53dc

ci: unpin 2019-09-17 estimator nightlies (tensorflow#2835)

fe1e8e4

Summary: This reverts commit e4a7049. Test Plan: CI suffices. wchargin-branch: unpin-estimator-20190917-nightly

uploader: fix auth revoke help suggestion (tensorflow#2842)

ca946bc

Summary: The correct invocation is `auth revoke`, not `--auth_revoke`, which was a vestige of a previous formulation. wchargin-branch: uploader-auth-revoke-text

Fix comment in export.proto (tensorflow#2853)

df5dc32

Fixes a description of a service response, given a slight change in semantics.

Improve error handling for exporting large experiments (tensorflow#2852)

69ec245

Improves error handling for exporting large experiments using the uploader. Tested: Adds unit tests for grpc.CANCELLED case.

googlebot added the cla: no label Oct 29, 2019

wchargin added cla: yes and removed cla: no labels Oct 29, 2019

wchargin mentioned this pull request Oct 29, 2019

doc: add FI to the sidebar #2857

Closed

uploader: link to /policy/terms/, not terms.html

c243b6d

Summary: We’ll have a redirect in place, but we’d prefer that this be the canonical path. wchargin-branch: uploader-terms-slash

wchargin added 2 commits October 29, 2019 15:24

Add 2.0.1 relnotes to RELEASE.md

6ab7f10

TensorBoard 2.0.1

9278fdf

wchargin force-pushed the wchargin-2.0.1 branch from eb231b1 to 9278fdf Compare October 29, 2019 22:24

googlebot added cla: no and removed cla: yes labels Oct 29, 2019

wchargin added cla: yes and removed cla: no labels Oct 29, 2019

nfelt self-requested a review October 29, 2019 23:06

nfelt approved these changes Oct 29, 2019

View reviewed changes

wchargin merged commit 77f2eb7 into tensorflow:2.0 Oct 29, 2019

wchargin deleted the wchargin-2.0.1 branch October 29, 2019 23:17

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

TensorBoard 2.0.1 #2859

TensorBoard 2.0.1 #2859

Uh oh!

wchargin commented Oct 29, 2019 •

edited

Loading

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

wchargin commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

wchargin commented Oct 29, 2019

Uh oh!

nfelt left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

TensorBoard 2.0.1 #2859

TensorBoard 2.0.1 #2859

Uh oh!

Conversation

wchargin commented Oct 29, 2019 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

wchargin commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

googlebot commented Oct 29, 2019

Uh oh!

wchargin commented Oct 29, 2019

Uh oh!

nfelt left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

wchargin commented Oct 29, 2019 •

edited

Loading