Skip to content

Conversation

@wchargin
Copy link
Contributor

@wchargin wchargin commented Oct 29, 2019

Cherrypicks:

Generated with:

printf 'git checkout 5946093fc60ef17b1763f627bf731232a11050e1 &&  # origin/2.0\n' \
&& git log --reverse \
    --author=bileschi --author=nfelt --author=wchargin \
    --format='git cherry-pick %H &&  # %s' \
    20e80a2ad00bc8b950abc57289a873a5f4a261f3^..bb30af7ed5ac7b97e39313c73698ebd9c6a67b91 \
    | grep -v 9b188c1bd4b9747f7c6f130f45dd18901a9831d7 \
&& printf '%s &&\n' \
    "sed -i -e 's#/x///#/x/#' tensorboard/program_test.py" \
    "git add tensorboard/program_test.py" \
    "git commit --fixup ':/add subcommand CLI support'" \
    'GIT_EDITOR=true git rebase -i --autosquash 5946093fc60ef17b1763f627bf731232a11050e1' \
&& printf ':\n'

and then manually adding the /policy/terms/ change onto the release
branch.

Requires Git 2.19 or higher for:
git/git@6b3351e

wchargin and others added 22 commits October 29, 2019 10:37
Summary:
This commit adds a subcommand interface to support invocations in the
manner of `apt-get install` or `git commit`. All existing invocations
remain valid, and are now also available under the `tensorboard serve`
subcommand. Other subcommands may be injected programmatically. For
instance, we could migrate the existing `tensorboard --inspect` to
`tensorboard inspect`, which makes more sense semantically (as it does
something quite different from starting a web server).

To support both `tensorboard --flags` and `tensorboard serve --flags`,
we require some hacks first to get off the ground at all in Python 2,
and then to have sane error messages without massive flag duplication in
all Python versions. Even recent Python versions don’t have a notion of
a _default_ subcommand, so we need to manually inspect `argv` to
determine what kind of argument parser to construct.

Test Plan:
Unit tests added, and the smoke test for the dynamic plugin (plus the
Pip package test script) serve as integration tests for existing
behavior. It’s important to test this in both Python 2 and Python 3. For
manual testing:

  - Check that the output of `tensorboard --help` still includes all the
    normal flags, and now has a list of subcommands (one item long).
  - Check that the output of `tensorboard serve --help` is just like the
    output of `tensorboard --help`, but without the list of subcommands.
  - Patch a `demo` subcommand into `main.py`, and check that the output
    of `tensorboard demo --help` is just that subcommand’s help (without
    flags to `serve`) and that `tensorboard --help` shows the `demo`
    subcommand.

Finally, note that changing the implementation of the monkey patch to
just `yield; return` causes both `argparse_util_test` and `program_test`
to fail on Python 2 (but not Python 3).

wchargin-branch: subcommands
Summary:
This commit teaches `tb_proto_library` a new `has_services` flag. When
set, the proto library will be compiled with gRPC service support, and
an additional `*_py_pb2_grpc` `py_library` target will be created.

The `py_proto_library` macro provided by the protobuf build rules no
longer suits our purpose, as it can only create one `*_py_pb2` build
target that includes both the `*_pb2` and `*_pb2_grpc` Python modules.
This is inconsistent with TensorFlow and some Google internal rules. We
now directly use `proto_gen` rather than using `py_proto_library`. This
is fine: that macro is marked as unstable because its interface may
change, and our goal is precisely to change the interface.

Test Plan:
Run

```
bazel query --output=build \
    'filter(".*protos_all.*", //tensorboard/...:all)'
```

and note that the output is identical before and after this change, so
existing protos will be unchanged.

Then, add a simple service to `tensorboard/plugins/hparams/api.proto`:

```proto
service TestService {
  rpc ListSesssionGroups(ListSessionGroupsRequest)
      returns (ListSessionGroupsResponse);
}
```

Build the Pip package and install it into a new virtualenv. Try to
import `tensorboard.plugins.hparams.api_pb2_grpc`, but note that
this fails. Then, add `has_services = True` to the hparams `BUILD` file,
and rebuild and reinstall the Pip package. The import should still fail.
Finally, add a `:protos_all_py_pb2_grpc` dep to `:hparams_plugin`, and
try once more. The import should now work, and the imported module
should have a `TestServiceStub` attribute.

wchargin-branch: proto-grpc
Summary:
With old versions of `argparse`, any `set_defaults` on the base parser
will take precedence over any `set_defaults` on the subparsers. This is
Python [bug #9351][bpo-9351], fixed in [af26c15110b7][fix]. Even though
the versions of argparse bundled with Python 2.7 and 3.5 each include
the fix, we can still avoid this pattern for compatibility.

To see if your Python is affected, run:

```python
import argparse
parser = argparse.ArgumentParser()
parser.set_defaults(test="AFFECTED")
parser.add_subparsers().add_parser("sub").set_defaults(test="OK")
print(parser.parse_args(["sub"]).test)
```

If it prints `AFFECTED`, then this patch affects your Python.

Googlers, see <http://b/143087149> for context.

[bpo-9351]: https://bugs.python.org/issue9351
[fix]: http://github.com/python/cpython/commit/af26c15110b76195e62a06d17e39176d42c0511c

Test Plan:
Running under an old version of `argparse`, existing tests fail before
this change and pass after.

wchargin-branch: program-defaults
Summary:
By happenstance, none of our existing proto definitions need build-level
dependencies (dependencies on proto files from other build targets). But
this is no fundamental restriction. We teach `tb_proto_library` a `deps`
argument, pointing to other `tb_proto_library` macro invocations.

Test Plan:
Unit tests added. After building the Pip package and installing it into
a new virtualenv, we can still use `tensorboard.compat.proto` modules.

wchargin-branch: proto-deps
Summary:
This commit adds `tensorboard.util.grpc_util`, which provides methods
for consumers of gRPC services.

Original implementation due to @nfelt.

Test Plan:
Unit tests included. They require that the `grpcio` Pip distribution
(which provides the `grpc` Python package) be installed in the enclosing
virtualenv. This is already a dependency of `tensorboard`, so on typical
setups no additional action should be required.

Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
wchargin-branch: grpc-util
Summary:
This commit includes a gRPC client for a hosted TensorBoard service,
which is accessible as a new subcommand under the main `tensorboard`
binary.

Test Plan:
Modify `dev_creds.py` to include credentials, then build the Pip package
(`bazel run //tensorboard/pip_package:extract_pip_package`) and install
it into a new virtualenv. Run

```
tensorboard dev --endpoint ENDPOINT upload --logdir ~/tensorboard_data/mnist
```

where `ENDPOINT` is a running API server (like `localhost:12345`), and
verify that the upload succeeds. Then verify that normal `tensorboard`
still works to serve locally.

Co-authored-by: ericdnielsen <ericdnielsen@gmail.com>
Co-authored-by: Karthik Ramasamy <karthikv2k@gmail.com>
Co-authored-by: Nick Felt <nfelt@users.noreply.github.com>
Co-authored-by: Stanley Bileschi <bileschi@google.com>
Co-authored-by: William Chargin <wchargin@gmail.com>
Summary:
We depend on `grpc.local_server_credentials`, which was added in 1.24.0.
Installing the Pip package into a fresh virtualenv generally picks this
up, because the `grpcio >= 1.6.3` will install the latest version, but
that won’t suffice for virtualenvs that already have an old version
installed. We now explicitly depend on `1.24.3`.

Travis wasn’t seeing this problem because the `grpcio-testing` dep pulls
in a matching version of `grpcio`.

Test Plan:
Create a new virtualenv and install `grpcio==1.16.0` (e.g.) into it.
Then, install the TensorBoard Pip package. Run

```
python -c 'import grpc; print(grpc.local_server_credentials)'
```

and note that this now passes instead of failing with `AttributeError`.

wchargin-branch: grpcio-1.24.3
Summary:
The code in question was always Google-owned and licensed as Apache 2.0;
this patch declares that explicitly at the build level.

Test Plan:
Note that this is the same as in (e.g.) `tensorboard/plugins/BUILD`.

wchargin-branch: uploader-license
Summary:
Some of the `//tensorboard/uploader/...` tests are currently disabled in
Python 2 because our backport of `mock` doesn’t support the required
assertions. We can’t easily upgrade the dependency (see tensorflow#2132), but we
can backport those assertions. The assertions backported are:

  - `assert_not_called`
  - `assert_called`
  - `assert_called_once`

This patch is the combination of (restrictions of) the following
upstream Git commits:

  - [`c7d6c7dcafd4482626194528b24f174dd15f5f5e`][1] (2014-04-17)
  - [`8882c2fccb9f7d3b8ba2b5534c0dd9d52343725f`][2] (2014-04-17)
  - [`7ca5d3afe293d5e4d769c1cb2dfccb0688ba1292`][3] (2016-03-11)
  - [`07532a75003ec8e871e494f9f1ab8f28fa246f08`][4] (2018-01-19)

I identified those commits by diffing our pinned version of `mock.py`
(the version at upstream commits `b14e284` through `dc1458a`) against
the version of `mock/mock.py` in the current repo, tracing the blame on
the implementation of the relevant assertions, then pruning commits that
relied on other changes not yet patched in (and were [only cosmetic,
anyway][5]).

[1]: testing-cabal/mock@c7d6c7d
[2]: testing-cabal/mock@8882c2f
[3]: testing-cabal/mock@7ca5d3a
[4]: testing-cabal/mock@07532a7
[5]: testing-cabal/mock@57ef3bb

Test Plan:
Unit tests pass in Python 2, and are now actually run, too. Changing
`uploader_test.py`’s `test_upload_swallows_rpc_failure` to comment out
the `_upload_once` call causes the subsequent `assert_called_once` to
fail with the correct error message, showing that the assertions are
wired up properly.

wchargin-branch: mock-backport
…2836)

* Remove --auth_type argument and make use of OAuth creds unconditional

* Add OAuth client configuration
Summary:
We now emit a nice message so that users don’t think that the uploader
has hung, and we avoid crashing when the user does interrupt it.

Test Plan:
Ran and then canceled an upload, and saw the following logs:

```
Upload started and will continue reading any new data as it's added
to the logdir. To stop uploading, press Ctrl-C.
View your TensorBoard live at: http://localhost:1234/experiment/123
^C
Upload stopped. View your TensorBoard at http://localhost:1234/experiment/123
```

…after which the process exited with 0.

wchargin-branch: uploader-forever
Summary:
This reverts commit e4a7049.

Test Plan:
CI suffices.

wchargin-branch: unpin-estimator-20190917-nightly
Summary:
In versions of Python 3 prior to Python 3.6, `json.dumps` must take a
Unicode string as argument. (This was relaxed in [bpo-17909].)

[bpo-17909]: https://bugs.python.org/issue17909

Test Plan:
In a Python 3.5 virtualenv, run `tensorboard dev auth revoke`, then run
`tensorboard dev upload --logdir /tmp/whatever`. Note that the OAuth
flow now works properly rather than failing with a `TypeError`. Note
that the only other use of `json.loads` under `tensorboard/uploader/` is
in `exporter_test.py`, and that test passes on Python 3.5.

wchargin-branch: uploader-oauth-unicode
Summary:
The correct invocation is `auth revoke`, not `--auth_revoke`, which was
a vestige of a previous formulation.

wchargin-branch: uploader-auth-revoke-text
Summary:
This will enable servers to detect when clients are using an old version
and send instructions to show the user how to upgrade, especially in the
case of known bugs.

We could implement this with gRPC’s client-side interceptor API, but
that API is loudly disclaimed as experimental, so I’ve gone with the
manual plumbing for now.

Test Plan:
Ran the new uploader with an existing server and verified that it still
works (extra metadata is safely ignored). Verified that a server can
be modified to read this metadata for all three flows (upload, delete,
export). Tested in both Python 2 and Python 3.

wchargin-branch: uploader-version
Summary:
Given `export --outdir "${path}"`, we run `makedirs(dirname(path))`. But
this fails when `dirname(path)` is empty, which occurs when `path` has
no directory separator.

Test Plan:
Unit test added: it fails before this change and passes after it.

wchargin-branch: export-no-slash
Summary:
Typical requests now include ~4700 scalars instead of ~37000 scalars
(and are correspondingly processed faster by the server). This is now
reliably hitting the client-side rate limiting, so we’re sending one
request per 5 seconds.

Test Plan:
The following patch is useful for testing:

```diff
diff --git a/tensorboard/uploader/uploader.py b/tensorboard/uploader/uploader.py
index 1a6e795..c3191746 100644
--- a/tensorboard/uploader/uploader.py
+++ b/tensorboard/uploader/uploader.py
@@ -130,6 +130,10 @@ class TensorBoardUploader(object):
       request_bytes = request.ByteSize()
-      logger.info("Trying request of %d bytes", request_bytes)
+      total_scalars = sum(
+          len(tag.points) for run in request.runs for tag in run.tags)
+      logger.warning(
+          "Trying request of %d bytes (%d scalars)",
+          request_bytes, total_scalars)
       self._upload(request)
       upload_duration_secs = time.time() - upload_start_time
-      logger.info(
+      logger.warning(
           "Upload for %d runs (%d bytes) took %.3f seconds",
```

wchargin-branch: uploader-batch-128kib
Summary:
This includes the Terms of Service link in the consent screen, as well
as the explicit disclaimer that all uploaded data is publicly visible.
It also patches the help text (`tensorboard -h`, `tensorboard dev -h`).

Test Plan:
Running `git grep "a hosted"` (…service) no longer returns any results.

wchargin-branch: tensorboard.dev
Test Plan:
Build the Pip package and install it into a new virtualenv. Using the
new package, revoke auth (`tensorboard dev auth revoke`), then upload an
experiment. Note that the correct Terms of Service and Privacy Policy
documents are clearly displayed in the consent prompt.

wchargin-branch: uploader-prod
Fixes a description of a service response, given a slight change in semantics.
Improves error handling for exporting large experiments using the uploader.

Tested:
  Adds unit tests for grpc.CANCELLED case.
@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly.

In order to pass this check, please resolve this problem and then comment@googlebot I fixed it... If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@wchargin
Copy link
Contributor Author

CLAs are fine because all commit authors are Googlers.

Summary:
We’ll have a redirect in place, but we’d prefer that this be the
canonical path.

wchargin-branch: uploader-terms-slash
@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to determine that you authored the commits in this PR. Maybe you used a different email address in the git commits than was used to sign the CLA? If someone else authored these commits, then please add them to this pull request and have them confirm that they're okay with them being contributed to Google. If there are co-authors, make sure they're formatted properly.

In order to pass this check, please resolve this problem and then comment@googlebot I fixed it... If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@wchargin
Copy link
Contributor Author

CLAs are fine because all commit authors are Googlers.

@nfelt nfelt self-requested a review October 29, 2019 23:06
Copy link
Contributor

@nfelt nfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🚀 I approve this PR.

@wchargin wchargin merged commit 77f2eb7 into tensorflow:2.0 Oct 29, 2019
@wchargin wchargin deleted the wchargin-2.0.1 branch October 29, 2019 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants