8 changes: 8 additions & 0 deletions tensorboard/BUILD
@@ -399,6 +399,14 @@ py_library(
visibility = ["//visibility:public"],
)

py_library(
name = "expect_requests_installed",
Contributor:

We should take an explicit dep on this in our setup.py now that it's a direct dependency, right?

Contributor Author:

Yes—thanks!

# This is a dummy rule used as a requests dependency in open-source.
# We expect requests to already be installed on the system, e.g., via
# `pip install requests`.
visibility = ["//visibility:public"],
)

filegroup(
name = "tf_web_library_default_typings",
srcs = [
1 change: 1 addition & 0 deletions tensorboard/pip_package/setup.py
@@ -32,6 +32,7 @@
'markdown >= 2.6.8',
'numpy >= 1.12.0',
'protobuf >= 3.6.0',
'requests >= 2.22.0, < 3',
'setuptools >= 41.0.0',
'six >= 1.10.0',
'werkzeug >= 0.11.15',
26 changes: 26 additions & 0 deletions tensorboard/uploader/BUILD
@@ -201,3 +201,29 @@ py_test(
"//tensorboard:test",
],
)

py_library(
name = "server_info",
srcs = ["server_info.py"],
deps = [
"//tensorboard:expect_requests_installed",
"//tensorboard:version",
"//tensorboard/uploader/proto:protos_all_py_pb2",
"@com_google_protobuf//:protobuf_python",
],
)

py_test(
name = "server_info_test",
size = "medium", # local network requests
timeout = "short",
srcs = ["server_info_test.py"],
deps = [
":server_info",
"//tensorboard:expect_futures_installed",
"//tensorboard:test",
"//tensorboard:version",
"//tensorboard/uploader/proto:protos_all_py_pb2",
"@org_pocoo_werkzeug",
],
)
2 changes: 2 additions & 0 deletions tensorboard/uploader/proto/BUILD
@@ -6,11 +6,13 @@ licenses(["notice"]) # Apache 2.0

exports_files(["LICENSE"])

# TODO(@wchargin): Split more granularly.
tb_proto_library(
name = "protos_all",
srcs = [
"export_service.proto",
"scalar.proto",
"server_info.proto",
"write_service.proto",
],
has_services = True,
61 changes: 61 additions & 0 deletions tensorboard/uploader/proto/server_info.proto
@@ -0,0 +1,61 @@
syntax = "proto3";

package tensorboard.service;

// Request sent by uploader clients at the start of an upload session. Used to
// determine whether the client is recent enough to communicate with the
// server, and to receive any metadata needed for the upload session.
message ServerInfoRequest {
// Client-side TensorBoard version, per `tensorboard.version.VERSION`.
string version = 1;
}

message ServerInfoResponse {
// Primary bottom-line: is the server compatible with the client, and is
// there anything that the end user should be aware of?
Compatibility compatibility = 1;
// Identifier for a gRPC server providing the `TensorBoardExporterService` and
// `TensorBoardWriterService` services (under the `tensorboard.service` proto
// package).
ApiServer api_server = 2;
// How to generate URLs to experiment pages.
ExperimentUrlFormat url_format = 3;
}

enum CompatibilityVerdict {
VERDICT_UNKNOWN = 0;
// All is well. The client may proceed.
VERDICT_OK = 1;
// The client may proceed, but should heed the accompanying message. This
// may be the case if the user is on a version of TensorBoard that will
// soon be unsupported, or if the server is experiencing transient issues.
VERDICT_WARN = 2;
// The client should cease further communication with the server and abort
// operation after printing the accompanying `details` message.
VERDICT_ERROR = 3;
}

message Compatibility {
CompatibilityVerdict verdict = 1;
// Human-readable message to display. When non-empty, will be displayed in
// all cases, even when the client may proceed.
string details = 2;
}

message ApiServer {
// gRPC server URI: <https://github.com/grpc/grpc/blob/master/doc/naming.md>.
// For example: "api.tensorboard.dev:443".
string endpoint = 1;
}

message ExperimentUrlFormat {
// Template string for experiment URLs. All occurrences of the value of the
Contributor:

Do we want these to be absolute or relative?

Relative has the advantage that it allows a single frontend to be exposed across multiple domain names, with clients generating the appropriate URLs. Of course, we could, I suppose, extract this from the incoming /api/uploader request and do the substitution server-side; I could see an argument for either approach.

Another possibility would be adding a template placeholder for the base URL, which if included is basically the same as a relative URL, but could be omitted in the actual template to allow the frontend to return an absolute URL if it wanted to avoid ambiguity (e.g. if wanted to do that on prod).

Contributor Author:

Yeah, I was doing the expansion server side based on the requested host.
If we used relative URLs, then the client would need to pass around the
origin with the ServerInfoResponse everywhere—the latter would no
longer contain all the needed information. Not the end of the world, but
the server-side handling (http://cl/278027294) seemed simple to me.

Contributor:

SGTM, thanks for the explanation.

// `id_placeholder` field in this template string should be replaced with an
// experiment ID. For example, if `id_placeholder` is "{{EID}}", then
// `template` might be "https://tensorboard.dev/experiment/{{EID}}/".
// Should be absolute.
string template = 1;
// Placeholder string that should be replaced with an actual experiment ID.
// (See docs for `template` field.)
string id_placeholder = 2;
}
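
A minimal sketch (not part of this diff) of how a client might consume a
ServerInfoResponse once it has been fetched: surface any details message,
abort on VERDICT_ERROR, and substitute the placeholder to build an
experiment URL. The helper name is illustrative only.

    import sys

    from tensorboard.uploader.proto import server_info_pb2

    def handle_server_info(info, experiment_id):
        # Per Compatibility.details: display the message whenever non-empty.
        if info.compatibility.details:
            print(info.compatibility.details)
        # Per VERDICT_ERROR: stop communicating with the server and abort.
        if info.compatibility.verdict == server_info_pb2.VERDICT_ERROR:
            sys.exit(1)
        # Replace every occurrence of the placeholder with the experiment ID.
        fmt = info.url_format
        return fmt.template.replace(fmt.id_placeholder, experiment_id)
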
100 changes: 100 additions & 0 deletions tensorboard/uploader/server_info.py
@@ -0,0 +1,100 @@
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Initial server communication to determine session parameters."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from google.protobuf import message
import requests

from tensorboard import version
from tensorboard.uploader.proto import server_info_pb2


# Request timeout for communicating with remote server.
_REQUEST_TIMEOUT_SECONDS = 10


def _server_info_request():
request = server_info_pb2.ServerInfoRequest()
request.version = version.VERSION
return request


def fetch_server_info(origin):
"""Fetches server info from a remote server.

Args:
origin: The server with which to communicate. Should be a string
like "https://tensorboard.dev", including protocol, host, and (if
needed) port.

Returns:
A `server_info_pb2.ServerInfoResponse` message.

Raises:
CommunicationError: Upon failure to connect to or successfully
communicate with the remote server.
"""
endpoint = "%s/api/uploader" % origin
post_body = _server_info_request().SerializeToString()
try:
response = requests.post(
endpoint, data=post_body, timeout=_REQUEST_TIMEOUT_SECONDS
)
except requests.RequestException as e:
raise CommunicationError("Failed to connect to backend: %s" % e)
if not response.ok:
raise CommunicationError(
"Non-OK status from backend (%d %s): %r"
% (response.status_code, response.reason, response.content)
)
try:
return server_info_pb2.ServerInfoResponse.FromString(response.content)
except message.DecodeError as e:
raise CommunicationError(
"Corrupt response from backend (%s): %r" % (e, response.content)
)


def create_server_info(frontend_origin, api_endpoint):
Contributor:

If this is only used in tests, I think it's preferable to keep it in test code or utilities.

Contributor Author:

It’s used outside of tests (as of #2879), when you directly specify an
endpoint. The idea is:

  • tensorboard dev --origin=https://tensorboard.dev --api_endpoint=
    (normal prod usage; these are the default options)
  • tensorboard dev --origin=localhost:8080 (for development, running
    both a local frontend and a local backend)
  • tensorboard dev --api_endpoint=localhost:10000 (for development,
    running only a local backend)

Historically, it’s been possible to test the uploader by spinning up a
local gRPC server without having to also spin up a frontend. I wanted to
preserve that ability, because it’s convenient for testing flows that
don’t require a frontend (upload → delete, upload → export, etc.)

Contributor:

Ok, I didn't check follow-on changes for usage.

If the goal is to allow locally overriding the API endpoint, though, it seems better to me to just have --api_endpoint override whatever's returned from the frontend's ServerInfo endpoint, rather than using it to construct a new ServerInfo without hitting the frontend at all. As-is, it's semantically not really right to always use /experiment/ as the URL pattern here, since that should be determined by the frontend.

Or concretely, if I were running local storzy but against the hosted-tensorboard-dev Spanner, I might want to be able to run tensorboard dev --origin=https://hosted-tensorboard-dev.appspot.com --api_endpoint=localhost:10000, and I would argue that should still use the experiment pattern and other metadata fetched from the origin.

If it would seem weird for a local --api_endpoint without an explicit --origin to default to pinging tensorboard.dev, we could change the default for --origin so that it defaults to tensorboard.dev only when --api_endpoint is not passed, and otherwise defaults to empty; only in that case would we locally construct the ServerInfo (and maybe then we should just use a dummy experiment URL pattern).

Contributor Author:

Agreed; I also wasn’t fully content with the flags formulation in #2879.
Your approach sounds fine to me: I’ll revise that PR accordingly.

"""Manually creates server info given a frontend and backend.

Args:
frontend_origin: The origin of the TensorBoard.dev frontend, like
"https://tensorboard.dev" or "http://localhost:8000".
api_endpoint: As to `server_info_pb2.ApiServer.endpoint`.

Returns:
A `server_info_pb2.ServerInfoResponse` message.
"""
result = server_info_pb2.ServerInfoResponse()
result.compatibility.verdict = server_info_pb2.VERDICT_OK
result.api_server.endpoint = api_endpoint
url_format = result.url_format
placeholder = "{{EID}}"
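# If the origin itself happens to contain the placeholder text, keep
# wrapping the placeholder in extra braces until it no longer collides.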
while placeholder in frontend_origin:
placeholder = "{%s}" % placeholder
url_format.template = "%s/experiment/%s/" % (frontend_origin, placeholder)
url_format.id_placeholder = placeholder
return result
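
The review thread above settles on letting --api_endpoint override the
endpoint returned by the frontend, and on constructing the response locally
only when no origin is given. A hedged sketch of that flag handling (not
part of this PR; the follow-up change may differ):

    def get_server_info(origin, api_endpoint):
        # Illustrative only; flag semantics follow the review discussion.
        if origin:
            info = fetch_server_info(origin)
            if api_endpoint:
                # A locally specified endpoint overrides the frontend's value
                # but keeps the rest of the frontend-provided metadata.
                info.api_server.endpoint = api_endpoint
            return info
        # No origin given: build the response locally from the endpoint alone.
        return create_server_info(origin, api_endpoint)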


class CommunicationError(RuntimeError):
"""Raised upon failure to communicate with the server."""

pass
156 changes: 156 additions & 0 deletions tensorboard/uploader/server_info_test.py
@@ -0,0 +1,156 @@
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Tests for tensorboard.uploader.server_info."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import errno
import os
import socket
from wsgiref import simple_server

from concurrent import futures
from werkzeug import wrappers

from tensorboard import test as tb_test
from tensorboard import version
from tensorboard.uploader import server_info
from tensorboard.uploader.proto import server_info_pb2


class FetchServerInfoTest(tb_test.TestCase):
"""Tests for `fetch_server_info`."""

def _start_server(self, app):
"""Starts a server and returns its origin ("http://localhost:PORT")."""
(_, localhost) = _localhost()
server_class = _make_ipv6_compatible_wsgi_server()
server = simple_server.make_server(localhost, 0, app, server_class)
executor = futures.ThreadPoolExecutor()
future = executor.submit(server.serve_forever, poll_interval=0.01)

def cleanup():
server.shutdown() # stop handling requests
server.server_close() # release port
future.result(timeout=3) # wait for server termination

self.addCleanup(cleanup)
return "http://localhost:%d" % server.server_port

def test_fetches_response(self):
expected_result = server_info_pb2.ServerInfoResponse()
expected_result.compatibility.verdict = server_info_pb2.VERDICT_OK
expected_result.compatibility.details = "all clear"
expected_result.api_server.endpoint = "api.example.com:443"
expected_result.url_format.template = "http://localhost:8080/{{eid}}"
expected_result.url_format.id_placeholder = "{{eid}}"

@wrappers.BaseRequest.application
def app(request):
self.assertEqual(request.method, "POST")
self.assertEqual(request.path, "/api/uploader")
body = request.get_data()
request_pb = server_info_pb2.ServerInfoRequest.FromString(body)
self.assertEqual(request_pb.version, version.VERSION)
return wrappers.BaseResponse(expected_result.SerializeToString())

origin = self._start_server(app)
result = server_info.fetch_server_info(origin)
self.assertEqual(result, expected_result)

def test_econnrefused(self):
(family, localhost) = _localhost()
s = socket.socket(family)
s.bind((localhost, 0))
self.addCleanup(s.close)
port = s.getsockname()[1]
with self.assertRaises(server_info.CommunicationError) as cm:
server_info.fetch_server_info("http://localhost:%d" % port)
msg = str(cm.exception)
self.assertIn("Failed to connect to backend", msg)
if os.name != "nt":
self.assertIn(os.strerror(errno.ECONNREFUSED), msg)

def test_non_ok_response(self):
@wrappers.BaseRequest.application
def app(request):
del request # unused
return wrappers.BaseResponse(b"very sad", status="502 Bad Gateway")

origin = self._start_server(app)
with self.assertRaises(server_info.CommunicationError) as cm:
server_info.fetch_server_info(origin)
msg = str(cm.exception)
self.assertIn("Non-OK status from backend (502 Bad Gateway)", msg)
self.assertIn("very sad", msg)

def test_corrupt_response(self):
@wrappers.BaseRequest.application
def app(request):
del request # unused
return wrappers.BaseResponse(b"an unlikely proto")

origin = self._start_server(app)
with self.assertRaises(server_info.CommunicationError) as cm:
server_info.fetch_server_info(origin)
msg = str(cm.exception)
self.assertIn("Corrupt response from backend", msg)
self.assertIn("an unlikely proto", msg)


class CreateServerInfoTest(tb_test.TestCase):
"""Tests for `create_server_info`."""

def test(self):
frontend = "http://localhost:8080"
backend = "localhost:10000"
result = server_info.create_server_info(frontend, backend)

expected_compatibility = server_info_pb2.Compatibility()
expected_compatibility.verdict = server_info_pb2.VERDICT_OK
expected_compatibility.details = ""
self.assertEqual(result.compatibility, expected_compatibility)

expected_api_server = server_info_pb2.ApiServer()
expected_api_server.endpoint = backend
self.assertEqual(result.api_server, expected_api_server)

url_format = result.url_format
actual_url = url_format.template.replace(url_format.id_placeholder, "123")
expected_url = "http://localhost:8080/experiment/123/"
self.assertEqual(actual_url, expected_url)


def _localhost():
"""Gets family and nodename for a loopback address."""
s = socket
infos = s.getaddrinfo(None, 0, s.AF_UNSPEC, s.SOCK_STREAM, 0, s.AI_ADDRCONFIG)
(family, _, _, _, address) = infos[0]
nodename = address[0]
return (family, nodename)


def _make_ipv6_compatible_wsgi_server():
"""Creates a `WSGIServer` subclass that works on IPv6-only machines."""
address_family = _localhost()[0]
Contributor:

It's a wee bit ugly IMO to have this resolved at class definition time. Optional, but we could instead pass the address_family into __init__ explicitly and set it on the instance then (prior to calling superclass init which is what constructs the socket). We'd have to explicitly construct the server rather than using make_server() but it's only 3 lines anyway.

https://github.com/python/cpython/blob/2.7/Lib/wsgiref/simple_server.py#L147
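
A minimal sketch of this alternative (illustrative; not what the PR ends up
doing): set the family on the instance before the superclass initializer
constructs the socket, and build the server explicitly instead of going
through make_server().

    class _Ipv6CompatibleWsgiServer(simple_server.WSGIServer, object):
        def __init__(self, server_address, handler_class, address_family):
            # Must be set before the superclass initializer runs, since it
            # reads `self.address_family` when creating the listening socket.
            self.address_family = address_family
            simple_server.WSGIServer.__init__(self, server_address, handler_class)

    # Explicit construction in place of `simple_server.make_server(...)`:
    # server = _Ipv6CompatibleWsgiServer(
    #     (localhost, 0), simple_server.WSGIRequestHandler, family)
    # server.set_app(app)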

Contributor Author:

OK: overriding address_family directly on the class like this is
common (e.g., this Stack Overflow question), but I’ll grant that
it’s weird for the right-hand side to not be a compile-time constant
expression.

For some reason, TCPServer (the ancestor class that actually consumes
address_family) explicitly says not to override its initializer,
so even though I can’t directly see what might go wrong if we were to
ignore this directive, I’ve gone ahead and just created the class
dynamically. Done.

Contributor:

I'm about 95% sure that the directive "May be extended, do not override" is a poorly worded instruction that subclasses defining __init__() should be sure to call super().__init__() and thus "extend" __init__(), as opposed to not calling super(), which would be "overriding".

Evidence for this would be that otherwise "extended" makes no sense, and also that plenty of subclasses of TCPServer exist that define their own __init__() including the werkzeug dev server we currently use, as well as some examples in stdlib like https://github.com/python/cpython/blob/484edbf9bf1a9e6bae0fcb10a0c165b89ea79295/Lib/logging/config.py#L869-L884

Manually constructing the type on the fly is... new to me :P But if you'd rather not touch __init__ then it will do.

Contributor Author:

Ah—not “This class may be extended; do not override its initializer” but
rather "You may extend this initializer, as long as you do not override
(= fail to invoke) its superclass implementation.” Yeah, that at least
makes some sense. I had also noted that TCPServer’s base class has the
same directive, and of course TCPServer defines an initializer, so
this is more evidence for your idea.

Either approach is fine with me, then, so I’ll keep it as is.

attrs = {"address_family": address_family}
bases = (simple_server.WSGIServer, object) # `object` needed for py2
return type("_Ipv6CompatibleWsgiServer", bases, attrs)


if __name__ == "__main__":
tb_test.main()