Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
15 changes: 11 additions & 4 deletions tensorboard/plugins/core/core_plugin.py
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,11 @@
logger = tb_logging.get_logger()


# If no port is specified, try to bind to this port. See help for --port
# for more details.
DEFAULT_PORT = 6006


class CorePlugin(base_plugin.TBPlugin):
"""Core plugin for TensorBoard.

Expand Down Expand Up @@ -294,12 +299,14 @@ def define_flags(self, parser):
parser.add_argument(
'--port',
metavar='PORT',
type=int,
default=6006,
type=lambda s: (None if s == "default" else int(s)),
default="default",
help='''\
Port to serve TensorBoard on. Pass 0 to request an unused port selected
by the operating system. (default: %(default)s)\
''')
by the operating system, or pass "default" to try to bind to the default
port (%s) but search for a nearby free port if the default port is
unavailable. (default: "default").\
''' % DEFAULT_PORT)

parser.add_argument(
'--purge_orphaned_data',
Expand Down
79 changes: 56 additions & 23 deletions tensorboard/program.py
Original file line number Diff line number Diff line change
Expand Up @@ -45,13 +45,15 @@

import six
from six.moves import urllib
from six.moves import xrange # pylint: disable=redefined-builtin
from werkzeug import serving

from tensorboard import manager
from tensorboard import version
from tensorboard.backend import application
from tensorboard.backend.event_processing import event_file_inspector as efi
from tensorboard.plugins import base_plugin
from tensorboard.plugins.core import core_plugin
from tensorboard.util import tb_logging
from tensorboard.util import util

Expand Down Expand Up @@ -347,36 +349,67 @@ class WerkzeugServer(serving.ThreadedWSGIServer, TensorBoardServer):
def __init__(self, wsgi_app, flags):
self._flags = flags
host = flags.host

# base_port: what's the first port to which we should try to bind?
# should_scan: if that fails, shall we try additional ports?
(base_port, should_scan) = (
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A multi-valued, multi-line ternary is kind of a complicated beast to read. Maybe consider just doing

should_scan = flags.port is None
base_port = flags.port if flags.port is not None else core_plugin.DEFAULT_PORT

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed; that’s cleaner. Will patch.

(flags.port, False)
if flags.port is not None
else (core_plugin.DEFAULT_PORT, True)
)
if base_port > 0xFFFF:
raise TensorBoardServerException(
'TensorBoard cannot bind to port %d > %d' % (base_port, 0xFFFF)
)
max_attempts = 10 if should_scan else 1
base_port = min(base_port + max_attempts, 65536) - max_attempts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use either 0xFFFF or 65536 consistently?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure: 0xFFFF and 0x10000 is probably cleaner.


self._auto_wildcard = False
if not host:
# Without an explicit host, we default to serving on all interfaces,
# and will attempt to serve both IPv4 and IPv6 traffic through one socket.
host = self._get_wildcard_address(flags.port)
host = self._get_wildcard_address(base_port)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strictly speaking this should be done per-port right? Though I agree it would be exotic to have the different ports have different wildcard addresses.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That’s true. I can move this, I think.

self._auto_wildcard = True
try:
super(WerkzeugServer, self).__init__(host, flags.port, wsgi_app)
except socket.error as e:
if hasattr(errno, 'EACCES') and e.errno == errno.EACCES:
raise TensorBoardServerException(
'TensorBoard must be run as superuser to bind to port %d' %
flags.port)
elif hasattr(errno, 'EADDRINUSE') and e.errno == errno.EADDRINUSE:
if flags.port == 0:

for (attempt_index, port) in (
enumerate(xrange(base_port, base_port + max_attempts))):
try:
# Yes, this invokes the super initializer potentially many
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is kind of scary... invoking super() multiple times seems like asking for trouble if the base class changes so this isn't safe. Maybe add a TODO to revisit this in a way that doesn't require that? (i.e. use a nested server class)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it’s scary—not that there’s anything inherently
problematic with invoking __init__ multiple times (it’s a normal
method: an initializer, not a constructor), but because so uncustomary
that it’s probably reasonable for the base classes to make changes under
the assumption that __init__ is only called once.

I will add the TODO (and probably should have done that originally).
Using a nested server is what I was thinking of, too; the change should
be mostly straightforward, though not necessarily trivial.

FWIW, I actually did check every supertype in the mro, so I think it’s
safe with current library versions, though of course I may have missed
something.

# times. This seems to work fine, and looking at the superclass
# chain (type(self).__mro__) it doesn't seem that anything
# _should_ go wrong (nor does any superclass provide a facility
# to do this natively).
super(WerkzeugServer, self).__init__(host, port, wsgi_app)
break
except socket.error as e:
if hasattr(errno, 'EACCES') and e.errno == errno.EACCES:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason you're not using os.strerror()?
It's ok to add additional strings for specific context, but I've always seen the standard strerror() used as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not new in this PR; it just got indented by two extra
spaces. You can ask GitHub to omit whitespace-only diffs under the “Diff
settings” button at the top of the page:

image

To the question: I didn’t write the original code, but using strerror
would be fine with me (not that this it should be changed in this PR).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do you need to check both the attr and e.errno?

The examples I'm seeing just do

e.errno == errno.EACCES

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in question is not new in this PR, but I agree with the
conservative hasattr: not all platforms have all error codes, and thus
the values available on the errno module are platform-dependent. (In
practice, both Linux and Windows support the four codes checked here.)
I probably would have written this as

if e.errno == getattr(errno, "EADDRINUSE", None):

because it’s simpler, faster, and avoids hasattr, but it’s also
fine as written.

raise TensorBoardServerException(
'TensorBoard must be run as superuser to bind to port %d' %
port)
elif hasattr(errno, 'EADDRINUSE') and e.errno == errno.EADDRINUSE:
if attempt_index < max_attempts - 1:
continue
if port == 0:
raise TensorBoardServerException(
'TensorBoard unable to find any open port')
elif should_scan:
raise TensorBoardServerException(
'TensorBoard could not bind to any port around %s '
'(tried %d times)'
% (base_port, max_attempts))
else:
raise TensorBoardServerException(
'TensorBoard could not bind to port %d, it was already in use' %
port)
elif hasattr(errno, 'EADDRNOTAVAIL') and e.errno == errno.EADDRNOTAVAIL:
raise TensorBoardServerException(
'TensorBoard unable to find any open port')
else:
'TensorBoard could not bind to unavailable address %s' % host)
elif hasattr(errno, 'EAFNOSUPPORT') and e.errno == errno.EAFNOSUPPORT:
raise TensorBoardServerException(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unfortunate Python2 doesn't have exception chaining. It would be ideal to capture the true cause and not swallow it. Any ideas on how to do that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in question is not new in this PR, but: no, unfortunately.
Python exception objects don’t store the traceback at which they were
created or thrown. You can extract the traceback corresponding to the
current exception handler manually from sys.exc_info() and store it
on the exception, but the calling code would need to know how to extract
it and display it. I generally only do this when writing libraries that
operate generically over exceptions (e.g., LazyRace.get()).

In this particular case, I think that it’s probably fine. We’ve
completely characterized the contents of the original error already
(it is determined by its errno), so the only thing that we’re missing
is the traceback into the Werkzeug and Python standard library
internals—but the traceback past this point can be reasonably considered
an implementation detail of TensorBoard’s WerkzeugServer.

'TensorBoard could not bind to port %d, it was already in use' %
flags.port)
elif hasattr(errno, 'EADDRNOTAVAIL') and e.errno == errno.EADDRNOTAVAIL:
raise TensorBoardServerException(
'TensorBoard could not bind to unavailable address %s' % host)
elif hasattr(errno, 'EAFNOSUPPORT') and e.errno == errno.EAFNOSUPPORT:
raise TensorBoardServerException(
'Tensorboard could not bind to unsupported address family %s' %
host)
# Raise the raw exception if it wasn't identifiable as a user error.
raise
'Tensorboard could not bind to unsupported address family %s' %
host)
# Raise the raw exception if it wasn't identifiable as a user error.
raise

def _get_wildcard_address(self, port):
"""Returns a wildcard address for the port in question.
Expand Down