-
Notifications
You must be signed in to change notification settings - Fork 1.7k
uploader: only send data for allowed plugins #3300
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Summary: This enables us to flag features on the server side, so that we can stage releases, and so that we can roll back in a pinch without requiring an expedited PyPI push and end user version upgrade. Test Plan: No new behavior in this change. The protos still build. wchargin-branch: uploader-plugincontrol-protos wchargin-source: cc6c5c6e096bbba2c24cda7ba5b2b97cb142e8fd
Summary: When the server sends a `PluginControl` message in the initial handshake, we now only send data for plugins specified therein. Test Plan: Unit tests included. DO NOT SUBMIT until end-to-end tested on both old and new servers. wchargin-branch: uploader-plugincontrol wchargin-source: 8d294c291041b688c5cc42cd19d3f6e6fef8fc78
wchargin-branch: uploader-plugincontrol wchargin-source: 046c05ee3cf246ce38f9c4674e53077eecb3f8f9
|
Server-side change (Google-internal): http://cl/297460920 |
wchargin-branch: uploader-plugincontrol-protos wchargin-source: 4f801e62f79e9be1f54d473337da1ca092239184
wchargin-branch: uploader-plugincontrol wchargin-source: e32537bff8875fb6609c6fc2977d8d637fe2f12d
wchargin-branch: uploader-plugincontrol-protos wchargin-source: 929b65552dfeb9db6e7c5767a8ce1f253b3c3852
wchargin-branch: uploader-plugincontrol-protos wchargin-source: 929b65552dfeb9db6e7c5767a8ce1f253b3c3852
wchargin-branch: uploader-plugincontrol wchargin-source: 75c92834307a679ced4afad62aadfdbc127f8df7
wchargin-branch: uploader-plugincontrol wchargin-source: 73b86407637afc795a8753ffc874c485f84134c1
Summary: Tests create uploaders and request senders with dependency-injected stubs. As we add more such stubs, the test code bloats and requires broad changes. By centralizing helpers, we can inject defaults that are selectively overridden. Test Plan: Existing tests suffice. wchargin-branch: uploader-test-di-helpers wchargin-source: b5da05e5125eea1248b67841b80a3094b0d9d5a4
wchargin-branch: uploader-plugincontrol wchargin-source: b76b10849e0372570992b700b00842ad5d2bf502 # Conflicts: # tensorboard/uploader/uploader_test.py
wchargin-branch: uploader-plugincontrol wchargin-source: b76b10849e0372570992b700b00842ad5d2bf502
| # If later events arrive with a mismatching plugin_name, they are | ||
| # ignored with a warning. | ||
| metadata = self._tag_metadata.get(time_series_key) | ||
| first_in_time_series = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism of detecting 'first time' is a little entangled hard to follow. How about maintaining a collection of plugin_names_seen and testing if plugin_name in plugin_names_seen:?
Related, the 'first time' gets reset for every _upload_once(). Is that desirable, or should we maintain a singleton plugin_names_seen (or similar) that persists over cycles? In this case you'd change the error message too, of course, so it's not time-series-specific but just appears once per plugin.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism of detecting 'first time' is a little entangled hard to
follow. How about maintaining a collection ofplugin_names_seenand
testingif plugin_name in plugin_names_seen:?
We warn once per time series, not once per plugin. So rather than a
collection of “plugin names seen”, we maintain a collection of “time
series seen”—but this is just the key set of self._tag_metadata.
Related, the 'first time' gets reset for every
_upload_once().
Hmm; I’m not able to reproduce that? We create a _BatchedRequestSender
only once, at create_experiment time, and the “already seen?” source
of truth is on its self._tag_metadata instance attribute, which is
never cleared.
It may help to look at the log output to address both questions. The
output looks like:
directory_loader.py:131] Loading data from path /HOMEDIR/tensorboard_data/mnist/lr_1E-03,conv=1,fc=2/events.out.tfevents.1563406327.HOSTNAME
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/0') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/1') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/2') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/activations') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/activations') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/relu') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/activations') with unsupported plugin name 'histograms'
uploader.py:548] Trying request of 8154 bytes
uploader.py:555] Upload for 1 runs (8154 bytes) took 0.254 seconds
uploader.py:548] Trying request of 8166 bytes
uploader.py:555] Upload for 1 runs (8166 bytes) took 0.311 seconds
directory_loader.py:131] Loading data from path /HOMEDIR/tensorboard_data/mnist/lr_1E-03,conv=2,fc=2/events.out.tfevents.1563406405.HOSTNAME
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/0') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/1') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/2') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'conv1/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'conv1/biases') with unsupported plugin name 'histograms'
Note that each time series (run-tag combination) appears only once, even
though we process the some time series in multiple upload cycles. For
instance, the second upload cycle here doesn’t print any “Skipping…”
messages, and the messages in the third upload cycle are for new time
series (conv=1 vs. conv=2).
wchargin-branch: uploader-plugincontrol wchargin-source: e23a7157fccccc5badb0531cdae3fc41e5e8c993 # Conflicts: # tensorboard/uploader/uploader_test.py
wchargin-branch: uploader-plugincontrol wchargin-source: e23a7157fccccc5badb0531cdae3fc41e5e8c993
wchargin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that my comment below addresses your comment without any need
for code change, so I’ll plan to merge this once CI passes. If I’ve
misunderstood, feel free to correct me either before or after merging
and I will fix it to your satisfaction.
| # If later events arrive with a mismatching plugin_name, they are | ||
| # ignored with a warning. | ||
| metadata = self._tag_metadata.get(time_series_key) | ||
| first_in_time_series = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mechanism of detecting 'first time' is a little entangled hard to
follow. How about maintaining a collection ofplugin_names_seenand
testingif plugin_name in plugin_names_seen:?
We warn once per time series, not once per plugin. So rather than a
collection of “plugin names seen”, we maintain a collection of “time
series seen”—but this is just the key set of self._tag_metadata.
Related, the 'first time' gets reset for every
_upload_once().
Hmm; I’m not able to reproduce that? We create a _BatchedRequestSender
only once, at create_experiment time, and the “already seen?” source
of truth is on its self._tag_metadata instance attribute, which is
never cleared.
It may help to look at the log output to address both questions. The
output looks like:
directory_loader.py:131] Loading data from path /HOMEDIR/tensorboard_data/mnist/lr_1E-03,conv=1,fc=2/events.out.tfevents.1563406327.HOSTNAME
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/0') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/1') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'input/image/2') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'conv/activations') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/activations') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc1/relu') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/biases') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=1,fc=2', 'fc2/activations') with unsupported plugin name 'histograms'
uploader.py:548] Trying request of 8154 bytes
uploader.py:555] Upload for 1 runs (8154 bytes) took 0.254 seconds
uploader.py:548] Trying request of 8166 bytes
uploader.py:555] Upload for 1 runs (8166 bytes) took 0.311 seconds
directory_loader.py:131] Loading data from path /HOMEDIR/tensorboard_data/mnist/lr_1E-03,conv=2,fc=2/events.out.tfevents.1563406405.HOSTNAME
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/0') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/1') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'input/image/2') with unsupported plugin name 'images'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'conv1/weights') with unsupported plugin name 'histograms'
uploader.py:345] Skipping time series ('lr_1E-03,conv=2,fc=2', 'conv1/biases') with unsupported plugin name 'histograms'
Note that each time series (run-tag combination) appears only once, even
though we process the some time series in multiple upload cycles. For
instance, the second upload cycle here doesn’t print any “Skipping…”
messages, and the messages in the third upload cycle are for new time
series (conv=1 vs. conv=2).
Summary:
When the server sends a `PluginControl` message in the initial
handshake, we now only send data for plugins specified therein.
Test Plan:
Unit tests included. As end-to-end tests:
- When running against prod (which does not send `plugin_control`),
experiment upload is unchanged. Info logs now include “Skipping time
series […] with unsupported plugin name 'histograms'”, just once per
time series.
- When running against a local server that serves scalars-only
explicitly—
```
plugin_control {
allowed_plugins: "scalars"
}
```
—the behavior is also the same.
- When running against a local server that serves “no plugins allowed”
explicitly—
```
plugin_control {
}
```
—nothing is uploaded, and the logs include “Skipping…” messages for
scalar time series.
wchargin-branch: uploader-plugincontrol
Summary:
When the server sends a `PluginControl` message in the initial
handshake, we now only send data for plugins specified therein.
Test Plan:
Unit tests included. As end-to-end tests:
- When running against prod (which does not send `plugin_control`),
experiment upload is unchanged. Info logs now include “Skipping time
series […] with unsupported plugin name 'histograms'”, just once per
time series.
- When running against a local server that serves scalars-only
explicitly—
```
plugin_control {
allowed_plugins: "scalars"
}
```
—the behavior is also the same.
- When running against a local server that serves “no plugins allowed”
explicitly—
```
plugin_control {
}
```
—nothing is uploaded, and the logs include “Skipping…” messages for
scalar time series.
wchargin-branch: uploader-plugincontrol
Summary:
When the server sends a
PluginControlmessage in the initialhandshake, we now only send data for plugins specified therein.
Test Plan:
Unit tests included. As end-to-end tests:
When running against prod (which does not send
plugin_control),experiment upload is unchanged. Info logs now include “Skipping time
series […] with unsupported plugin name 'histograms'”, just once per
time series.
When running against a local server that serves scalars-only
explicitly—
—the behavior is also the same.
When running against a local server that serves “no plugins allowed”
explicitly—
—nothing is uploaded, and the logs include “Skipping…” messages for
scalar time series.
wchargin-branch: uploader-plugincontrol