Skip to content

Conversation

@wchargin
Copy link
Contributor

Summary:
The MultiplexerDataProvider now respects its downsample parameter,
even though the backing PluginEventMultiplexer already performs its
own sampling. This serves two purposes:

  • It enforces that clients are always specifying the downsample
    argument, which is required.
  • It enables us to test plugins’ downsampling parameters to verify
    that they will behave correctly with other data providers.

Test Plan:
Unit tests included. Note that changing the _DEFAULT_DOWNSAMPLING
constant in (e.g.) the scalars plugin to a small number (like 5) now
actually causes charts in the frontend to be downsampled.

wchargin-branch: data-mux-downsample

Summary:
We add a `sampling_hints` attribute to the `TBContext` magic container,
which is populated with the parsed form of the `--samples_per_plugin`
flag. Existing plugins’ generic data modes are updated to read from this
map instead of using hard-coded thresholds.

Test Plan:
This change is not actually observable as is, because the multiplexer
data provider ignores its downsampling argument. But after patching in a
change to make the data provider respect the downsampling argument, this
change has the effect that increasing the `--samples_per_plugin` over
the default (e.g., `images=20`) now properly increases the number of
samples shown in generic data mode, whereas previously it had no effect.

wchargin-branch: data-downsampling-flag
wchargin-source: 50998be15abd790a0915458bac76091c79823f0f
Summary:
The `MultiplexerDataProvider` now respects its `downsample` parameter,
even though the backing `PluginEventMultiplexer` already performs its
own sampling. This serves two purposes:

  - It enforces that clients are always specifying the `downsample`
    argument, which is required.
  - It enables us to test plugins’ downsampling parameters to verify
    that they will behave correctly with other data providers.

Test Plan:
Unit tests included. Note that changing the `_DEFAULT_DOWNSAMPLING`
constant in (e.g.) the scalars plugin to a small number (like `5`) now
actually causes charts in the frontend to be downsampled.

wchargin-branch: data-mux-downsample
wchargin-source: 116ce4c206613e25e09fec31102b90ad80282496
@wchargin wchargin requested review from nfelt and stephanwlee and removed request for nfelt February 20, 2020 21:32
@wchargin
Copy link
Contributor Author

(Reassigning just to spread load.)

Copy link
Contributor

@stephanwlee stephanwlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It enforces that clients are always specifying the downsample
argument, which is required.

Is this saying all data_provider, in the future, will require downsample argument? Or is it only true for MultiplexerDataProvider?


def test_inorder(self):
xs = list(range(10000))
actual = data_provider._downsample(xs, k=100)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No AI. Simply noting: there is non-zero chance where this passes even if it behaved more like random.sample. :\

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but that chance is 1 / (100!), which is somewhat small.

@wchargin wchargin changed the base branch from wchargin-data-downsampling-flag to master February 21, 2020 07:04
wchargin-branch: data-mux-downsample
wchargin-source: c6edc8bfebc0d4c040c346e8ca9729e399512f25
Copy link
Contributor Author

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the quick reviews!

Is this saying all data_provider, in the future, will require downsample argument? Or is it only true for MultiplexerDataProvider?

It’s required by the data provider interface (e.g., read_scalars),
so any particular data provider implementation is permitted to not
require it, but code that is written against arbitrary data providers
must provide it.


def test_inorder(self):
xs = list(range(10000))
actual = data_provider._downsample(xs, k=100)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, but that chance is 1 / (100!), which is somewhat small.

@wchargin wchargin merged commit 4670696 into master Feb 21, 2020
@wchargin wchargin deleted the wchargin-data-mux-downsample branch February 21, 2020 07:46
wchargin added a commit that referenced this pull request Mar 15, 2021
Summary:
As the comment says, the histograms plugin used to manually downsample
because the multiplexer data provider didn’t. As of #3272, it does, so
this code is vestigial.

Test Plan:
Note that changing the default histogram sampling constants (in the same
file) to very small numbers causes the frontend to show less data. Unit
tests for the histograms plugin also cover this: removing downsampling
from the multiplexer data provider causes the histogram tests to fail.

wchargin-branch: histogram-rm-downsample
wchargin-source: 33928fd1feb314d955ed06f25df5067c39d7f779
wchargin added a commit that referenced this pull request Mar 15, 2021
Summary:
As the comment says, the histograms plugin used to manually downsample
because the multiplexer data provider didn’t. As of #3272, it does, so
this code is vestigial.

Test Plan:
Note that changing the default histogram sampling constants (in the same
file) to very small numbers causes the frontend to show less data. Unit
tests for the histograms plugin also cover this: removing downsampling
from the multiplexer data provider causes the histogram tests to fail.

wchargin-branch: histogram-rm-downsample
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants