Skip to content

Conversation

@wchargin
Copy link
Contributor

@wchargin wchargin commented Dec 5, 2019

Summary:
Experiment IDs must be passed to the data provider, but the multiplexer
provider doesn’t actually need them, so code that forgets to pass the ID
will be silently broken on other providers. This commit updates the
multiplexer provider to explicitly fail when the ID is omitted.

This would have caught bugs in earlier drafts of both #2981 and #2991,
which were written by different people, so clearly the mistake is easy
to make. :-)

Test Plan:
Running with --generic_data true, the scalars, histograms, and
distributions dashboards all still work. The graphs dashboard has the
same error as prior to this change (see comments on #2991). Unit tests
have been updated to always pass experiment IDs.

wchargin-branch: muxprovider-safety-eid

Summary:
Experiment IDs must be passed to the data provider, but the multiplexer
provider doesn’t actually need them, so code that forgets to pass the ID
will be silently broken on other providers. This commit updates the
multiplexer provider to explicitly fail when the ID is omitted.

This would have caught bugs in earlier drafts of both #2981 and #2991,
which were written by different people, so clearly the mistake is easy
to make. :-)

Test Plan:
Running with `--generic_data true`, the scalars, histograms, and
distributions dashboards all still work. The graphs dashboard has the
same error as prior to this change (see comments on #2991). Unit tests
have been updated to always pass experiment IDs.

wchargin-branch: muxprovider-safety-eid
Copy link
Member

@davidsoergel davidsoergel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving for now because this just clarifies the status quo. However the need to do this at all suggests that the abstractions are off or leaky. We're imposing a requirement that doesn't actually exist, and then we pass fake data to meet the requirement.

Conversely, if we wanted to leave the API alone but tighten things up further, we could insist that logdirs are considered to have a specific stand-in experiment_id defined in a constant somewhere (in practice, perhaps the empty string). As it stands, even with this change, the same data is returned for different experiment_ids, which is unexpected.

@wchargin
Copy link
Contributor Author

wchargin commented Dec 5, 2019

However the need to do this at all suggests that the abstractions are
off or leaky. We're imposing a requirement that doesn't actually
exist, and then we pass fake data to meet the requirement.

Hmm. Perhaps? I’m not sure. I see the “need to do this” as simply
compensating for the lack of type safety in the language itself. For
instance, if this were in Java, and the interface were

interface DataProvider {
    String dataLocation(String experimentId);
    List<Run> listRuns(String experimentId);
    List<ScalarTimeSeries> listScalars(String experimentId, String pluginName, RunTagFilter rtf);
    // etc.
}

then we’d get the same level of safety without the extra validation in
the implementations. The multiplexer provider would then be just an
implementation that doesn’t touch the experiment ID at all, as it was
before this change.

The multiplexer provider is a compatibility bridge between old
semantics, which didn’t have a concept of experiments at all, and new
semantics, which do. It should be expected that the ontology is not
exactly the same.

@wchargin wchargin merged commit 5a5023f into master Dec 6, 2019
@wchargin wchargin deleted the wchargin-muxprovider-safety-eid branch December 6, 2019 00:45
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants