Skip to content

Conversation

@nfelt
Copy link
Contributor

@nfelt nfelt commented Mar 5, 2021

This changes the behavior of RunLoader when encountering a Summary.Value that doesn't actually contain any value, i.e. the value oneof field within the message is entirely unset.

Previously, RunLoader simply ignores these Values entirely. With this PR, we instead interpret them as though their value oneof had its tensor field populated with the so-called "null tensor" which we define to be the rank-1 length-0 float32 tensor. (The rationale for that particular tensor comes from #3386 as precedent.)

The immediate goal of this PR is to support hparams legacy summary data that didn't set any value, since all the actual information is persisted via the plugin content. The Python data_compat.py fix in #3386 implements a similar conversion for valueless summaries, but only does it when the summary metadata indicates the hparams plugin.

It seemed simpler and still reasonable to just to this conversion unconditionally (though I'm open to counterargument), especially in the Rust code where the dispatching logic is structured differently; conditioning on hparams here requires special-casing on the plugin in a new place where we currently don't have to, unlike the Python code (well, either that or introducing an Option indirection layer within EventValue to allow for valueless summaries, which seemed like it would overly complicate the common case to handle this edge case).

Test plan: confirmed that with this plus the diffbase #4734, the Hparams dashboard loads legacy data.

Part of #4422 tensor support.

@google-cla google-cla bot added the cla: yes label Mar 5, 2021
@nfelt nfelt changed the base branch from master to nfelt-rust-tensors-datacompat-basic March 5, 2021 06:43
@nfelt nfelt changed the title Nfelt rust tensors handle no value rust: interpret summaries with no value as null tensors Mar 5, 2021
@nfelt nfelt marked this pull request as draft March 5, 2021 06:45
@nfelt nfelt added the core:rustboard //tensorboard/data/server/... label Mar 5, 2021
@nfelt nfelt requested a review from wchargin March 5, 2021 07:17
@nfelt nfelt marked this pull request as ready for review March 5, 2021 07:33
Copy link
Contributor

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seemed simpler and still reasonable to just to this conversion unconditionally (though I'm open to counterargument),

Hmm, okay. I’m weakly −1 on the principle of this, because there isn’t
really a natural choice for the value of the null tensor and so I would
like to keep the hairiness scoped to the case in which it’s actually
needed. I had thought that it would suffice to read the summary metadata
from into_tensor, but I see the problem: there’s no EventValue to
construct. So that changes the situation.

In that case, agreed that your solution seems pretty reasonable. Adding
a plugin name comparison in the hottest path—every summary event—seems
questionable, and the hairiness only extends to invalid data, anyway.

Approved modulo data type (see inline).

})),
..Default::default()
})?;
// Write an empty summary with hparams metadata but no data_class.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice; this seems like a good thing to test.

// float vector is of minimal serialized length (6 bytes) among valid tensors.
fn null_tensor_proto() -> pb::TensorProto {
pb::TensorProto {
dtype: pb::DataType::DtString.into(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Want DtFloat, not DtString, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh. Done.

@nfelt
Copy link
Contributor Author

nfelt commented Mar 5, 2021

It seemed simpler and still reasonable to just to this conversion unconditionally (though I'm open to counterargument),

Hmm, okay. I’m weakly −1 on the principle of this, because there isn’t
really a natural choice for the value of the null tensor and so I would
like to keep the hairiness scoped to the case in which it’s actually
needed. I had thought that it would suffice to read the summary metadata
from into_tensor, but I see the problem: there’s no EventValue to
construct. So that changes the situation.

Thinking more about this, I guess I could just add EventValue::Empty (versus my original thought of putting an Option inside EventValue::Summary which felt like it interfered more with the normal case). It's still a little annoying though since we end up distributing the handling for this special case across multiple different places: RunLoader in a generic way, and then we need to special-case hparams both in Empty::initial_metadata() as well as EventValue::into_tensor() (assuming we convert into_scalar() to have a _ => Err(DataLoss) branch in its match self like into_blob_sequence() does, which would also cover the new Empty case).

(As an aside, this does make me wish there was some way attach data classes to EventValue in a way that lets us constrain the initial_metadata() and into_*() methods so that we can actually eliminate the codepaths can't be logically reached, like calling anything but into_tensor() on an EventValue::Empty. But I'm guessing that would complicate our type situation even more, so not entirely clear if it's worth it.)

Anyway, if you'd prefer that route to limit the hairiness I can do that instead - LMK?

@wchargin
Copy link
Contributor

wchargin commented Mar 5, 2021

It's still a little annoying though since we end up distributing the handling for this special case across multiple different places

Right. I think that what you’ve described is internally consistent and
probably a bit more principled, but clearly more complex than what you
have already. I’m not requesting it.

As an aside, this does make me wish there was some way attach data classes to EventValue […] But I'm guessing that would complicate our type situation even more

This is a typical case for GADTs. They are indeed quite complicated
compared to your standard ML-family type system. (For instance, type
checking is still possible, but full inference is undecidable.)

In this particular case, I don’t think it even suffices, because you can
hit those cases: a time series with a simple_value at step 0 and an
empty value at step 1 will call EventValue::Empty.into_scalar(), no?

Anyway, if you'd prefer that route to limit the hairiness I can do that instead - LMK?

I’m happy to keep it simple.

Base automatically changed from nfelt-rust-tensors-datacompat-basic to master March 5, 2021 22:22
Copy link
Contributor Author

@nfelt nfelt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this particular case, I don’t think it even suffices, because you can
hit those cases: a time series with a simple_value at step 0 and an
empty value at step 1 will call EventValue::Empty.into_scalar(), no?

I guess not. I think maybe what I meant was more just a way to refactor out some of the common EventValue -> DataClass logic that has to exist in initial_metadata() so that it could be used to auto-reject later events that aren't compatible with the dataclass that's been established for that time series, but I guess it's not obvious that can be cleanly pulled out without wasting intermediate results or otherwise doing unnecessary effort.

Anyway, will leave this as-is.

// float vector is of minimal serialized length (6 bytes) among valid tensors.
fn null_tensor_proto() -> pb::TensorProto {
pb::TensorProto {
dtype: pb::DataType::DtString.into(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

D'oh. Done.

@nfelt nfelt merged commit a59140d into master Mar 6, 2021
@nfelt nfelt deleted the nfelt-rust-tensors-handle-no-value branch March 6, 2021 00:01
@wchargin wchargin mentioned this pull request Mar 6, 2021
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes core:rustboard //tensorboard/data/server/...

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants