rust: interpret summaries with no value as null tensors #4735
wchargin
left a comment
It seemed simpler and still reasonable to just do this conversion unconditionally (though I'm open to counterargument),
Hmm, okay. I’m weakly −1 on the principle of this, because there isn’t
really a natural choice for the value of the null tensor and so I would
like to keep the hairiness scoped to the case in which it’s actually
needed. I had thought that it would suffice to read the summary metadata
from into_tensor, but I see the problem: there’s no EventValue to
construct. So that changes the situation.
In that case, agreed that your solution seems pretty reasonable. Adding
a plugin name comparison in the hottest path—every summary event—seems
questionable, and the hairiness only extends to invalid data, anyway.
Approved modulo data type (see inline).
})),
..Default::default()
})?;
// Write an empty summary with hparams metadata but no data_class.
Nice; this seems like a good thing to test.
tensorboard/data/server/run.rs
Outdated
// float vector is of minimal serialized length (6 bytes) among valid tensors.
fn null_tensor_proto() -> pb::TensorProto {
pb::TensorProto {
dtype: pb::DataType::DtString.into(),
Want DtFloat, not DtString, right?
D'oh. Done.
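For reference, a minimal sketch of the corrected null tensor and of the 6-byte claim in the code comment. The `DataType` and `TensorProto` types here are simplified stand-ins, not the generated `pb` bindings, and `encode` is a hypothetical hand-rolled encoder that only covers the dtype, shape, and packed float fields. It assumes the field numbers from TensorFlow's `tensor.proto` and `tensor_shape.proto`.

```rust
// Stand-in for `pb::DataType`; discriminants follow TensorFlow's types.proto.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataType {
    DtFloat = 1,
    DtString = 7, // the value the outdated diff used by mistake
}

// Simplified stand-in for `pb::TensorProto` (assumption: only these fields).
#[derive(Debug, Clone, PartialEq)]
struct TensorProto {
    dtype: DataType,
    shape: Vec<i64>, // one entry per dimension; `vec![0]` is rank 1, length 0
    float_val: Vec<f32>,
}

// The "null tensor": a rank-1, length-0 float32 tensor.
fn null_tensor_proto() -> TensorProto {
    TensorProto {
        dtype: DataType::DtFloat,
        shape: vec![0],
        float_val: Vec::new(),
    }
}

// Hypothetical proto3 wire encoder; handles only small sizes and lengths.
fn encode(t: &TensorProto) -> Vec<u8> {
    // TensorShapeProto: one `dim` entry (field 2, length-delimited) per dimension.
    let mut shape: Vec<u8> = Vec::new();
    for &size in &t.shape {
        assert!((0..128).contains(&size), "sketch only handles sizes < 128");
        // Dim.size is field 1 (varint); proto3 omits it when it is zero.
        let dim: Vec<u8> = if size == 0 { vec![] } else { vec![0x08, size as u8] };
        shape.push(0x12);
        shape.push(dim.len() as u8);
        shape.extend(dim);
    }
    assert!(shape.len() < 128, "sketch only handles short shapes");

    let mut out = vec![0x08, t.dtype as u8]; // field 1: dtype (varint)
    out.push(0x12); // field 2: tensor_shape (length-delimited)
    out.push(shape.len() as u8);
    out.extend(shape);

    if !t.float_val.is_empty() {
        let n = 4 * t.float_val.len();
        assert!(n < 128, "sketch only handles short value lists");
        out.push(0x2A); // field 5: float_val (packed)
        out.push(n as u8);
        for v in &t.float_val {
            out.extend_from_slice(&v.to_le_bytes());
        }
    }
    out
}
```

Under these assumptions, `encode(&null_tensor_proto())` yields the bytes `08 01 12 02 12 00`: two for the dtype and four for a shape holding one zero-sized dimension.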
Thinking more about this, I guess I could just add (As an aside, this does make me wish there was some way to attach data classes to Anyway, if you'd prefer that route to limit the hairiness I can do that instead - LMK?
Right. I think that what you’ve described is internally consistent and
This is a typical case for GADTs. They are indeed quite complicated In this particular case, I don’t think it even suffices, because you can
I’m happy to keep it simple.
nfelt
left a comment
In this particular case, I don’t think it even suffices, because you can
hit those cases: a time series with a simple_value at step 0 and an
empty value at step 1 will call EventValue::Empty.into_scalar(), no?
I guess not. I think maybe what I meant was more just a way to refactor out some of the common EventValue -> DataClass logic that has to exist in initial_metadata() so that it could be used to auto-reject later events that aren't compatible with the dataclass that's been established for that time series, but I guess it's not obvious that can be cleanly pulled out without wasting intermediate results or otherwise doing unnecessary effort.
Anyway, will leave this as-is.
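To make the idea discussed above concrete, here is a rough sketch of a time series that fixes its data class from the first event and rejects later events that don't match. Everything in it is hypothetical: `DataClass`, `EventValue`, and `TimeSeries` are simplified stand-ins that borrow names from the discussion, and they are not the types in `run.rs`.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum DataClass {
    Scalar,
    Tensor,
}

// Hypothetical, simplified event payload.
#[derive(Debug, Clone, PartialEq)]
enum EventValue {
    Scalar(f32),
    Tensor(Vec<f32>),
}

impl EventValue {
    // The one place that maps a value to its data class.
    fn data_class(&self) -> DataClass {
        match self {
            EventValue::Scalar(_) => DataClass::Scalar,
            EventValue::Tensor(_) => DataClass::Tensor,
        }
    }
}

struct TimeSeries {
    data_class: DataClass,
    points: Vec<(i64, EventValue)>,
}

impl TimeSeries {
    // The first event establishes the data class for the whole series.
    fn new(step: i64, first: EventValue) -> Self {
        TimeSeries {
            data_class: first.data_class(),
            points: vec![(step, first)],
        }
    }

    // Returns false, dropping the event, if it does not match the
    // established data class.
    fn push(&mut self, step: i64, value: EventValue) -> bool {
        if value.data_class() != self.data_class {
            return false;
        }
        self.points.push((step, value));
        true
    }
}
```

In this sketch, a series started by a scalar at step 0 would drop an empty (null tensor) value at step 1 rather than attempting a scalar conversion on it.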
This changes the behavior of RunLoader when encountering a Summary.Value that doesn't actually contain any value, i.e. the value oneof field within the message is entirely unset.
Previously, RunLoader simply ignored these Values entirely. With this PR, we instead interpret them as though their value oneof had its tensor field populated with the so-called "null tensor", which we define to be the rank-1 length-0 float32 tensor. (The rationale for that particular tensor comes from #3386 as precedent.)
The immediate goal of this PR is to support hparams legacy summary data that didn't set any value, since all the actual information is persisted via the plugin content. The Python data_compat.py fix in #3386 implements a similar conversion for valueless summaries, but only does it when the summary metadata indicates the hparams plugin.
It seemed simpler and still reasonable to just do this conversion unconditionally (though I'm open to counterargument), especially in the Rust code where the dispatching logic is structured differently; conditioning on hparams here requires special-casing on the plugin in a new place where we currently don't have to, unlike the Python code (well, either that or introducing an Option indirection layer within EventValue to allow for valueless summaries, which seemed like it would overly complicate the common case to handle this edge case).
Test plan: confirmed that with this plus the diffbase #4734, the Hparams dashboard loads legacy data.
Part of #4422 tensor support.
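As an illustration of the unconditional conversion described above, here is a minimal sketch using hypothetical stand-in types. `Value`, `Tensor`, `SummaryValue`, and `normalize` are simplified inventions for this example and are not the actual protobuf bindings or RunLoader code.

```rust
// Simplified stand-in for a tensor payload (assumption: float data only).
#[derive(Debug, Clone, PartialEq)]
struct Tensor {
    shape: Vec<usize>,
    floats: Vec<f32>,
}

// Stand-in for the `value` oneof of `Summary.Value`.
#[derive(Debug, Clone, PartialEq)]
enum Value {
    SimpleValue(f32),
    Tensor(Tensor),
}

// Stand-in for `Summary.Value`; `value` is `None` when the oneof is unset.
struct SummaryValue {
    tag: String,
    value: Option<Value>,
}

// The "null tensor": rank 1, length 0, float32.
fn null_tensor() -> Tensor {
    Tensor {
        shape: vec![0],
        floats: Vec::new(),
    }
}

// Treats a valueless summary as if it carried the null tensor, regardless
// of which plugin the summary metadata names.
fn normalize(v: SummaryValue) -> (String, Value) {
    let value = v.value.unwrap_or_else(|| Value::Tensor(null_tensor()));
    (v.tag, value)
}
```

Because the substitution happens before any dispatch on the value, no plugin-name check is needed on the per-event path in this sketch.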