Skip to content

Conversation

@nfelt
Copy link
Contributor

@nfelt nfelt commented Mar 5, 2021

This wires up data_compat.rs handling for tensors, as part of the tensor support subtask of #4422, which includes:

  • adding initial_metadata() special-cases for tensor-dataclass plugins
  • adding into_tensor() for enriching tensor values
  • calling into_tensor() from within RunLoader
  • tests for the above

I also adjusted some naming of other tests within data_compat.rs to be (IMO) slightly more precise and clear now that we have the new tensor test methods.

Note that we don't yet handle A) events with no summary value that should nonetheless be treated as having implicit Tensor values (the use case being hparams metadata persistence), and B) legacy histogram data stored in HistogramProto. I've left those for later PRs.

Test plan: I confirmed that with this change and the preceding RPC PR, we can now render the "normal" tensor dashboards. In particular, what now works is: text, histograms/distributions for data written with the tensor-based histogram summary APIs, PR curves, and NPMI (the visualization seems somewhat broken, at least with the demo data, but it's equivalently broken with and without --load_fast now). Hparams doesn't work yet per the note above. The mesh plugin hasn't been DataProviderized; I included it for completeness anyway but it's untested.

@nfelt nfelt added the core:rustboard //tensorboard/data/server/... label Mar 5, 2021
@nfelt nfelt requested a review from wchargin March 5, 2021 06:39
@google-cla google-cla bot added the cla: yes label Mar 5, 2021
@nfelt nfelt marked this pull request as ready for review March 5, 2021 07:14
@wchargin wchargin mentioned this pull request Mar 5, 2021
34 tasks
Copy link
Contributor

@wchargin wchargin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice; thanks! Only request is to remove nPMI special-casing if we can;
if that doesn’t work for some reason, happy to keep it. (I do want to
support it; I just think that we might get it for free.)

pub const TEXT: &str = "text";
pub const PR_CURVES: &str = "pr_curves";
pub const HPARAMS: &str = "hparams";
pub const MESH: &str = "mesh";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, interesting. I hadn’t planned to accommodate mesh because it
wasn’t obvious to me that the data fit well into the existing storage
classes. It’s not clearly Tensor because the data may be arbitrarily
large, and plausibly over our recommended limits. It’s not clearly
BlobSequence because it’s the wrong shape—though we could faithfully
reinterpret-cast without too much trouble, since all dimensions but one
are constant.

But I’m happy to just make it Tensor as you do here. I’m not sure how
widely used it is in conjunction with environments that would actually
care about the size limit (e.g., TensorBoard.dev uploads), so it may
just be a non-issue. As you note, the plugin backend still uses the
multiplexer, but we can fix that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess tensor seemed like the most obvious mapping, but it's true I hadn't really considered this deeply. Since this doesn't work anyway until the plugin is converted, I'll just omit it for now and we can cross that bridge when we get there.

FWIW though we may want to reassess the exact semantics we've assigned to the data classes, since text also doesn't really meet the bounded size criterion. Possibly what we really want is something more like "scalar" stored in f32, "bounded-size tensor with fixed-size elements" stored an inline TensorProto, "bounded-size tensor with unbounded-size elements" stored the way blob sequences are stored and where each element can be individually retrieved by index via its own RPC, and then some new thing for "unbounded-size tensor" where we only support essentially a streaming read of the entire tensor but it's chunked into roughly-constant-RPC-response-size batches to avoid excessive overhead when the individual elements are small but numerous?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah—in the past, I think, we’d agreed that it was “reasonable” to
assume that text would be small, even though it’s not bounded. The
unbounded-size tensor is also approximately what I was thinking, as
“basically sugar for a length-1 blob sequence, but we handle the
serialization for you”. From that lens, I don’t know that we actually
need a separate data type—emit_tensor(..., as_blob=True) or something
might suffice.

Happy to leave as is for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, I guess it basically is just sugar for a length-1 blob sequence, so it could be persisted as such. But I think we'd probably need a new data type at the DataProvider level if we wanted to both handle the Tensor serialization and also support a streaming read, right?

pub const PR_CURVES: &str = "pr_curves";
pub const HPARAMS: &str = "hparams";
pub const MESH: &str = "mesh";
pub const NPMI: &str = "npmi";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think nPMI data has always had DATA_CLASS_TENSOR at rest. Can
we get away with dropping the special case here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough, dropped.

Comment on lines 658 to 666
let md = pb::SummaryMetadata {
plugin_data: Some(PluginData {
plugin_name: "rando".to_string(),
content: Bytes::from_static(b"preserved!"),
..Default::default()
}),
data_class: pb::DataClass::Tensor.into(),
..Default::default()
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: We do have blank_with_plugin_content (per your suggestion), in
case you’d like to use that? Also suffices for the next test case, since
you can pass DataClass::Unknown.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(per your suggestion)

Done, lol.

/// This supports summaries with `tensor` populated.
//
// TODO(#4422): support conversion of other summary types to tensors.
pub fn into_tensor(self, _metadata: &pb::SummaryMetadata) -> Result<pb::TensorProto, DataLoss> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious what the thought process is behind taking _metadata here?
I omitted it from into_scalar because it’s not actually needed, but
that does make the signatures different and makes the RunLoader
integration ever so slightly more awkward. I don’t feel strongly either
way, just wondering about the rationale.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The thought process was "I mistakenly figured I was going to need it later (to handle empty hparams conversions) and then realized I didn't but wasn't sure it was worth taking it back out."

I did notice it's omitted from into_scalar(). I'm fine doing the same thing here if you think it's a bit cleaner to avoid unused parameters when they aren't required by an interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mistakenly figured I was going to need it later (to handle empty hparams conversions)

Me, too. :-)

No preference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack. Will leave as-is in that case.


#[test]
fn test_metadata_tensor_without_dataclass() {
for plugin_name in vec![
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clippy says:

warning: useless use of `vec!`
   --> data_compat.rs:678:32
    |
678 |               for plugin_name in vec![
    |  ________________________________^
679 | |                 plugin_names::HISTOGRAMS,
680 | |                 plugin_names::TEXT,
681 | |                 plugin_names::PR_CURVES,
...   |
684 | |                 plugin_names::NPMI,
685 | |             ] {
    | |_____________^
    |
    = note: `#[warn(clippy::useless_vec)]` on by default
    = help: for further information visit https://rust-lang.github.io/rust-clippy/master/index.html#useless_vec
help: you can use a slice directly
    |
678 |             for plugin_name in &[plugin_names::HISTOGRAMS,
679 |                 plugin_names::TEXT,
680 |                 plugin_names::PR_CURVES,
681 |                 plugin_names::HPARAMS,
682 |                 plugin_names::MESH,
683 |                 plugin_names::NPMI] {
    |

(In case you’re wondering, the reason that these checks don’t show up
inline is that our Cargo package is not at repository root.)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, thanks.

@nfelt nfelt requested a review from wchargin March 5, 2021 19:59
/// This supports summaries with `tensor` populated.
//
// TODO(#4422): support conversion of other summary types to tensors.
pub fn into_tensor(self, _metadata: &pb::SummaryMetadata) -> Result<pb::TensorProto, DataLoss> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mistakenly figured I was going to need it later (to handle empty hparams conversions)

Me, too. :-)

No preference.

pub const TEXT: &str = "text";
pub const PR_CURVES: &str = "pr_curves";
pub const HPARAMS: &str = "hparams";
pub const MESH: &str = "mesh";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah—in the past, I think, we’d agreed that it was “reasonable” to
assume that text would be small, even though it’s not bounded. The
unbounded-size tensor is also approximately what I was thinking, as
“basically sugar for a length-1 blob sequence, but we handle the
serialization for you”. From that lens, I don’t know that we actually
need a separate data type—emit_tensor(..., as_blob=True) or something
might suffice.

Happy to leave as is for now.

@nfelt nfelt merged commit 42c4d4b into master Mar 5, 2021
@nfelt nfelt deleted the nfelt-rust-tensors-datacompat-basic branch March 5, 2021 22:22
wchargin added a commit that referenced this pull request Mar 15, 2021
Summary:
This patch replaces the multiplexer code in the mesh plugin with data
provider code. It is not gated behind a flag.

We treat all mesh data as tensors, since that’s most expedient at this
time and the local data providers (multiplexer and RustBoard) don’t
actually have any size constraints. See also discussion in #4734.

Test Plan:
The dashboard still works with the standard demo data, both with and
without `--load_fast`.

wchargin-branch: mesh-generic
wchargin-source: 2a995b31efea5c58dee9d05fc7e66bbdb833a32e
wchargin added a commit that referenced this pull request Mar 16, 2021
Summary:
This patch replaces the multiplexer code in the mesh plugin with data
provider code. It is not gated behind a flag.

We treat all mesh data as tensors, since that’s most expedient at this
time and the local data providers (multiplexer and RustBoard) don’t
actually have any size constraints. See also discussion in #4734.

Test Plan:
The dashboard still works with the standard demo data, both with and
without `--load_fast=true`.

wchargin-branch: mesh-generic
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes core:rustboard //tensorboard/data/server/...

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants