Skip to content

Conversation

@wchargin
Copy link
Contributor

Summary:
We implement ListBlobSequences, ReadBlobSequences, and ReadBlob
for the Rust data server, and the corresponding data provider methods
for the Python data provider. Run graphs now work end to end, and
getting more blobs to work is a matter of implementing data compat
transformations on the Rust side.

Test Plan:
Unit tests included on both ends (solid ones on the Rust side, mocking
ones on the Python side). Running TensorBoard with --load_fast now
shows a working graphs plugin.

wchargin-branch: rust-blob-sequence-rpcs

Summary:
The `Commit` structure now has a `blob_sequences` store in addition to
its `scalars` store. For now, we only actually load run graphs into
memory (since it’s trivial, and everything else is merely easy).

Test Plan:
Unit tests included. There are no user-facing changes to the RPC layer.
Loading an MNIST dataset with scalars, graphs, and images no longer
prints warnings about `graph_def`s, and still serves scalars properly.

wchargin-branch: rust-load-graphs
wchargin-source: a16be4c845d05f74034e5a038439c210d8b3e937
Summary:
The existing `ListPlugins` RPC now explores blob sequence time series
for plugin names, in addition to scalar time series. This patch also
expands the `commit::test_data` builders to support blob sequences.

Test Plan:
Unit tests updated. As an end-to-end test, pointing TensorBoard at a
directory with scalars and graphs now causes the graphs dashboard to
appear active in addition to the scalars dashboard. The dashboard
doesn’t yet work, since `ListBlobSequences`, et al. aren’t implemented.

wchargin-branch: rust-listplugins-blobs
wchargin-source: fbab637da1881c51f9b5559abb97141b58fe0387
wchargin-branch: rust-listplugins-blobs
wchargin-source: 366fe5ecc221b2101e0190146b1d9e4e33de305e
Summary:
The [`tokio-stream`] crate makes it easier to work with async streams,
and in particular to iterate over them. This is useful for test code
that needs to iterate over the responses of server-streaming RPC like
`ReadBlob`.

[`tokio-stream`]: https://crates.io/crates/tokio-stream

Test Plan:
It builds: `bazel build //third_party/rust:tokio_stream`.

wchargin-branch: rust-dep-tokio-stream
wchargin-source: f106742eeeb3113fd968232c973715f9ecb269c0
…s' and 'origin/wchargin-rust-dep-tokio-stream' into HEAD
Summary:
We implement `ListBlobSequences`, `ReadBlobSequences`, and `ReadBlob`
for the Rust data server, and the corresponding data provider methods
for the Python data provider. Run graphs now work end to end, and
getting more blobs to work is a matter of implementing data compat
transformations on the Rust side.

Test Plan:
Unit tests included on both ends (solid ones on the Rust side, mocking
ones on the Python side). Running TensorBoard with `--load_fast` now
shows a working graphs plugin.

wchargin-branch: rust-blob-sequence-rpcs
wchargin-source: 3870627314e30a8d5ede4cb004b887ab1e1f68ba
@wchargin wchargin added the core:rustboard //tensorboard/data/server/... label Jan 15, 2021
@google-cla google-cla bot added the cla: yes label Jan 15, 2021
@wchargin
Copy link
Contributor Author

wchargin commented Jan 15, 2021

Note: this is diffbased against the combination of #4562 and #4546
(which in turn depends on #4544).

To reviewers: the diff is rather larger than I usually like to send. If
you like, I can separate out the Python changes—the Rust unit tests are
sufficient for good coverage. But there’s something nice about having
an end-to-end integration here, so I’ve coupled them for now. Open to
input.

@wchargin wchargin requested a review from stephanwlee January 15, 2021 04:55
wchargin-branch: rust-load-graphs
wchargin-source: 33d56123cfe3da7c79b5a6bbfa9951159038ebcb
wchargin-branch: rust-listplugins-blobs
wchargin-source: 1d0bdb4ae5370b1b94c30e81ec333c5e3ca1e0eb
@wchargin wchargin changed the base branch from wchargin-diffbase-blobs-rpcs to wchargin-rust-listplugins-blobs January 15, 2021 05:46
wchargin-branch: rust-blob-sequence-rpcs
wchargin-source: 92bef146c06286fc2d5d7d7bb3427a594fdfae36
Base automatically changed from wchargin-rust-listplugins-blobs to master January 15, 2021 21:10
wchargin-branch: rust-blob-sequence-rpcs
wchargin-source: cb548c2370622404b8b4921e3dd9e6164f0da8e2
wchargin-branch: rust-blob-sequence-rpcs
wchargin-source: cb548c2370622404b8b4921e3dd9e6164f0da8e2
Copy link
Contributor

@stephanwlee stephanwlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good but I want to hear your response before the rubber stamp.

tags = {}
result[run_entry.run_name] = tags
for tag_entry in run_entry.tags:
ts = tag_entry.metadata
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does ts stand for? Tags? or does it mean Tag Series?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

“Time series”. Can rename to something else if you prefer?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No AI required. Is it anti-pattern to name variables in descriptive/verbose manner?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lol :-) I think I was in Golang headspace when I first wrote this, and
in Golang the answer to your question is “yes”. I’ve embiggened a few of
the more obscure variable names.

Comment on lines +319 to +336
let runs = self.read_runs()?;

let mut res: data::ReadBlobSequencesResponse = Default::default();
for (run, data) in runs.iter() {
if !run_filter.want(run) {
continue;
}
let data = data
.read()
.map_err(|_| Status::internal(format!("failed to read run data for {:?}", run)))?;
let mut run_res: data::read_blob_sequences_response::RunEntry = Default::default();
for (tag, ts) in &data.blob_sequences {
if !tag_filter.want(tag) {
continue;
}
if plugin_name(&ts.metadata) != Some(&want_plugin) {
continue;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not saying repetition is bad so please let me know if this is unsound.

I imagine run tag filter on RunData to be quite common across list_{blob,scalar,tensor}. Would it make sense to create a iterable (if that's idiomatic) that has data filtered at the very least by run and tags?

Copy link
Contributor Author

@wchargin wchargin Jan 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah; I’ve been playing around with this in my head but haven’t actually
sketched it out. I agree both that repetition isn’t intrinsically bad
and that it feels like this common behavior could be factored out to
save a bunch of lines. It’s made a bit tricky because the response
types (like data::read_blob_sequences_response::RunEntry) all differ,
and RPCs access different commit fields (e.g., &data.blob_sequences).

I also agree that some kind of iterable would make sense: something like

for item in self.iter_data(rtf, plugin_filter, |rd| &rd.blob_sequences) {
    let (run, tag, time_series) = item?; // might return "failed to read..."
    // ...
}

// given:
fn iter_data<T, F: FnMut(&RunData) -> T>(
    &self,
    rtf: RunTagFilter,
    pf: PluginFilter,
    extract: F,
) -> impl Iterator<Item = Result<(Run, Tag, T), Status>>;

with the catch that inside the loop we wouldn’t have run_res like we
do right now.

I was planning to wait until we have the three data classes implemented
before trying to pull things out. What do you think? Any suggestions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No immediate suggestion. I will let you decide what should be done at places where we do read_runs and use run_tag_filter.

let blob_refs = (0..value.len())
.map(|i| {
let bk = BlobKey {
experiment_id: Cow::Borrowed(eid),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting; I should have asked this question before but when does it turn into mutable (to_mut)? It is not 100% obvious to me reading the blob_key.rs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Asking now is fine! :-) Depending on exactly what you’re asking, I think
maybe the answer is “when we call bk.to_string() below”? but let me
provide some more context.

Blob keys have string fields defined as Cow<'a, str> because we can
get blob keys in two different ways. First, in ReadBlobSequences, we
synthesize blob keys from the runs and tags that we already have in
memory. Second, in ReadBlob, we parse a blob key from user input.
When we parse a blob key from a base64 string, we have to allocate new
memory for the eid/run/tag: no way around it, since they’re not stored
anywhere else. But when we’re synthesizing a blob key, we’d like to
avoid cloning the eid/run/tag, since we already have references to them.
This is where the Cow<'a, str> comes in: you can think of it as
“either an owned string or a borrowed string, plus a bit indicating
which is which so that we know whether to free memory when this is
dropped”.

So on this line, we create a blob key that just points into the existing
string structures. Then, when we create the blob reference below, we
encode it all as base64-over-JSON, which allocates a new String.

Maybe this is what you meant by “turn into mutable (to_mut)”, or maybe
not; does this help? if not, can you clarify?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I think you are answering most of my question. However, my confusions really lies with the basic of string manipulation. For instance, when invoking bk.to_string(), I suspect it would be "concatenating" run, tag, index, and step and when doing so, why you would have to "clone". I definitely see why you would have to allocate a new space to store the new serialized string but I am not sure why that's perceived as cloning, for example, run and others.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when invoking bk.to_string(), I suspect it would be "concatenating" run, tag, index, and step and when doing so, why you would have to "clone". I definitely see why you would have to allocate a new space to store the new serialized string but I am not sure why that's perceived as cloning, for example, run and others.

Right. You don’t have to clone run, but you do have to memcpy its
contents into a new serialized string.

By “when we’re synthesizing a blob key, we’d like to avoid cloning the
eid/run/tag”, I mean that we don’t want to do this:

let bk = BlobKey {
    experiment_id: eid.clone(),
    run: run.clone(),
    // ...
};

—which would be a literal clone—and so instead we use Cow<'a, str>s
so that the blob key can just borrow the data.

@wchargin wchargin requested a review from stephanwlee January 15, 2021 23:14
wchargin-branch: rust-blob-sequence-rpcs
wchargin-source: 361768949089e75849ff4468bb7a6e6aaa2d0eca
@wchargin
Copy link
Contributor Author

thanks for your quick review of this large PR, and the good questions!

@wchargin wchargin merged commit 5acd97e into master Jan 16, 2021
@wchargin wchargin deleted the wchargin-rust-blob-sequence-rpcs branch January 16, 2021 01:03
@wchargin wchargin mentioned this pull request Jan 16, 2021
34 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes core:rustboard //tensorboard/data/server/...

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants