-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Added categorization for data_selector.Selection #1324
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment here is a little confusing I think because experiment and run are actually paired up here rather than forming separate layers of hierarchy. Maybe reword to something like "A SeriesCategory corresponds to a tagPrefix; it contains all tags with that prefix and the corresponding experiment-run 'series' pairs for each tag."
Also, looks like there's some copy/paste issues in the remaining part of the comment?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like searchTagToItems ends up being a sub-map of tagToItems, so it seems a little redundant to generate and store both of them.
Could we instead just accumulate a list of searchTags, produced by concatenating together the searchCategory.items for each respective experiment? Then when constructing the final actual searchCategory to be returned, instead of directly using searchTagToItems, we would instead iterate over searchTags and return essentially (tag, tagToItems.get(tag)) for each tag.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe replace "items" with "series" if that's what we're using elsewhere? Here and for searchTagToItems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point this should just be a regular JOIN. Once you're filtering out rows where Tags.plugin_name is NULL, then it effectively becomes a JOIN instead of a LEFT JOIN anyway. The comment about "tag can be missing" below should be updated too.
Either that, or you'd want to change the filtering condition to be AND (Tags.tag_id IS NULL OR Tags.plugin_name IS NOT NULL).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I don't see logic here that populates the .tags field on the runs in _selectionMap? But categorizeSelection() expects the field to be there on the selection it receives, which is generated from _selectionMap (via this._setSelection(), which as an aside doesn't actually seem to be defined anywhere?).
I think the fireChange() code in tf-data-select-row would need to populate that field when propagating the run information up from the individual row to the overall data selector.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tags field has been populated from BE but FE omitted it from the type declaration.
Like you indicated, it is already populated in the tf-data-select-row. This comment is about maybe filtering out tags for inactive plugins in the tf-data-select-row.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the TODO comment is not exactly related to my comment, but GitHub only lets you comment on changed lines +/- 3 lines so I just picked a line that seemed roughly relevant.
How does the tags field get propagated into _selectionMap? The _selectionChanged() handler here passes through event.detail.runs, and in tf-data-select-row the _fireChange() method that fires selection-changed does so by passing runs as defined to contain only the id, name, and startTime fields:
https://github.com/stephanwlee/tensorboard/blob/84026254338fb3436339270d10f478d25b0f3fbd/tensorboard/components/tf_data_selector/tf-data-select-row.ts#L187-L193
So even if tf-data-select-row._runs contains the .tags field fetched from the backend, it looks to me like it would be dropped in _fireChanged().
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like I have sycned my local to a wrong version and was not up-to-date.
Addressed. Thanks for catching that!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
The utility divides the data, based on selection, into tag prefix and tags. After those division, it does not make assumption about how the data is used and returns list of experiment/run pair that matches the criteria.
Background: Tags are created for a specific plugin. Reason: Categorization is used near visual component to distill tagfilter/tag/experiment/runs grouping from a selection. It is also important to note that a categorization will wildly differ depending on which plugin is invoking. To be able to filter out unrelated tag categorization, now, categorization utils take the pluginName to filter out unrelated tags based on information we pipe from the backend.
As opposed to square brackets
| export type RunTagCategory = Category<{tag: string, run: string}>; | ||
|
|
||
| /** | ||
| * Organize data by tagPrefix, tag, then llist of series which is comprised of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: llist
| metadata: { | ||
| type: CategoryType.SEARCH_RESULTS, | ||
| validRegex: false, | ||
| universalRegex: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the false values for these properties (validRegex, universalRegex) actually matter here? Or will they get overridden and/or ignored later on somehow?
| selection: tf_data_selector.Selection[], pluginName: string): | ||
| SeriesCategory[] { | ||
| const tagToSeries = new Map(); | ||
| const searchTags = new Set(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we probably want to resort searchTags at some point so that the result is still in "tag-comparison order" (same as now) within the "search" category pane. Right now if I'm not mistaken, each individual batch of tags is sorted but they aren't sorted overall, and Set just preserves insertion order, so you might get a result like "aa, zz, ab, ac" if "aa, zz" are in the first experiment and "aa, ab, ac" are in the second one, which seems undesirable.
Maybe good to have a test for this case too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done! Thanks for the insight
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks!
This reverts the new behavior introduced as part of tensorflow#1324. It used to remove pane that has no matching tag/run.
This reverts the new behavior introduced as part of #1324. It used to remove pane that has no matching tag/run.
The utility divides the data, based on selection, into
tag prefix and tags. After those division, it does not make
assumption about how the data is used and returns list of
experiment/run pair that matches the criteria.