Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to compare plots with different definitions or specs #3676

Closed
dberenbaum opened this issue Apr 12, 2023 · 23 comments
Closed

How to compare plots with different definitions or specs #3676

dberenbaum opened this issue Apr 12, 2023 · 23 comments
Assignees
Labels
A: plots Area: plots webview, side panel and everything related 📦 product Needs product input or is being actively worked on

Comments

@dberenbaum
Copy link
Contributor

Related: iterative/dvc#9025

We need to decide how to compare plots across the workspace, queue temp dirs, branches, and other revisions when the plots have different IDs/definitions or properties/specs. Let's decide what works for VS Code, and then we can decide whether this makes sense in DVC CLI.

Users may select and deselect experiments in the UI, and the plots should remain stable (like the colors):

  • Adding experiments should not make plots disappear or change the visualization of existing experiments on those plots.
  • Dropping experiments should not change the visualization or drop plots for remaining experiments.

At the same time, the plots should also be flexible:

  • Adding a plot ID in my workspace should show a new plot.
  • Adding an experiment with a plot ID that didn't exist in other experiments should show a new plot.
  • Changing plot properties in the workspace should update the plot for all selected experiments.
  • I should be able to compare multiple experiments with the same plot ID but different properties.

Thoughts on this approach?

  • Show plots for all plot IDs across all selected experiments.
  • Use the most recent version of the plot properties for each plot ID.

Questions:

  • Should the workspace always be considered the most recent?
  • What about in comparison to running experiments in the queue/tmp dir?

For reference, how it works today in each product:

  • DVC CLI merges plots definitions and specs based on the order in which they are passed to dvc plots diff rev1 rev2 ....
  • VS Code calls dvc plots diff .... How do we decide the order @mattseddon? VS Code also overrides properties like color to keep them stable.
  • Studio uses a combination of DVC internals and its own logic. When plots definitions and specs differ, Studio shows them separately.
@shcheklein shcheklein added 📦 product Needs product input or is being actively worked on A: plots Area: plots webview, side panel and everything related labels Apr 12, 2023
@mattseddon
Copy link
Member

VS Code calls dvc plots diff .... How do we decide the order @mattseddon? VS Code also overrides properties like color to keep them stable.

We always give the workspace the top priority and then order revisions based on their timestamp.

@dberenbaum
Copy link
Contributor Author

@mattseddon Sounds good, thanks. Any thoughts about what priority to give running experiments in the queue/tmp dir?

@shcheklein
Copy link
Member

We always give the workspace the top priority and then order revisions based on their timestamp.

this is only if workspace is selected, right?

@mattseddon
Copy link
Member

@mattseddon Sounds good, thanks. Any thoughts about what priority to give running experiments in the queue/tmp dir?

I would give them a lower priority than the workspace (if selected) but higher than all completed experiments.

We always give the workspace the top priority and then order revisions based on their timestamp.

this is only if workspace is selected, right?

Yes

@dberenbaum
Copy link
Contributor Author

Thoughts on this approach?

* Show plots for all plot IDs across all selected experiments.

* Use the most recent version of the plot properties for each plot ID.

@mattseddon Does this make sense?

@shcheklein
Copy link
Member

Studio uses a combination of DVC internals and its own logic. When plots definitions and specs differ, Studio shows them separately.

Can we expand on this please, how that would be different @dberenbaum from what you suggest?

Use the most recent version of the plot properties for each plot ID.

It means that plots won't be stable, right? Depending on if workspace is selected or not for example, we'll see the different result for all other commits? Not saying it's wrong, trying to wrap my mind around it once again.

@shcheklein
Copy link
Member

DVC CLI merges plots definitions and specs

How does DVC do it? Specifically, how does it merge specs (properties) if they differ? What does it mean to merge plot definitions?

@shcheklein shcheklein changed the title plots: how to compare plots with different definitions or specs How to compare plots with different definitions or specs Apr 18, 2023
@mattseddon
Copy link
Member

Thoughts on this approach?

  • Show plots for all plot IDs across all selected experiments.

  • Use the most recent version of the plot properties for each plot ID.

@mattseddon Does this make sense?

By most recent do you mean by the timestamp of an experiment/commit? Makes sense to me though. This is what I would expect. A superset of all the available plots.

@dberenbaum
Copy link
Contributor Author

Studio uses a combination of DVC internals and its own logic. When plots definitions and specs differ, Studio shows them separately.

Can we expand on this please, how that would be different @dberenbaum from what you suggest?

Studio will create an additional plot for every different plot spec. For example, if I change the x_label between revisions, it will show the plots like this (instead of in the same plot):

Screenshot 2023-04-19 at 12 53 12 PM

Use the most recent version of the plot properties for each plot ID.

It means that plots won't be stable, right? Depending on if workspace is selected or not for example, we'll see the different result for all other commits? Not saying it's wrong, trying to wrap my mind around it once again.

Correct, that's the downside vs the Studio approach. I can't think of anyway to satisfy all the "requirements" I listed above in one approach unfortunately.

How does DVC do it? Specifically, how does it merge specs (properties) if they differ? What does it mean to merge plot definitions?

I think it's easiest to refer to iterative/dvc#9025 (comment).

By most recent do you mean by the timestamp of an experiment/commit?

👍 I think so. Still trying to work through it myself, but this seems reasonable to me.

@shcheklein
Copy link
Member

I think it's easiest to refer to iterative/dvc#9025 (comment).

So, the difference between what you suggest and DVC would be that we show in DVC some older plots with some older specs?

  • Let's say we have workspace and commit A.
  • In commit A we have plots p1 and p2 and in workspace we have p2 and p3.
  • In commit A spec for p2 is s2a and in workspace it's s2w.

Can we please outline the logic using this basic example, just to be sure that we are on the same page?

@dberenbaum
Copy link
Contributor Author

  • Let's say we have workspace and commit A.

  • In commit A we have plots p1 and p2 and in workspace we have p2 and p3.

  • In commit A spec for p2 is s2a and in workspace it's s2w.

  • We show plots p1, p2, and p3.
  • For p2, we use spec s2w (for both workspace and commit A) because the workspace takes priority over prior commits.

@shcheklein
Copy link
Member

okay, how about VS Code now and DVC now and Studio now?

@dberenbaum
Copy link
Contributor Author

Here's an example based on https://github.com/iterative/lstm_seq2seq/tree/plots.

The changes over the last 3 commits:

  • 2f52e2b (HEAD on branch plots): only top-level plots; corrects loss plot to y_label: Loss
  • 41fe94d (HEAD~1): dvclive-generated plots and top-level plots; loss plot has wrong y_label: Accuracy in loss plot
  • 67ba6c0 (HEAD~2/tip of branch main): only dvclive-generated plots
full log
$ git log -p
commit 2f52e2b57de94e8c6f302f6b1ad4b2863076aa92 (HEAD -> plots)
commit 2f52e2b57de94e8c6f302f6b1ad4b2863076aa92 (HEAD -> plots)
Author: dberenbaum <dave@iterative.ai>
Date:   Thu Apr 20 13:47:16 2023 -0400

    clean up plots

diff --git a/dvc.yaml b/dvc.yaml
index dfee14b..52c4100 100644
--- a/dvc.yaml
+++ b/dvc.yaml
@@ -36,4 +36,4 @@ plots:
     y:
       results/plots/metrics/train/epoch/loss.tsv: loss
       results/plots/metrics/val/loss.tsv: loss
-    y_label: Accuracy
+    y_label: Loss
diff --git a/results/dvc.yaml b/results/dvc.yaml
index 8f5a7e9..4930788 100644
--- a/results/dvc.yaml
+++ b/results/dvc.yaml
@@ -1,5 +1,2 @@
 metrics:
 - metrics.json
-plots:
-- plots/metrics:
-    x: step

commit 41fe94db95799844746b55d4af983b662e678c94
Author: dberenbaum <dave@iterative.ai>
Date:   Thu Apr 20 13:44:22 2023 -0400

    add top-level plots

diff --git a/dvc.yaml b/dvc.yaml
index bfd4e38..dfee14b 100644
--- a/dvc.yaml
+++ b/dvc.yaml
@@ -24,3 +24,16 @@ stages:
     - results/plots:
         cache: false
         persist: true
+plots:
+- Accuracy:
+    x: step
+    y:
+      results/plots/metrics/train/epoch/acc.tsv: acc
+      results/plots/metrics/val/acc.tsv: acc
+    y_label: Accuracy
+- Loss:
+    x: step
+    y:
+      results/plots/metrics/train/epoch/loss.tsv: loss
+      results/plots/metrics/val/loss.tsv: loss
+    y_label: Accuracy

commit 67ba6c070c75e81b5e8ca0e35070e1016ff00109 (origin/main, origin/HEAD, main)
Author: Olivaw[bot] <olivaw@iterative.ai>
Date:   Thu Apr 13 14:22:16 2023 +0000
...

Plots in VS Code:

Screenshot 2023-04-20 at 2 09 20 PM

Notes:

  • The underlying data is the same, so the lines for each revision aren't always visible since they overlap.
  • The order of dvc diff in the output at the bottom is fixed. @mattseddon explained that it is ordered by timestamp, with workspace always taking top priority.

Plots in DVC CLI:

In DVC CLI, I can manipulate the order to get different results. For example, by reversing the order to dvc plots diff 67ba6c0 41fe94d 2f52e2b, I get this:

Screenshot 2023-04-20 at 2 04 32 PM

Notes:

  • The top-level plots now show up at the end.
  • The loss plot has the outdated y_label: Accuracy (the latest rev corrects it to y_label: Loss).
  • As mentioned above, the static data makes the lines overlap and it's hard to see each revision, but the legend shows that not all 3 revisions are rendered for any of the plots (only if the plot ID is defined in that revision).

Plots in Studio:

Screenshot 2023-04-20 at 2 07 36 PM

Notes:

  • Since the loss plot's y_label value changes between revisions, Studio renders two separate plots, one for each revision.

I think the VS Code logic is reasonable and we should mostly try to keep it as is. If we agree, I see a few action points:

  • Incorporate live results from experiments running in the queue (may require DVC changes).
  • Show as many revisions as possible in each plot (may require DVC changes). I mention above that none of the plots currently show all 3 revisions, even though they all have the same data. For example, after the top-level plots are added, there's no way to see them for the previous revision even though the data exists in that revision.
  • Try to make this consistent across products. Is the flexibility of being able to choose the order of revisions in the CLI powerful or confusing? Would it be better to combine plots in Studio so they can be compared more easily?

@shcheklein
Copy link
Member

@dberenbaum since you have it in your head, could you please translate it into that W, commit A, p1, p2, p3 model please? It would be way easier for me personally to wrap my mind around this then. Or do you feel it's not possible, we would miss some details?

@dberenbaum
Copy link
Contributor Author

I think it's not totally possible. There's no workspace in Studio, for example, and I think it's more clear to show a working example than for me to take your hypothetical and explain how I think it would look in each product. Happy to do anything else to summarize the example better so you don't have to spend a lot of time figuring out the context.

@dberenbaum
Copy link
Contributor Author

cc @jellebouwman @amritghimire

@shcheklein
Copy link
Member

@dberenbaum I agree on the direction for DVC, VS Code - and it feels we don't need that much, it's already done the way it should be?

Show as many revisions as possible in each plot (may require DVC changes). I mention above that none of the plots currently show all 3 revisions, even though they all have the same data. For example, after the top-level plots are added, iterative/dvc#7913 even though the data exists in that revision.

I'm not sure I understand this. Is there a possibility that it's just hidden because data is the exactly the same?

I would deprioritize touching Studio for now tbh. Let's create a ticket and get there when we have a bit more time. Feels like it can be involved?

@dberenbaum
Copy link
Contributor Author

dberenbaum commented Apr 21, 2023

@dberenbaum I agree on the direction for DVC, VS Code - and it feels we don't need that much, it's already done the way it should be?

Yes, I think we only truly need the first action point above.

Show as many revisions as possible in each plot (may require DVC changes). I mention above that none of the plots currently show all 3 revisions, even though they all have the same data. For example, after the top-level plots are added, iterative/dvc#7913 even though the data exists in that revision.

I'm not sure I understand this. Is there a possibility that it's just hidden because data is the exactly the same?

No, I have verified that it's not the case. It's a separate issue already tracked with DVC, so if VS Code continues to rely on the DVC logic, we can track it in #7913. I think I would need it as a user since my plots will evolve over time, but we can wait until we hear from users about it.

I would deprioritize touching Studio for now tbh. Let's create a ticket and get there when we have a bit more time. Feels like it can be involved?

Yup, will do. Edit: Let's not even open an issue yet. There's enough noise in Studio.

@mattseddon
Copy link
Member

  • Incorporate live results from experiments running in the queue (may require DVC changes).
  • Show as many revisions as possible in each plot (may require DVC changes). I mention above that none of the plots currently show all 3 revisions, even though they all have the same data. For example, after the top-level plots are added, there's no way to see them for the previous revision even though the data exists in that revision.

Perhaps we can tackle both of these things on the DVC side at the same time? I'd still be happy to get involved.

@dberenbaum
Copy link
Contributor Author

We can discuss it during sprint meetings tomorrow and I'll let you know.

@dberenbaum dberenbaum added this to DVC Apr 24, 2023
@github-project-automation github-project-automation bot moved this to Backlog in DVC Apr 24, 2023
@dberenbaum
Copy link
Contributor Author

Incorporate live results from experiments running in the queue (may require DVC changes).

Getting the datapoints for this should be easy, but not sure yet how to get the spec for live experiments.

Need to also consider how to collect live image data.

@dberenbaum
Copy link
Contributor Author

  • Show as many revisions as possible in each plot (may require DVC changes). I mention above that none of the plots currently show all 3 revisions, even though they all have the same data. For example, after the top-level plots are added, there's no way to see them for the previous revision even though the data exists in that revision.

I think this part expands the scope too much, so let's cut it and strictly focus on incorporating live results from experiments running in the queue.

@dberenbaum
Copy link
Contributor Author

Opened iterative/dvc#9369, so I think we can close this one.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A: plots Area: plots webview, side panel and everything related 📦 product Needs product input or is being actively worked on
Projects
No open projects
Archived in project
Development

No branches or pull requests

3 participants