plots: improve performance of plots collection #8786
Related #8777 ... do you have any specific scenarios in mind to test?
I don't yet. Maybe you do 🙏?
@dberenbaum got it, so it's more of a placeholder then, rather than something coming from your recent experience? (Just clarifying this.) For me, the simplest test is to open the vscode-demo project, select 5-7 experiments, and run plots. That was enough to get 50+ seconds last time I checked (the project evolves a bit), but even after my fix it's still 7-9 seconds. I think at this point it will be about optimizing things like:
Then we can run the profiler again, and see where bottlenecks are.
This is related to #8787. In both cases, what we need to be doing is serializing …
Yeah, I am waiting on that persistent index to see what improvement it makes. It's hard to say at this time how much it'll save. The persistent index will still require us to read plots, parse them, etc. Another idea that I am thinking of is caching the per-commit dictionary result of all operations so that we can reuse the old computed result (if there are no errors). But if the persistent index is enough, we don't have to do that.
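A minimal sketch of what such a per-commit cache could look like, keyed by revision and a fingerprint of the plot definitions. The cache location, the `collect` callable, and the `errors` key are assumptions for illustration, not existing DVC internals:

```python
import hashlib
import json
import os

CACHE_DIR = ".dvc/tmp/plots-cache"  # hypothetical location


def _cache_path(rev: str, definitions_fingerprint: str) -> str:
    # Key on the commit plus a fingerprint of the plot definitions,
    # so edited definitions invalidate the cached result.
    key = hashlib.sha256(f"{rev}:{definitions_fingerprint}".encode()).hexdigest()
    return os.path.join(CACHE_DIR, f"{key}.json")


def cached_plots_for_rev(rev: str, definitions_fingerprint: str, collect):
    """Return the per-commit plots dictionary, reusing an earlier result if present.

    `collect` is a callable that does the expensive read/parse work for `rev`.
    Only error-free results are written back, matching the idea of reusing old
    computed results when there are no errors.
    """
    path = _cache_path(rev, definitions_fingerprint)
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fobj:
            return json.load(fobj)

    result = collect(rev)
    if not result.get("errors"):
        os.makedirs(CACHE_DIR, exist_ok=True)
        with open(path, "w", encoding="utf-8") as fobj:
            json.dump(result, fobj)
    return result
```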
I think there may also be some issues related to error handling, as the way we handle errors across different commands is kind of a mess. For one example, … We also have problems where all of the error handling has been added piecemeal over time to meet specific Studio or VS Code requirements, so the internal APIs for handling errors in one place are not the same as in other places. So getting a quick/easy caching implementation that works for …
I'm not sure the persistent data index will be enough here, since a lot of the things we need for plots/metrics/params are not DVC-tracked data. I think what we will still need is the serialized "dvc commit" object, which is really just a JSON dump of the aggregate …
We had something like that before, but we never used it as it was not very smart, and the serialization/deserialization was slow. Loading all the cached results back into our in-memory format is also not always easy. (See lines 307 to 316 in 1180a2f.)
I think it'd be faster/simpler (but a bit fragile) to just cache results at a high level.
In main, I see ~4.5s out of the total ~8s runtime being spent on … Repro for the benchmark:

git clone https://github.com/iterative/vscode-dvc-demo.git
cd vscode-dvc-demo
dvc fetch --all-commits
time dvc plots diff workspace $(git log --oneline --pretty=format:"%h" -n 10) --json --cprofile-dump dump.prof > /dev/null
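A quick way to inspect the resulting profile is Python's built-in pstats module; this reads the dump.prof file produced by --cprofile-dump above:

```python
import pstats

# Load the cProfile dump written by `--cprofile-dump dump.prof`
stats = pstats.Stats("dump.prof")

# Top 20 calls by cumulative time, to spot where the runtime goes
stats.sort_stats("cumulative").print_stats(20)

# Optionally narrow the output to plots-related code paths
stats.print_stats("plots")
```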
🙏 I can take that one.
@daavoo, do you know what the effort would be to get rid of …?
Given that we are making breaking changes to … We would benefit from the existing exp show cache, and VSCode only calls …
taking a look today
As you can see above, rendering/_show_json takes 6.5s out of the 8s. Half of the rest is just imports. That leaves <1s to parse and load data from 10 commits. I am not sure if vscode-dvc-demo is representative, but that's what we are using as a benchmark at the moment. Even if we cache, we still have to do the rendering. Given that vscode only compares two commits, and since we don't have a way to differentiate retryable/non-retryable errors, I don't find caching to be worth it at this point.
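For a rough sense of how much of that rendering time can be pure serialization, here is a small, hedged check timing json.dumps on a synthetic plots-like payload; the sizes and field names are made up for illustration, not measured from the demo project:

```python
import json
import time

# Synthetic payload roughly shaped like plots JSON output:
# 10 revisions x 50k datapoints each (sizes are arbitrary).
payload = {
    f"rev{i}": [
        {"step": step, "loss": step * 0.001, "rev": f"rev{i}"}
        for step in range(50_000)
    ]
    for i in range(10)
}

start = time.perf_counter()
json.dumps(payload)
print(f"json.dumps took {time.perf_counter() - start:.2f}s")
```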
Not if we cache the rendered plot.
Anyhow, I already have a patch removing all the overhead from dumps; will send it.
This is how it looks with iterative/dvc-render#124.
Not sure it makes a meaningful difference, but to clarify, VS Code can compare up to 7 commits.
@dberenbaum, I only saw it invoke …
@dberenbaum, what happens when you reselect some of those experiments? Does it cache, or does it re-invoke …?
I have a patch for that. See #9183.
It looks like VS Code tries to cache them. cc @mattseddon. By the way, I don't think 2 vs 7 commits is likely to be the difference in whether to spend more time on this issue; I just thought it was worth clarifying the VS Code behavior.
We discussed this today regarding caching. The way we squash definitions/properties prevents us from caching the rendered plots, and with the way VSCode does caching, that squashing won't work anyway. Studio similarly has different behaviour. We'll have different output in VSCode/Studio/DVC, and it's unclear what the expected behaviour here should be.
When considering moving plots data into …
Relevant parts of #9025 (comment)
If we can agree on the baseline behaviour and simplify the problem, then maybe (hopefully) we can move caching out of the extension and back into DVC. WDYT?
We can account for this with the potential … So you could have something like …
where the result is just an unordered dictionary mapping the requested named revision to the …

In this scenario, for a named branch/exp DVC would return the latest/tip matching the given name. So for running experiments, it would return the latest/live value. For a finished experiment, it would return the tip commit for checkpoints or the only/single commit for regular experiments.

So when rendering plots, vscode would call some … This way, we can exclude the plots datapoints from the regular …

This may also be useful if/when vscode wants to refresh data for a specific regular table row without requesting the entire …

Essentially, this is how a hypothetical …
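A sketch of how such a hypothetical response could be shaped. The fields, resolution rules, and revision names below are illustrative assumptions based on this comment, not an existing DVC API:

```python
# Hypothetical output of a plots-data-only call for named revisions.
# Each requested name maps to the resolved revision plus its datapoints,
# so a caller could refresh a single revision/row without re-requesting everything.
hypothetical_result = {
    "exp-1234": {
        "resolved_rev": "abc1234",  # tip commit for a finished experiment
        "datapoints": [{"step": 0, "loss": 1.2}, {"step": 1, "loss": 0.9}],
    },
    "main": {
        "resolved_rev": "def5678",  # latest commit matching the branch name
        "datapoints": [{"step": 0, "loss": 1.5}],
    },
}
```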
@mattseddon, a question that I have: how much time does the extension take to render plots?
Times can vary greatly, but cached in the extension vs un-cached does show a significant time difference. Here is an example using the demo project (not much data):

Screen.Recording.2023-03-17.at.2.52.56.pm.mov

When a revision is first selected, we request the data from … For reference, in the above project for 2 revisions …
@mattseddon Is that example pulling the data from the remote, or is it already pulled? I think #9183 was already intended to make sure we avoid re-downloading. Caching the plots data on the DVC side seems nice to have, but I'm not clear on whether it's worth prioritizing right now, since it sounds like it will require work from both products to replicate what VS Code is already doing, right?
I'll suggest opening a new issue for caching. @daavoo's last PR halved (3x?) the runtime of …
@skshetry Good idea, but let's try to decide whether it's worth opening a new issue first. @mattseddon @shcheklein Thoughts?
Yes, when I was initially complaining about plots performance, it was about dealing with 1-minute+ delays because of some issues (bugs, lack of clear parallelism, etc.). I think we already went above and beyond (thanks!) and solved more than I personally anticipated and expected (at least from reading these threads). I agree that removing the cache from VS Code and caching on the DVC side is nice to have (unless Matt has a strong opinion about this, e.g. if it becomes very painful for us to deal with errors, etc. - in that case we need to evaluate the effort; agreed about the separate issue for this).
All sounds reasonable. Caching is a bit broken on the extension side for the reasons listed here: #9025 (comment). The TL;DR is that revisions are mutable depending on the combination passed to … Happy to close this and open a new issue for moving caching at a later date.
Closing this one. Let's open a new issue if there are specific issues like caching we need to follow up on.
Important for VS Code to be able to work smoothly. Also could help Studio rely on less custom logic?