Create DVC experiment on live.end. #366

Merged
merged 2 commits into from
Dec 1, 2022

daavoo
Contributor

@daavoo daavoo commented Nov 22, 2022

Closes #310
Closes #311

Requires iterative/dvc#8599 and iterative/dvc#8529

@daavoo daavoo force-pushed the exp-save branch 5 times, most recently from 7714f8b to 965de7d Compare November 24, 2022 13:25
@codecov-commenter

codecov-commenter commented Nov 24, 2022

Codecov Report

Base: 96.63% // Head: 95.23% // Decreases project coverage by -1.40% ⚠️

Coverage data is based on head (82a4440) compared to base (e9ed5a0).
Patch coverage: 97.15% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #366      +/-   ##
==========================================
- Coverage   96.63%   95.23%   -1.41%     
==========================================
  Files          36       37       +1     
  Lines        1811     1950     +139     
  Branches      160      151       -9     
==========================================
+ Hits         1750     1857     +107     
- Misses         35       66      +31     
- Partials       26       27       +1     

| Impacted Files | Coverage Δ |
|---|---|
| src/dvclive/studio.py | 91.11% <ø> (ø) |
| src/dvclive/report.py | 96.38% <33.33%> (ø) |
| src/dvclive/dvc.py | 88.05% <92.10%> (+4.18%) ⬆️ |
| src/dvclive/live.py | 96.72% <98.79%> (+0.59%) ⬆️ |
| src/dvclive/env.py | 100.00% <100.00%> (ø) |
| src/dvclive/lightning.py | 90.90% <100.00%> (ø) |
| tests/test_dvc.py | 100.00% <100.00%> (ø) |
| tests/test_main.py | 100.00% <100.00%> (ø) |
| tests/test_report.py | 100.00% <100.00%> (ø) |
| tests/test_studio.py | 100.00% <100.00%> (ø) |

... and 3 more

@daavoo daavoo added A: dvc DVC integration feature labels Nov 30, 2022
@daavoo daavoo changed the title from "Save results as DVC experiment" to "Create DVC experiment on live.end." Nov 30, 2022
@daavoo daavoo marked this pull request as ready for review November 30, 2022 16:46
    self._inside_dvc_exp = True
else:
    # `Python Only` execution
    # TODO: How to handle `dvc repro` execution?
Contributor Author

@dberenbaum what do you think we should do if DVCLive is used inside a dvc pipeline that has been executed with dvc repro?

I think we should skip the experiment creation.

Collaborator

Yes, let's skip it. We may still want to call make_dvcyaml, but we can skip that also for now if it simplifies things.
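
The decision reached in this thread could be sketched roughly as follows. This is a minimal illustration, not DVCLive's implementation: the helper `should_create_experiment` is hypothetical, and the environment variable names are assumptions standing in for whatever DVC exports during `dvc exp run` and `dvc repro`.

```python
import os
from typing import Mapping, Optional

# Assumed variable names, for illustration only; check the linked DVC PRs
# for the names DVC actually exports.
EXP_BASELINE_REV = "DVC_EXP_BASELINE_REV"  # assumed: set inside `dvc exp run`
DVC_ROOT = "DVC_ROOT"  # assumed: set when DVC runs a pipeline stage


def should_create_experiment(
    save_dvc_exp: bool, env: Optional[Mapping[str, str]] = None
) -> bool:
    """Return True only for a plain Python run with the option enabled."""
    env = os.environ if env is None else env
    if not save_dvc_exp:
        return False
    if env.get(EXP_BASELINE_REV):
        # DVC is already managing an experiment; do not create another one.
        return False
    if env.get(DVC_ROOT):
        # Running as a stage under `dvc repro`: skip experiment creation.
        return False
    return True
```

Passing the environment as a mapping keeps the check easy to exercise in tests without touching `os.environ`.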

assert load_yaml(live.dvc_file) == {
    "metrics": ["metrics.json"],
    "params": ["params.yaml"],
    "plots": ["plots"],
Contributor Author

Follow up: configure sklearn plots

def make_dvcyaml(live):
    if not os.path.exists(live.dvc_file):
Collaborator

Will dvc.yaml get deleted each time a new live instance is created? Maybe we need some option to configure whether to overwrite it. In some cases, I may log new stuff and want to overwrite, but in others I might have added some custom config that I want to keep.

Contributor Author

> Will dvc.yaml get deleted each time a new live instance is created?

It should not be deleted and it should preserve the manual modifications from users.

> In some cases, I may log new stuff and want to overwrite, but in others I might have added some custom config that I want to keep.

What would be overwritten? The current version is extremely simple as it doesn't configure sklearn plots (added a follow-up #371)

Collaborator

Sounds good. Not a blocker.

> What would be overwritten?

As you say, it will include sklearn plots config, which seems like the most complicated scenario, so let's discuss there.

Member

It should probably just write to the root of the repo, not to its own file by default. It is the dvc way, and this is against the principle of having one file to rule them all.

Collaborator

🤔 But the dvclive way is to create a self-contained folder 😄. IMHO dvclive should not write to the user's root dvc.yaml file, because it makes it hard to share only the dvclive output, and it mixes machine-generated config with user-generated config. Also, if there is some conflict between the existing config and what dvclive generates, I would rather have dvclive run successfully and be able to later debug the conflicts between the files.
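
The behaviour discussed in this thread (write `dvc.yaml` inside the live folder, and never clobber an existing file) could look roughly like the sketch below. It is an illustration under stated assumptions, not the PR's `make_dvcyaml`: the helper name `write_dvcyaml_if_missing` and the hand-written YAML template are hypothetical, and the template only mirrors the keys visible in the test excerpt earlier in the review.

```python
import os
import tempfile

# Minimal content mirroring the keys visible in the test excerpt
# (metrics, params, plots). Hand-written to avoid a YAML dependency.
DVCYAML_TEMPLATE = (
    "metrics:\n- metrics.json\n"
    "params:\n- params.yaml\n"
    "plots:\n- plots\n"
)


def write_dvcyaml_if_missing(live_dir: str) -> bool:
    """Create `dvc.yaml` inside `live_dir` unless it already exists.

    Returns True if the file was written, False if an existing file was kept.
    """
    os.makedirs(live_dir, exist_ok=True)
    dvc_file = os.path.join(live_dir, "dvc.yaml")
    if os.path.exists(dvc_file):
        # Preserve whatever the user (or a previous run) put there.
        return False
    with open(dvc_file, "w", encoding="utf-8") as f:
        f.write(DVCYAML_TEMPLATE)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        live_dir = os.path.join(tmp, "dvclive")
        print(write_dvcyaml_if_missing(live_dir))  # first call writes the file
        print(write_dvcyaml_if_missing(live_dir))  # second call keeps it
```

Keeping the file inside the live folder means the whole directory can be shared or deleted as a unit, which is the trade-off argued for above.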

self._exp_name = random_exp_name(
    self._dvc_repo, self._baseline_rev
)
make_dvcyaml(self)
Collaborator

I'm not as sure that we should skip make_dvcyaml, but we can follow up on this separate from this PR.

Collaborator

Another follow up: pass an experiment name to use.

Collaborator

And another: consider whether to include the dvclive user script/notebook in the tracked files.

Contributor Author

> And another: consider whether to include the dvclive user script/notebook in the tracked files.

Do you mean including it in the include_tracked list passed to experiments save, or something else?

Collaborator

Yes, whether to include it in include_tracked. I'm not sure it's needed, so it was more of a discussion point than a request. There's something clean about the default being to only save the live dir.

Collaborator

@dberenbaum dberenbaum left a comment

Looks great so far! Very cool to see this come to life.

@dberenbaum
Collaborator

Demo from @daavoo:

exp-onboarding.mp4

Enable with `save_dvc_exp=True`. Defaults to `False`.

Refactor `__init__` method. Split into private `_init_{component}` methods. Add `_` prefix to private properties.
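
The shape of that refactor, together with the opt-in `save_dvc_exp` flag, might look like the sketch below. The class `LiveSketch`, its attributes, and the bodies of the `_init_*` methods are hypothetical placeholders chosen for illustration; only the constructor parameters come from the PR's diff, and the `"dvclive"` default directory is an assumption.

```python
from typing import Optional


class LiveSketch:
    """Hypothetical stand-in for a logger class; not DVCLive's `Live`."""

    def __init__(
        self,
        dir: Optional[str] = None,  # shadows the builtin, as in the PR signature
        resume: bool = False,
        report: Optional[str] = "auto",
        save_dvc_exp: bool = False,
    ):
        # Private properties carry a leading underscore.
        self._dir = dir or "dvclive"  # assumed default folder name
        self._resume = resume
        self._report = report
        self._save_dvc_exp = save_dvc_exp

        # Each component gets its own private initializer.
        self._init_paths()
        self._init_dvc()

    def _init_paths(self) -> None:
        self._metrics_file = f"{self._dir}/metrics.json"
        self._dvc_file = f"{self._dir}/dvc.yaml"

    def _init_dvc(self) -> None:
        # Placeholder: a real implementation would inspect the repository
        # and the environment before deciding to save an experiment.
        self._will_save_exp = self._save_dvc_exp
```

Splitting the constructor this way keeps each concern testable on its own, and leaving `save_dvc_exp` at `False` by default means existing scripts behave as before unless they opt in.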

Use env vars from iterative/dvc#8630 to skip Studio `start` and `done` events.
Use same env vars to skip creating DVC exp.
@@ -17,11 +17,16 @@ def __init__(
        dir: Optional[str] = None,  # noqa pylint: disable=redefined-builtin
        resume: bool = False,
        report: Optional[str] = "auto",
        save_dvc_exp: bool = False,
Contributor Author

@dberenbaum named the option like this. wdyt?

Collaborator

I don't know if it makes sense for the dvc.yaml saving part and whether those should be coupled together, but I think it's fine for now.

Contributor Author

@daavoo daavoo Dec 1, 2022

I thought your concerns applied to both the git ref and the dvc.yaml creation.

@daavoo daavoo merged commit 63d1b20 into main Dec 1, 2022
@daavoo daavoo deleted the exp-save branch December 1, 2022 18:14
@daavoo daavoo mentioned this pull request Jul 4, 2023
Labels
A: dvc DVC integration feature
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Configure DVC experiment from DVCLive
Save results as DVC experiment
5 participants