exp show: include running experiment #5965

Closed
dberenbaum opened this issue May 4, 2021 · 10 comments · Fixed by #6174
Labels: A: experiments (Related to dvc exp); product: VSCode (Integration with VSCode extension)

Comments

@dberenbaum
Collaborator

Show currently running experiment in the table. cc @shcheklein @pmrowla

@dberenbaum dberenbaum added the product: VSCode Integration with VSCode extension label May 4, 2021
@dberenbaum
Collaborator Author

@shcheklein Please follow up to clarify anything that I missed on this. Also, can you clarify the motivation? Is it to give users with long-running experiments some confirmation that their experiment is running?

I think this introduces some complications that raise new questions:

  • As far as I understand, today no experiment branch is created until the experiment completes. Should an experiment branch with a name be generated as soon as the experiment starts, or should the name be left blank until completion? If the experiment fails, should it be removed from the table, and should its branch be deleted?
  • Since the commit sha for the experiment won't exist until it completes, does it matter if there is no sha or it changes after the experiment completes?
  • Should the same behaviors apply to queued experiments?

@pmrowla
Contributor

pmrowla commented May 5, 2021

What should we even display for a running experiment? For workspace runs, we already have the workspace table entry, which reflects whatever is happening in the running experiment.

Do we just want an equivalent table entry for whatever is in the temp directory for --temp/--queue runs?

It's not exactly clear what DVC should display, because outputs don't have to exist until a stage is completed. So for a long running stage command, nothing in the repo state may actually appear modified (from DVC's perspective) until the stage has completed.

@karajan1001
Contributor

What should we even display for a running experiment? For workspace runs, we already have the workspace table entry, which reflects whatever is happening in the running experiment.

How about experiments that are running or in the queue?

@shcheklein
Member

We should always see all the running/queued/executed experiments in the table. Can it be done by marking the workspace with a special flag? Probably, yes. It's a good point. But we still need to do this in the dvc exp show output, right? Is it enough for --temp, etc.? I don't know, to be honest; I don't have enough experience. But when I run something, I want a way to see what's going on now and to be able to stop experiments, for example. I mean, in this case I want to see at least the fact that it's running.

Should the same behaviors apply to queued experiments?

yes, in the sense that I'd like to see that it's running now and be able to stop it. Also, can we show that it was cancelled? E.g. to launch it again with a different set of params?

Since the commit sha for the experiment won't exist until it completes, does it matter if there is no sha or it changes after the experiment completes?

my initial take: it doesn't matter that much. The sha is probably only needed for operations that should not be available until the experiment is done, right?

As far as I understand today, no experiment branch is created until the experiment completes

curious why do we call it branch? :)

Should an experiment branch with a name be generated as soon as the experiment starts, or should the name be left blank until completion? If the experiment fails, should it be removed from the table, and should its branch be deleted?

good questions. I think it's related to Peter's point to some extent. We already have corresponding entries for pretty much all experiments, right? workspace, or queued entries ... we can utilize those, I guess? change their state somehow ...

On the other hand, how do we show a running checkpoint? Ideally it should be "branched" out of the previous checkpoint, not from the workspace? Not 100% sure here.

@dberenbaum dberenbaum added the discussion requires active participation to reach a conclusion label May 5, 2021
@dberenbaum
Collaborator Author

dberenbaum commented May 5, 2021

But as I run something I want to have a way to see what's going on now and to be able to stop them for example.

Okay, this seems like the use case we are missing. There's no way to cancel an ongoing experiment now except to press Ctrl-C in the window where the experiment is running, and it's obvious it's running in that window. I can see how in VS Code there might need to be something like a stop button in the table.

curious why do we call it branch? :)

I called it a branch since the only difference between an experiment and a Git branch is the path to the ref (.git/refs/exps vs .git/refs/heads). I meant that this named ref for an experiment (.git/refs/exps/.../exp-12345) is not generated until the experiment completes. I suppose one option to consider is to generate that ref immediately, pointing at the commit the experiment starts from, and then update the ref when it completes, but I need to give it more thought and hear what others think.
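
To make the naming concrete, here is a minimal Python sketch of the two ideas in this comment: a ref counts as a branch or an experiment purely by its path prefix, and one option is to create the experiment ref at start and move it on completion. The function names, the in-memory `refs` dictionary, and the sample shas are hypothetical, and the sketch flattens the intermediate path components that are elided above. It is an illustration only, not how DVC manages refs.

```python
from typing import Dict

# Prefixes taken from the comment above; everything else is illustrative.
BRANCH_PREFIX = "refs/heads/"
EXP_PREFIX = "refs/exps/"


def classify_ref(ref: str) -> str:
    """Classify a Git ref by the namespace its path lives in."""
    if ref.startswith(BRANCH_PREFIX):
        return "branch"
    if ref.startswith(EXP_PREFIX):
        return "experiment"
    return "other"


def start_experiment(refs: Dict[str, str], name: str, baseline_sha: str) -> str:
    """Create the experiment ref at once, pointing at the baseline commit."""
    ref = f"{EXP_PREFIX}{name}"
    refs[ref] = baseline_sha
    return ref


def finish_experiment(refs: Dict[str, str], ref: str, result_sha: str) -> None:
    """Move the same ref to the experiment's final commit once it completes."""
    refs[ref] = result_sha
```

With this option the experiment name is stable from the start, while the sha behind it changes once when the run completes.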

Edit: Also, it seems this is getting towards functionality in #5615

@dberenbaum
Collaborator Author

@shcheklein If we start with adding any running experiments and checkpoints to the table, whether in the workspace or not, is that enough for now? Do we want to open a separate ticket for stopping an experiment?

My opinions on how it should look:

  • Add some signal (similar to the asterisk for the queue) that shows which rows are for running experiments.
  • Having the final exp name at the start would be nice, but it's not necessary if it's easier to give temporary names like the queue today.
  • It's okay to leave cancelled experiments in the table or to drop them when cancelled. If they are left in the table, they probably need some other signal that they are cancelled/incomplete, and it would be nice to have a way to clean them up similar to dvc exp rm --queue.
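
The points above could be sketched roughly as follows. This is only an illustration in Python: the `ExpRow` type, the status names, and every marker except the queue asterisk are assumptions made for the example, not existing DVC behavior.

```python
from dataclasses import dataclass
from typing import List

# Only the asterisk for queued rows comes from the thread;
# the other markers are hypothetical placeholders.
MARKERS = {"queued": "*", "running": ">", "cancelled": "!", "done": " "}


@dataclass
class ExpRow:
    name: str
    status: str  # one of the MARKERS keys


def render_rows(rows: List[ExpRow]) -> List[str]:
    """Prefix each experiment name with the marker for its status."""
    return [f"{MARKERS[row.status]} {row.name}" for row in rows]


def drop_cancelled(rows: List[ExpRow]) -> List[ExpRow]:
    """Clean-up step for the case where cancelled rows are kept in the table."""
    return [row for row in rows if row.status != "cancelled"]
```

Keeping the status on the row, rather than encoding it in the name, means a temporary name can later be swapped for the final one without touching the marker logic.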

@shcheklein
Member

whether in the workspace or not, is that enough for now?

yes! that would be a great start

do we want to open a separate ticket for stopping an experiment?

yes, we can discuss and address it separately

My opinions on how it should look:

For VS Code we care about the API (--show-json), but that will probably be solved if we solve it in the table.
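
As a rough illustration of what a JSON consumer such as an editor extension might do, here is a small Python sketch. The payload shape (a flat mapping from experiment name to an entry with a boolean `running` key) is an assumption invented for this example and is not the actual `--show-json` schema.

```python
import json
from typing import List


def running_experiments(show_json: str) -> List[str]:
    """Return the names of entries flagged as running.

    Assumes a hypothetical payload of the form
    {"<name>": {"running": true, ...}, ...}; a missing key means not running.
    """
    data = json.loads(show_json)
    return [name for name, entry in data.items() if entry.get("running", False)]
```

Treating a missing `running` key as "not running" would let a consumer read output from a version that does not emit the flag at all.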

@dberenbaum
Collaborator Author

@pmrowla Any questions or concerns?

@pmrowla
Contributor

pmrowla commented May 20, 2021

  • It's okay to leave cancelled experiments in the table or to drop them when cancelled. If they are left in the table, they probably need some other signal that they are cancelled/incomplete, and it would be nice to have a way to clean them up similar to dvc exp rm --queue.

I'm not sure what you mean here? Whatever is cancelled or incomplete in the workspace is just the workspace row. If it's a queued run, cancelled or failed runs currently just go back into the queue so they can be retried. Do we want to change this behavior?
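
The current queue behavior described here could be modeled like this. It is a simplified Python sketch with hypothetical names; in particular, putting a failed run at the back of the queue is an assumption, since the comment only says it goes back into the queue.

```python
from collections import deque
from typing import Callable, Deque, List


def run_next(queue: Deque[str], run: Callable[[str], bool], completed: List[str]) -> None:
    """Run the first queued experiment; requeue it if it fails or is cancelled."""
    name = queue.popleft()
    try:
        ok = run(name)  # hypothetical runner: returns True on success
    except KeyboardInterrupt:
        ok = False  # treat a Ctrl-C cancellation like a failure
    if ok:
        completed.append(name)
    else:
        queue.append(name)  # assumption: retried entries go to the back
```

Under this model a cancelled run never produces a separate "cancelled" row; it is simply a queued entry again.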

@dberenbaum
Collaborator Author

I'm not sure what you mean here? Whatever is cancelled or incomplete in the workspace is just the workspace row. If it's a queued run, cancelled or failed runs currently just go back into the queue so they can be retried. Do we want to change this behavior?

No, that's fine. I'm not sure how that will work in combination with showing the running experiment, but I can wait and see what it looks like.

@dberenbaum dberenbaum removed the discussion requires active participation to reach a conclusion label May 25, 2021
@pmrowla pmrowla self-assigned this May 25, 2021
@dberenbaum dberenbaum added the A: experiments Related to dvc exp label Jun 16, 2021
4 participants