Epic: exp queue CLI #7592
Comments
Should we close #5615 in favor of this one? |
I think we can leave it open for now, this is more just tasking for the current planned work, but the original discussion contains some more features that are currently unplanned/out of scope |
@dberenbaum |
This is looking good now! I wanted to summarize some of the questions/issues I still have:
|
I would say the first two are critical to resolve before making it public. Also, there's one more general bug I have seen: sometimes the queue does not reflect the order in which experiments were added. Let me know if you need more info. |
I think the existing commands should essentially be aliases to the new behavior, and formally deprecated, so when the user runs them we would output something like
For the aliased behavior:
For now I think we can just add this to |
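The "alias plus deprecation notice" idea above could look roughly like the sketch below. The command names (`old-command`, `new-command`), the handler `new_command` and the message wording are all hypothetical placeholders, not the actual DVC commands or output discussed in this thread.

```python
import functools
import warnings


def deprecated_alias(old_name, new_name):
    """Wrap a handler so the old entry point warns, then delegates.

    Both names are placeholders used only for this illustration.
    """

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Tell the user which command to switch to, then run the
            # new behavior unchanged.
            warnings.warn(
                f"'{old_name}' is deprecated, use '{new_name}' instead.",
                FutureWarning,
                stacklevel=2,
            )
            return func(*args, **kwargs)

        return wrapper

    return decorator


def new_command(jobs=1):
    # Hypothetical handler standing in for the new queue behavior.
    return f"started {jobs} worker(s)"


# The old name becomes a thin alias over the new handler.
old_command = deprecated_alias("old-command", "new-command")(new_command)
```

Calling `old_command(jobs=2)` runs exactly the same code as `new_command(jobs=2)`, so the two cannot drift apart while the old name is being phased out.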
This is an underlying Celery issue: the basic filesystem broker which we use doesn't absolutely guarantee that tasks will be executed in FIFO order (although in most cases things do end up running in FIFO order). This is something we can fix upstream, but I think we should get any other issues on the DVC side sorted out first. |
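To illustrate what a FIFO guarantee involves for a file-per-task queue, here is a toy sketch. It is not Celery or kombu code: `ToyFileQueue`, its file naming scheme and `drain_in_order` are made up for illustration. The assumption is that directory listing order is arbitrary, so the enqueue order has to be encoded in the file names and recovered by sorting.

```python
import itertools
import os
import tempfile
import uuid


class ToyFileQueue:
    """Toy filesystem queue with one file per task (illustration only)."""

    def __init__(self, root):
        self.root = root
        self._seq = itertools.count()

    def put(self, payload):
        # A zero-padded sequence number sorts lexicographically in
        # enqueue order; the uuid suffix only keeps names unique.
        name = f"{next(self._seq):012d}_{uuid.uuid4().hex}.msg"
        with open(os.path.join(self.root, name), "w") as fobj:
            fobj.write(payload)

    def get(self):
        # os.listdir() returns entries in arbitrary order, so sort
        # explicitly before picking the oldest task.
        names = sorted(os.listdir(self.root))
        if not names:
            return None
        path = os.path.join(self.root, names[0])
        with open(path) as fobj:
            payload = fobj.read()
        os.remove(path)
        return payload


def drain_in_order(payloads):
    """Enqueue all payloads, then dequeue until the queue is empty."""
    with tempfile.TemporaryDirectory() as root:
        queue = ToyFileQueue(root)
        for payload in payloads:
            queue.put(payload)
        drained = []
        while (item := queue.get()) is not None:
            drained.append(item)
        return drained
```

The in-process counter only orders tasks added by a single producer. With several producers, the ordering key would need to come from something shared, which is part of why a strict guarantee is harder than it looks.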
This is also related to the
There's a couple questions that need to be clarified here:
Basically we need to define what exactly is the difference between
On the queue side of things, removing a failed or successful exp is meant to be used when the logs are no longer needed (it's a cleanup step). This is separate from removing the experiment git/dvc data (i.e.
So maybe we should just have a
This way we could have:
In the And in |
This is also related to the bug where generated final exp names for checkpoint exps can't be used with
We probably need to just revisit exp automatic naming. The simplest solution here would be to generate random exp names when they are queued, instead of using the current method of naming exps based on the pipeline result. However, this means we would lose the "prevent duplicate exps" behavior which we have now (but in previous discussions it's been raised that this may not be desirable behavior in the first place). Basically, if a user runs
If having duplicate exps is acceptable then we can just auto generate names right away, so queued exp names will always match the final ones. |
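As a rough sketch of "generate a random name at queue time", the snippet below picks a name up front and only checks it against names already in use. The word lists, the `adjective-noun` format and the `random_exp_name` helper are assumptions made for this example, not the naming scheme DVC uses.

```python
import random

# Tiny placeholder word lists; a real scheme would need far more
# combinations to keep collisions rare.
ADJECTIVES = ("brisk", "calm", "eager", "mellow")
NOUNS = ("otter", "pine", "quartz", "finch")


def random_exp_name(existing, rng=None, max_tries=1000):
    """Return a random name that is not already in `existing`.

    The name does not depend on the pipeline result, so it can be
    assigned when the experiment is queued and kept once it finishes.
    """
    if rng is None:
        rng = random.Random()
    for _ in range(max_tries):
        name = f"{rng.choice(ADJECTIVES)}-{rng.choice(NOUNS)}"
        if name not in existing:
            return name
    raise RuntimeError("could not find an unused experiment name")
```

Because the name no longer derives from the result, two runs with identical inputs would get different names, which is exactly the loss of duplicate detection mentioned above.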
This one was already ready, it calls |
What if there are other experiments in the queue?
It could be unexpected if users are running jobs where they assume
👍 We don't claim that |
👍 |
Do we need
👍 |
Having separate queue and exp names actually makes much more sense now that the queue is more distinct, and it might even be helpful to keep them separate given the discussion above about the differences between queued and completed experiments. I think it's fine to keep it for now if we clean up all the queue UI so that it clearly references queue IDs and tasks rather than experiments or revs.
Let's discuss in #7879 since I think it's out of scope for this epic. |
From planning discussion:
|
To clarify, I don't think
Other items:
Finally, are there any docs issues open for |
Once |
If our multi-worker support is only for testing, following multiple outputs simultaneously might not be needed for now. Here the requirement is that we automatically follow the currently
So for
I don't remember there being any. |
Yes, |
docs issue: iterative/dvc.org#3658 |
@dberenbaum @pmrowla |
@karajan1001 Do you mean |
Hi. It would be great to have a summary of the changes to existing commands and options (e.g. |
Followed from iterative#7592 1. Separate remove method into a new file. 2. Add queued, failed, processed flags to remove. 3. Add new unit tests for queue status --queue/fail/success 4. Implement methods to remove done tasks. 5. Add some new unit and functional tests for celery_queue.remove 6. Bump dvc-task to version 0.0.13
Meta-issue for initial planned exp queue issues. (This does not cover "nice to haves" as outlined in the notion proposal)
dvc-task/exp backend tasks:
exp queue CLI tasks: