Ability to configure the path of dvc.lock #5557

Closed
courentin opened this issue Mar 5, 2021 · 6 comments
Labels
awaiting response we are waiting for your reply, please respond! :)

Comments

@courentin
Contributor

Hello people,
My team and I are looking for a way to work with a debugging dataset that we could use to iterate faster and to run tests in CI (in the same spirit as what was done in git flow for dvc).
I was thinking of creating a dvc wrapper that would:

  1. Replace dvc.lock with dvc_debug.lock
  2. Edit params.yaml to set debug: true
  3. Run the dvc command
  4. Restore the original dvc.lock
  5. Restore the original params.yaml

Here is a more detailed implementation idea.

I feel it is a bit of a hack and I'd love to have your feedback on it.
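
The five wrapper steps listed above could be sketched roughly as follows in Python. This is only an illustration, not the implementation linked above: the helper names (`set_debug`, `debug_workspace`, `run_in_debug`) are hypothetical, params.yaml is assumed to have a top-level `debug:` key on its own line, and saving the lock file produced by the run back to dvc_debug.lock is an assumption that goes beyond the listed steps.

```python
import subprocess
from contextlib import contextmanager
from pathlib import Path


def set_debug(params_text, enabled):
    """Return the params text with a top-level `debug:` key set (naive, line-based)."""
    flag = "debug: true" if enabled else "debug: false"
    lines = params_text.splitlines()
    for i, line in enumerate(lines):
        if line.startswith("debug:"):
            lines[i] = flag
            break
    else:
        # No existing key: append one (assumption about the desired behavior).
        lines.append(flag)
    return "\n".join(lines) + "\n"


@contextmanager
def debug_workspace(root="."):
    """Swap in dvc_debug.lock and debug params, then restore both on exit."""
    root = Path(root)
    lock = root / "dvc.lock"
    debug_lock = root / "dvc_debug.lock"
    backup = root / "dvc.lock.orig"  # hypothetical backup name
    params = root / "params.yaml"

    original_params = params.read_text()
    had_lock = lock.exists()
    if had_lock:
        lock.replace(backup)  # set the real lock file aside
    if debug_lock.exists():
        lock.write_bytes(debug_lock.read_bytes())  # step 1
    params.write_text(set_debug(original_params, True))  # step 2
    try:
        yield root  # step 3 runs inside the `with` block
    finally:
        if lock.exists():
            lock.replace(debug_lock)  # assumption: keep the updated debug lock
        if had_lock:
            backup.replace(lock)  # step 4
        params.write_text(original_params)  # step 5


def run_in_debug(command, root="."):
    """Run a command (for example ["dvc", "repro"]) inside the debug workspace."""
    with debug_workspace(root):
        return subprocess.run(command, cwd=root).returncode
```

With this sketch, `run_in_debug(["dvc", "repro"])` would run the pipeline against the debug lock file and leave dvc.lock and params.yaml as they were, even if the command fails.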

I was wondering whether this could be part of dvc itself: if we could configure dvc to work with a different lock file, that would be very powerful. During debugging and CI tests we would just need to run something like: dvc config --local lock.path dvc_debug.lock (lock.path being the new option I'm proposing).
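
To make the proposal concrete, here is a rough sketch of how such a setting might be resolved. It is purely hypothetical: the `lock.path` option is the one proposed in this issue, not something dvc is known to support, and `resolve_lock_path` is an invented helper. It assumes that `dvc config --local` writes to .dvc/config.local in an INI-like format.

```python
import configparser
from pathlib import Path

DEFAULT_LOCK = "dvc.lock"


def resolve_lock_path(repo_root="."):
    """Return the lock file name from the hypothetical [lock] path setting."""
    config_file = Path(repo_root) / ".dvc" / "config.local"
    parser = configparser.ConfigParser()
    parser.read(config_file)  # a missing file is silently ignored
    # Fall back to the regular dvc.lock when the option is not set.
    return parser.get("lock", "path", fallback=DEFAULT_LOCK)
```

The point of the fallback is that repositories without the option would keep their current behavior.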

If this is something that would fit in the dvc product, I'd be happy to contribute.

@pmrowla
Contributor

pmrowla commented Mar 9, 2021

I'm wondering whether the 2.0 experiments feature would just work as an out-of-the-box solution for this?

If you use dvc exp run --temp, execution of your pipeline will be done in a separate temp directory without touching your main workspace at all. The resulting experiment is actually just a git commit under the hood, and that commit would contain a dvc.lock file with your "debugging" values (while your original workspace dvc.lock remains unchanged).
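
As a rough sketch of what that could look like from a script, assuming params.yaml has a `debug` key: the `--temp` flag is the one mentioned above, while overriding the parameter with `--set-param` is an added assumption, and its exact syntax may differ between dvc versions (check `dvc exp run --help`). The function names are hypothetical.

```python
import subprocess


def exp_run_command(debug=True, params_file="params.yaml"):
    """Build the argument list for a temp-directory experiment run."""
    value = "true" if debug else "false"
    return [
        "dvc", "exp", "run", "--temp",
        # Assumption: override the debug param for this experiment only.
        "--set-param", f"{params_file}:debug={value}",
    ]


def run_debug_experiment():
    """Run the debug experiment; requires dvc to be installed and on PATH."""
    return subprocess.run(exp_run_command(debug=True)).returncode
```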

@courentin
Contributor Author

Oh yes, I hadn't thought about that. I'll try a POC and let you know if I run into any limitations. Thank you @pmrowla!

@efiop efiop added the awaiting response we are waiting for your reply, please respond! :) label Mar 9, 2021
@courentin
Contributor Author

Unless I missed something, I don't think dvc experiments answer the question: how do we permanently keep track of 2 pipelines that differ based on the value of params.yaml:debug?

@efiop
Contributor

efiop commented Apr 26, 2021

@courentin Sorry for the delay. Could you elaborate, please?

Do I understand correctly that you want to switch lock files so that the whole pipeline is not re-run when the params are restored? If so, we also have the run-cache feature, which caches stage runs and will try to restore the lock file without re-running your commands if it finds that the hashes for the command and dependencies match an existing entry.
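
The idea behind that lookup can be illustrated with a toy sketch. This is not dvc's actual implementation or storage format: `stage_key` and `ToyRunCache` are hypothetical names, and the hashing scheme is an assumption chosen only to show the principle of keying a cached run on the command plus its dependency hashes.

```python
import hashlib
import json


def stage_key(cmd, dep_hashes):
    """Derive a cache key from the command and its dependency hashes."""
    payload = json.dumps({"cmd": cmd, "deps": dep_hashes}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


class ToyRunCache:
    """In-memory stand-in for a run cache."""

    def __init__(self):
        self._entries = {}

    def save(self, cmd, dep_hashes, out_hashes):
        # Remember which outputs this exact command + dependencies produced.
        self._entries[stage_key(cmd, dep_hashes)] = dict(out_hashes)

    def restore(self, cmd, dep_hashes):
        # Return the recorded outputs on a match, or None if a re-run is needed.
        return self._entries.get(stage_key(cmd, dep_hashes))
```

In this toy model, switching params back to a previously seen state gives the same key again, so the recorded outputs can be restored instead of re-running the stage.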

@karajan1001
Contributor

karajan1001 commented Apr 27, 2021

I'm sorry, I didn't quite understand the request.

Do you want to switch between different datasets with slightly different parameters?

If so, why not just create two pipelines?

> Here is a more detailed implementation idea.

Sorry, I do not have access to Lalilo.

@efiop
Contributor

efiop commented Jun 7, 2021

Closing as stale.

@efiop efiop closed this as completed Jun 7, 2021
4 participants