status: add missing to local workspace status #9532

Closed
dberenbaum opened this issue Jun 2, 2023 · 6 comments
Labels
  • A: status (Related to the dvc diff/list/status)
  • p2-medium (Medium priority, should be done, but less important)

Comments

@dberenbaum
Collaborator

In https://dvc.org/doc/command-reference/status#local-workspace-status, a dep or out can be in the following states:

  • new: An output is found in the workspace, but there is no corresponding file hash saved in the dvc.lock or .dvc file yet.
  • modified: An output or dependency is found in the workspace, but the corresponding file hash in the dvc.lock or .dvc file is not up to date.
  • deleted: The output or dependency is referenced in a dvc.lock or .dvc file, but does not exist in the workspace.
  • not in cache: An output exists in the workspace, and the corresponding file hash in the dvc.lock or .dvc file is up to date, but there is no corresponding cache file or directory.

If the dep/out is deleted and missing from the cache, can dvc report it as missing rather than deleted? The most common scenario is when data has not yet been pulled, and deleted is misleading.
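To make the proposal concrete, here is a minimal sketch of how the four documented states plus the proposed missing state could be told apart. This is not DVC's actual implementation: the function name `local_status` and its three inputs are hypothetical, and hashes are treated as opaque strings.

```python
from typing import Optional


def local_status(
    workspace_hash: Optional[str],
    recorded_hash: Optional[str],
    in_cache: bool,
) -> Optional[str]:
    """Hypothetical classifier for one dep/out (not DVC's real code).

    workspace_hash: hash of the file in the workspace, or None if absent.
    recorded_hash:  hash saved in dvc.lock or the .dvc file, or None if unrecorded.
    in_cache:       whether the recorded hash has a corresponding cache entry.
    Returns None when there is nothing to report.
    """
    if workspace_hash is None:
        if recorded_hash is None:
            return None  # neither tracked nor present
        # Proposed change: absent from both workspace and cache is
        # "missing" (most likely not pulled yet) rather than "deleted".
        return "deleted" if in_cache else "missing"
    if recorded_hash is None:
        return "new"
    if workspace_hash != recorded_hash:
        return "modified"
    if not in_cache:
        return "not in cache"
    return None  # up to date


# A freshly cloned repo where data has not been pulled:
print(local_status(None, "abc123", in_cache=False))  # missing
```

The only change relative to the documented states is the branch for an absent file, which now consults the cache before choosing a label.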

Related:

@dberenbaum dberenbaum added p2-medium Medium priority, should be done, but less important A: status Related to the dvc diff/list/status labels Jun 2, 2023
@dberenbaum
Collaborator Author

@daavoo Let me know what you think about this as you work on #9530. Since it would be a breaking change, it might be worth considering doing this (and maybe #8327) now.

@shcheklein
Member

One thing to consider: how will it look in the VS Code interface when someone actually deletes a file? I think it's currently marked as "D", similar to Git, and "M" stands for modified. Will we need to change those marks as well? Will it confuse people even more? (No easy answer here: clearly two use cases that are not easy to disambiguate 🤔)

@dberenbaum
Collaborator Author

> how will it look in the VS Code interface when someone actually deletes a file?

Would you want to change the mark for the deleted files or for the ones with the new "missing" status? I think D for actually deleted files is much more in line with what people expect (I can't find it now, but I recall people asking in both the CLI and VS Code why everything shows as deleted when they clone but don't pull). I don't think there's a Git analog for "missing", so we could come up with something else for that (agreed that it's not ideal that it starts with M).

@mattseddon Have you heard about it from users? What's your take?

@mattseddon
Contributor

We can only display one status (badge + decoration) per file. Under the above circumstances we show NC + .gitignored decoration to encourage the user to dvc pull. The SCM tree looks like this:

[image: screenshot of the SCM tree]

The files are duplicated under the "Uncommitted" and "Not In Cache" groups.

It would be good to make this more intuitive/less complicated.

I do feel like we would still have to explain exactly what "Missing" (or any other term) means. My $0.02 is that we need some succinct way of explaining that there is no local copy of the file/directory but there probably is one on the remote.

@dberenbaum
Collaborator Author

> I do feel like we would still have to explain exactly what "Missing" (or any other term) means. My $0.02 is that we need some succinct way of explaining that there is no local copy of the file/directory but there probably is one on the remote.

What do you think of "Not Pulled"?

@mattseddon
Contributor

> What do you think of "Not Pulled"?

I think it is fine. Would "not pulled" co-exist with "not in cache"? We could still end up with the files missing on the remote. Could we use the missing status then?... it gets complicated fast.
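One possible way to read the question above, sketched purely for illustration: for a tracked file that is absent from the workspace, "not pulled" and "missing" could be separated by whether the remote has a copy. The function `absent_status` and the `on_remote` flag are hypothetical and not part of DVC. The sketch also assumes the remote has already been queried, which a purely local status check would not normally do.

```python
def absent_status(in_cache: bool, on_remote: bool) -> str:
    """Hypothetical label for a tracked file that is absent from the workspace.

    in_cache:  the recorded hash has a local cache entry.
    on_remote: the recorded hash exists on the remote (assumed already checked).
    """
    if in_cache:
        # The data was available locally, so the user likely removed the file.
        return "deleted"
    if on_remote:
        # No local copy anywhere, but it can be fetched with dvc pull.
        return "not pulled"
    # No local copy and nothing on the remote to recover it from.
    return "missing"


for cache, remote in [(True, True), (False, True), (False, False)]:
    print(cache, remote, absent_status(cache, remote))
```

Files that are present in the workspace but lack a cache entry would keep the existing "not in cache" status under this reading, so the two statuses would not apply to the same file.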

@dberenbaum closed this as not planned on Apr 24, 2024