-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. Weβll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ls: handle broken symlink #7226
Conversation
dvc/repo/ls.py
Outdated
path = ( | ||
fs.path.name(fs_path) | ||
if fs_path == info | ||
else fs.path.relpath(info, fs_path) | ||
) | ||
|
||
try: | ||
metadata = fs.metadata(info) | ||
if metadata.output_exists or not dvc_only: | ||
ret[path] = { | ||
"isout": metadata.is_output, | ||
"isdir": metadata.isdir, | ||
"isexec": metadata.is_exec, | ||
} | ||
|
||
except FileNotFoundError: | ||
# if walk found it but there is no metadata, its broken symlink | ||
if not dvc_only: | ||
ret[path] = { | ||
"isout": False, | ||
"isdir": False, | ||
"isexec": False, | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that you are ignoring FileNotFoundError if dvc_only
, which could be a dangerous bug. Also duplicating code, so how about smth like:
try:
metadata = fs.metadata(info)
except FileNotFoundError:
if dvc_only:
raise
metadata = Metadata()
if metadata.output_exists or not dvc_only:
path = (
fs.path.name(fs_path)
if fs_path == info
else fs.path.relpath(info, fs_path)
)
ret[path] = {
"isout": metadata.is_output,
"isdir": metadata.isdir,
"isexec": metadata.is_exec,
}
Btw, how does CLI ls handle broken symlinks? Should we colorize it too? If so, how hard would that be? |
CLI ls handles it the same way as symlinks - we get a color for symlink. We could consider implementing it. The problem with that is that currently, we are differentiating the colors by determining whether the target is output/dir/executable. If none of the above, we use an extension to determine it. So far there has been no talking about links. In order to provide support, we would need to pass The question is whether we should do that. CLI ls is about inspecting the filesystem so it makes sense to mark the links. However the DVC is about our ML project - the most important is to differentiate between tracked and non tracked assets. I think w should not expand that. Even now, the behavior might be a bit counterintuitive. For example: if our pipeline will be producing executable, it will still be colored as output, because it is the first check we do. I think in this case we should not follow As an alternative, we could consider how to "mix" colors depending on the multiple properties of a given asset. |
8727a23
to
0f206c0
Compare
2f0a008
to
14bfd91
Compare
dvc/repo/ls.py
Outdated
if dvc_only: | ||
raise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the current approach (creating empty Metadata) I think it makes sense to remove this check.
it shouldn't matter if we use dvc_only
or not. In my initial version lack of this was a bug, silently skipping the error, but now it seems ok to remove it. Users should not be punished for a specific check. If there is a broken symlink, other commands will make sure of erroring out
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, in the case of dvc_only
files, we should never get here: repo_fs will be able to get the metadata even for the removed file
14bfd91
to
5f6755b
Compare
Fixes: #7120
β I have followed the Contributing to DVC checklist.
π If this PR requires documentation updates, I have created a separate PR (or issue, at least) in dvc.org and linked it here.
Thank you for the contribution - we'll try to review it as soon as possible. π