external outputs: broken if pipeline output doesn't exist during stage initialization #8757
Comments
Not sure if this should be a new issue, but I'm seeing the same problem now because we have Line 968 in 8f768ad
I didn't test its removal, but it was supposed to be a legacy option that is no longer needed: iterative/dvc-objects#176 (comment)
So @dberenbaum, to see if I got it right: is this because fsspec raises a FileNotFound error?
Correct. It also wasn't clear to me whether it's intended.
Bug Report
Description
S3 external outputs are broken for pipelines since 7211bd0 because of a bug in s3fs (and probably in other filesystems). They only break when running a stage for which an output doesn't already exist. When initializing the stage, DVC tries to remove the nonexistent output and raises a `FileNotFound` error.
Reproduce
`dvc repro` will break if there is an external output and that output does not exist yet. In a new repo, using some `<s3_path>` that doesn't exist yet, do this:
Expected
`dvc repro` shouldn't fail while removing outputs. In this case, it fails because of what seems like a bug or at least inconsistent behavior in fsspec. As mentioned in #5961 (comment), `output.remove` for s3fs and other async filesystems calls `_expand_path`. When the path doesn't exist and `recursive=True`, `_expand_path` raises `FileNotFoundError`. When `recursive=False`, it returns the path. It also returns the path for the `LocalFileSystem` regardless of whether `recursive=True`, so I'm not sure whether it was intended to raise an error only in this specific scenario.
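To make the inconsistency concrete, here is a minimal sketch that models the behavior described above with a toy in-memory class. `ToyAsyncFS` and `remove_if_exists` are hypothetical names invented for illustration; this is neither the real s3fs/fsspec code nor the actual DVC fix. It only mirrors the reported rule (a missing path raises when `recursive=True` and is returned when `recursive=False`) and shows one possible way a caller could tolerate it.

```python
class ToyAsyncFS:
    """Hypothetical stand-in mimicking the reported behavior; not the real s3fs."""

    def __init__(self, existing=()):
        self.paths = set(existing)

    def _expand_path(self, path, recursive=False):
        if recursive:
            prefix = path.rstrip("/") + "/"
            found = sorted(
                p for p in self.paths if p == path or p.startswith(prefix)
            )
            if not found:
                # Reported behavior: a missing path raises only when recursive=True.
                raise FileNotFoundError(path)
            return found
        # Reported behavior: with recursive=False the path is returned as-is.
        return [path]

    def rm(self, path, recursive=False):
        for p in self._expand_path(path, recursive=recursive):
            self.paths.discard(p)


def remove_if_exists(fs, path):
    """Remove `path` recursively, treating an already-missing path as a no-op.

    Returns True if something was removed, False if the path did not exist.
    """
    try:
        fs.rm(path, recursive=True)
    except FileNotFoundError:
        return False
    return True
```

With a guard like this, initializing a stage whose output does not exist yet would skip the removal instead of failing. Whether the right place for such a guard is the caller or the filesystem layer is the open question in this issue.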