index: skip collection for fs with invalid fsid #510

Merged
merged 1 commit into from
Mar 23, 2024

Conversation

pmrowla
Contributor

@pmrowla pmrowla commented Feb 21, 2024

related: iterative/dvc#10309

In dvcfs, reading fs.fsid forces us to load the erepo, which may raise an SCMError if we cannot clone the source repo. In that case we cannot collect anything from that fs, so we should skip collecting this data/storage entirely.
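
A minimal sketch of the skip behavior described above, not the actual dvc-data code. `FakeSCMError`, `BrokenFS`, `NoFsidFS`, `LocalFS`, and `collect_fsids` are hypothetical names used only for illustration. The sketch catches `Exception` rather than the `BaseException` shown in this PR's diff, so that interrupts still propagate.

```python
import logging

logger = logging.getLogger(__name__)


class FakeSCMError(Exception):
    """Hypothetical stand-in for the SCMError raised when a clone fails."""


class BrokenFS:
    """Hypothetical filesystem whose fsid lookup triggers a failing clone."""

    protocol = "dvc"

    @property
    def fsid(self):
        raise FakeSCMError("could not clone source repo")


class NoFsidFS:
    """Hypothetical filesystem that does not define fsid at all."""

    protocol = "s3"


class LocalFS:
    """Hypothetical filesystem with a working fsid."""

    protocol = "local"
    fsid = "local"


def collect_fsids(filesystems):
    """Return fsids for usable filesystems, skipping any whose fsid lookup fails."""
    collected = []
    for fs in filesystems:
        try:
            fsid = fs.fsid
        except (NotImplementedError, AttributeError):
            # fsid is simply not available: fall back to the protocol name.
            fsid = fs.protocol
        except Exception as exc:
            # fsid lookup itself is broken: nothing can be collected from this fs.
            logger.debug("skipping collection for fs with invalid fsid: %s", exc)
            continue
        collected.append(fsid)
    return collected
```

With these stand-ins, a filesystem without `fsid` still falls back to its protocol, while one whose `fsid` lookup raises is left out of the result.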

@pmrowla pmrowla self-assigned this Feb 21, 2024
@codecov-commenter

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (d89a3f2) 63.04% compared to head (03aea4b) 62.98%.

| Files | Patch % | Lines |
|---|---|---|
| src/dvc_data/index/collect.py | 25.00% | 3 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #510      +/-   ##
==========================================
- Coverage   63.04%   62.98%   -0.07%     
==========================================
  Files          62       62              
  Lines        4338     4341       +3     
  Branches      733      734       +1     
==========================================
- Hits         2735     2734       -1     
- Misses       1446     1449       +3     
- Partials      157      158       +1     


@@ -101,6 +101,12 @@ def collect( # noqa: C901, PLR0912
        fsid = data.fs.fsid
    except (NotImplementedError, AttributeError):
        fsid = data.fs.protocol
    except BaseException as exc:  # noqa: BLE001
Contributor Author

Even if we update dvcfs to re-raise the SCMError as an AttributeError instead when loading fs.fsid, this just pushes the problem further down the line. In this case we will still hit the same SCMError when trying to clone the repo again upon the next fs function call.

I'm not sure catching the broad exception here is right, though. We could re-raise a ValueError in dvcfs (and then catch that here instead of BaseException).

cc @efiop
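
A rough sketch of the narrower alternative mentioned above, assuming dvcfs were changed to wrap the clone failure in a `ValueError`. Nothing here is the real dvcfs API: `FakeSCMError`, `SketchDVCFS`, `failing_clone`, `working_clone`, and `resolve_fsid` are hypothetical names.

```python
class FakeSCMError(Exception):
    """Hypothetical stand-in for the SCMError raised when a clone fails."""


def failing_clone():
    """Hypothetical clone step that always fails."""
    raise FakeSCMError("could not clone source repo")


def working_clone():
    """Hypothetical clone step that returns a repo identifier."""
    return "abc123"


class SketchDVCFS:
    """Hypothetical fs that converts clone failures into ValueError in fsid."""

    protocol = "dvc"

    def __init__(self, clone):
        self._clone = clone

    @property
    def fsid(self):
        try:
            return "dvcfs_" + self._clone()
        except FakeSCMError as exc:
            # Re-raise as a narrower, documented error type for callers.
            raise ValueError("unable to determine fsid") from exc


def resolve_fsid(fs):
    """Return the fsid, the protocol as a fallback, or None to signal 'skip'."""
    try:
        return fs.fsid
    except (NotImplementedError, AttributeError):
        return fs.protocol
    except ValueError:
        # Only the specific, expected failure is handled here.
        return None
```

The collector then only needs to handle `ValueError`, and any other unexpected exception from `fsid` still surfaces to the caller.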

Contributor

@efiop efiop Feb 21, 2024

Yeah, BaseException indeed feels a bit too broad.

> In this case we will still hit the same SCMError when trying to clone the repo again upon the next fs function call.

This might be fine, as we try to connect to check for auth errors anyway. Not sure if AttributeError specifically is ideal though, maybe it is. Or BaseException might actually be alright to handle here like you did already, since it is about whatever fsid might throw at us... So it sounds like I've reached your conclusion as well here 😄 Though again, the error reporting is kinda odd...

Member

Should we still collect by falling back to data.fs.protocol?

Contributor Author

In the general case it would probably be better to have fallback behavior, but for dvcfs/gitfs, falling back here will just delay the failure to the next gitfs call (when it will retry cloning the repo and fail).
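
To illustrate the point above, here is a small sketch of why falling back to the protocol only postpones the error. It assumes a filesystem that retries the clone on every call that needs the repo. `FakeSCMError`, `LazyRepoFS`, and `fsid_with_fallback` are hypothetical names, not real dvcfs/gitfs code.

```python
class FakeSCMError(Exception):
    """Hypothetical stand-in for the SCMError raised when a clone fails."""


class LazyRepoFS:
    """Hypothetical fs that retries the clone on every call needing the repo."""

    protocol = "dvc"

    def __init__(self):
        self.clone_attempts = 0

    def _load_repo(self):
        self.clone_attempts += 1
        raise FakeSCMError("could not clone source repo")

    @property
    def fsid(self):
        self._load_repo()
        return "dvcfs_example"

    def ls(self, path):
        self._load_repo()
        return []


def fsid_with_fallback(fs):
    """Fall back to the protocol on any fsid failure (the option not taken)."""
    try:
        return fs.fsid
    except Exception:
        return fs.protocol


fs = LazyRepoFS()
fsid = fsid_with_fallback(fs)  # the fsid failure is masked by the fallback
try:
    fs.ls("data")
    second_call_failed = False
except FakeSCMError:
    # The very next fs call retries the clone and fails the same way.
    second_call_failed = True
```

In this sketch the fallback yields a usable-looking fsid, yet the first real filesystem call still fails, which is why skipping the fs up front is the simpler outcome.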

@skshetry skshetry merged commit 632a420 into iterative:main Mar 23, 2024
16 checks passed
Successfully merging this pull request may close these issues.

pull specific_file.dvc: SCM-Error when dvc import without access in same repository
4 participants