Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scm: use dulwich backend when fetching exps during clone/pull #9023

Merged
merged 1 commit into from
Feb 14, 2023

Conversation

skshetry
Copy link
Member

refspecs uses pygit2 backend by default, which does not support git credential helpers. This PR forces to use dulwich backend during clone/pull operations. Other uses of refspecs (in experiments in particular) is broken for users of git credential-helpers though.

Fixes #9016.

refspecs uses pygit2 backend by default, which does not support git credential
helpers. This PR forces to use dulwich backend during clone/pull operations.
@codecov
Copy link

codecov bot commented Feb 14, 2023

Codecov Report

Base: 93.07% // Head: 93.06% // Decreases project coverage by -0.02% ⚠️

Coverage data is based on head (683643d) compared to base (eff775c).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #9023      +/-   ##
==========================================
- Coverage   93.07%   93.06%   -0.02%     
==========================================
  Files         455      455              
  Lines       36753    36753              
  Branches     5316     5316              
==========================================
- Hits        34208    34203       -5     
- Misses       2023     2026       +3     
- Partials      522      524       +2     
Impacted Files Coverage Δ
dvc/external_repo.py 81.81% <100.00%> (ø)
dvc/repo/experiments/utils.py 82.08% <100.00%> (ø)
dvc/scm.py 86.42% <100.00%> (ø)
dvc/repo/experiments/queue/celery.py 85.92% <0.00%> (-1.86%) ⬇️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

Copy link
Contributor

@daavoo daavoo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not blocking this) Should we set up some tests using credential manager?

@skshetry
Copy link
Member Author

We don't run any dockerized tests in dvc at the moment. We could test in scmrepo, but the whole exp-refs/external_repo logic remains in dvc.

@skshetry
Copy link
Member Author

I can take a look though.

@pmrowla
Copy link
Contributor

pmrowla commented Feb 14, 2023

Seems like this should be resolved on the scmrepo end. We can move credentials.py up a level into scmrepo.git since the generic logic for checking if there is a configured credential for a given git URL is not dulwich specific. And from there we just have to check if there is a configured credentialstore for our remote URL in the pygit backend before doing fetch/push, and raise NotImplementedError when there is one.

(and from there DVC doesn't need to do anything, the automatic backend stuff will try pygit first and then fall back to dulwich when it has to)

@skshetry
Copy link
Member Author

Decided not to add, I think this should be natively supported by our backends. We cannot and should not test every kind of configuration.

@skshetry
Copy link
Member Author

just have to check if there is a configured credentialstore for our remote URL in the pygit backend

if we do that, we should just add support for git credential-helpers.

@pmrowla
Copy link
Contributor

pmrowla commented Feb 14, 2023

just have to check if there is a configured credentialstore for our remote URL in the pygit backend

if we do that, we should just add support for git credential-helpers.

We already have generic logic for doing this checking the configuration as it only requires reading .git/config which is supported in every backend. Actually using the credential helper in pygit requires adding support for it in libgit2, which I would consider non-trivial as it will require writing C

@skshetry
Copy link
Member Author

I'd just prioritize having parity between different backends instead of using complicated logic to choose backends. Implementing git credential-helpers will offer the same performance.

@skshetry
Copy link
Member Author

Actually using the credential helper in pygit requires adding support for it in libgit2, which I would consider non-trivial as it will require writing C

I looked into it briefly, and it seems we can implement it by extending pygit2.RemoteCallbacks.credentials().

@pmrowla
Copy link
Contributor

pmrowla commented Feb 14, 2023

Actually, looking at what's in credentials.py we can probably just reuse what the dulwich backend is already doing in the scmrepo pygit wrapper instead of needing to add upstream support

@skshetry
Copy link
Member Author

skshetry commented Feb 14, 2023

Yes, we can reuse most of the stuff, we need to interface with RemoteCallbacks.credentials() and return pygit2.UserPass() from it.

@skshetry skshetry merged commit ed1ab7f into iterative:main Feb 14, 2023
@skshetry skshetry deleted the fix-clone-pygit2 branch February 14, 2023 13:33
@dberenbaum
Copy link
Collaborator

@skshetry Thank you for the fix! Can you open a follow-up issue for adding credential helpers in pygit2?

@skshetry
Copy link
Member Author

iterative/scmrepo#190

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix fixes bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

pull: pulling from private github dataregistry repo with git credential helper not working anymore
4 participants