dvc import broken authentication #7898
Comments
CC @dtrifiro |
I reverted to version 2.9.5 and can now perform the dvc import of a remote resource as expected. Note that dvc update was also failing with the same authentication issue. |
Hey @wdixon, would you mind providing a full traceback? Thanks |
I don't have a configuration that I can run from the same system, but here is the same issue run from a Windows system (the traceback is less verbose, but the line numbers seem to match the traceback from the earlier Linux system). Does this have the info you need?
$ dvc doctor
DVC version: 2.11.0 (exe)
---------------------------------
Platform: Python 3.8.10 on Windows-10-10.0.19042-SP0
Supports:
azure (adlfs = 2021.9.1, knack = 0.8.2, azure-identity = 1.10.0),
gdrive (pydrive2 = 1.10.0),
gs (gcsfs = 2021.10.1),
hdfs (fsspec = 2021.10.1, pyarrow = 6.0.0),
webhdfs (fsspec = 2021.10.1),
http (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.0, aiohttp-retry = 2.4.6),
s3 (s3fs = 2021.10.1, boto3 = 1.17.106),
ssh (sshfs = 2021.11.2),
oss (ossfs = 2021.8.0),
webdav (webdav4 = 0.9.3),
webdavs (webdav4 = 0.9.3)
Cache types: hardlink, symlink
Cache directory: NTFS on C:\
Caches: local
Remotes: None
Workspace directory: NTFS on C:\
Repo: dvc, git
$ dvc import --verbose https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5
2022-06-16 11:22:48,163 DEBUG: Removing output 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5' of stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc'.
2022-06-16 11:22:48,163 DEBUG: Removing 'C:\cygwin64\home\200003581\onwing_s1b_keyframe\onwing_s1b_keyframe\ckpts\tmp\S1FV_PS_A_20220606-092610-0558-0.0003.hdf5'
Importing 'pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5 (https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git)' -> 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5'
2022-06-16 11:22:48,173 DEBUG: Computed stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc' md5: 'fb7f834659a494bd33f117228c03a0d7'
2022-06-16 11:22:48,173 DEBUG: 'md5' of stage: 'S1FV_PS_A_20220606-092610-0558-0.0003.hdf5.dvc' changed.
2022-06-16 11:22:48,173 DEBUG: Creating external repo https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git@None
2022-06-16 11:22:48,173 DEBUG: erepo: git clone 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to a temporary dir
2022-06-16 11:22:48,674 ERROR: failed to import 'pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5' from 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git'. - Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'
------------------------------------------------------------
Traceback (most recent call last):
File "scmrepo\git\backend\dulwich\__init__.py", line 196, in clone
File "dulwich\porcelain.py", line 443, in clone
File "dulwich\client.py", line 747, in clone
File "dulwich\client.py", line 824, in fetch
File "dulwich\client.py", line 2079, in fetch_pack
File "dulwich\client.py", line 1938, in _discover_references
File "dulwich\client.py", line 2219, in _http_request
dulwich.client.HTTPUnauthorized: No valid credentials provided
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "dvc\scm.py", line 145, in clone
File "scmrepo\git\__init__.py", line 143, in clone
File "scmrepo\git\backend\dulwich\__init__.py", line 199, in clone
scmrepo.exceptions.CloneError: Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "dvc\commands\imp.py", line 15, in run
File "dvc\repo\imp.py", line 6, in imp
File "dvc\repo\__init__.py", line 49, in wrapper
File "dvc\repo\scm_context.py", line 152, in run
File "dvc\repo\imp_url.py", line 83, in imp_url
File "funcy\decorators.py", line 45, in wrapper
File "dvc\stage\decorators.py", line 36, in rwlocked
File "funcy\decorators.py", line 66, in __call__
File "dvc\stage\__init__.py", line 535, in run
File "funcy\decorators.py", line 45, in wrapper
File "dvc\stage\decorators.py", line 36, in rwlocked
File "funcy\decorators.py", line 66, in __call__
File "dvc\stage\__init__.py", line 559, in _sync_import
File "dvc\stage\imports.py", line 47, in sync_import
File "dvc\dependency\repo.py", line 68, in download
File "dvc\dependency\repo.py", line 97, in get_used_objs
File "dvc\dependency\repo.py", line 111, in _get_used_and_obj
File "contextlib.py", line 113, in __enter__
File "dvc\external_repo.py", line 39, in external_repo
File "dvc\external_repo.py", line 169, in _cached_clone
File "funcy\decorators.py", line 45, in wrapper
File "funcy\flow.py", line 274, in wrap_with
File "funcy\decorators.py", line 66, in __call__
File "dvc\external_repo.py", line 239, in _clone_default_branch
File "dvc\scm.py", line 150, in clone
dvc.scm.CloneError: Failed to clone repo 'https://github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp8zsh7ht1dvc-clone'
------------------------------------------------------------
2022-06-16 11:22:48,689 DEBUG: Analytics is enabled.
2022-06-16 11:22:48,689 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp13dedzle']'
2022-06-16 11:22:48,705 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp13dedzle']' |
Hi @wdixon, yes. Thank you. This is a known issue (e.g. see #7670, #6586). The problem is that since 2.9.5, I'm assuming the http auth credentials are being passed to git using a credential helper defined in your git configuration. Clone breaks because
If you wish to use dvc > 2.9.5, the workaround for http auth is providing username/password in the URL you wish to import:
dvc import https://username:password@github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git pose/S1FV_PS_A_20220606-092610-0558-0.0003.hdf5
The good news is that there's an open PR (jelmer/dulwich#976) which adds support for credential helpers. If you want to test it out, you can try installing it:
pip install git+https://github.com/dtrifiro/dulwich.git@feature/credential-helper
Any feedback would be appreciated! |
Our backend enterprise git is configured to disallow username:password in the URL; however, it does accept tokens in the URL. Unfortunately this doesn't seem to work either, even though a token is a more common way to pass credentials than including the password in the clear. Using the token in the URL produces a different exception (see below).
Traceback (most recent call last):
File "scmrepo\git\backend\dulwich\__init__.py", line 196, in clone
File "dulwich\porcelain.py", line 443, in clone
File "dulwich\client.py", line 747, in clone
File "dulwich\client.py", line 824, in fetch
File "dulwich\client.py", line 2079, in fetch_pack
File "dulwich\client.py", line 1938, in _discover_references
File "dulwich\client.py", line 2223, in _http_request
dulwich.errors.GitProtocolError: unexpected http resp 403 for https://0b9ad3ae_rest_of_token@github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git/info/refs?service=git-upload-pack
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "dvc\scm.py", line 145, in clone
File "scmrepo\git\__init__.py", line 143, in clone
File "scmrepo\git\backend\dulwich\__init__.py", line 199, in clone
scmrepo.exceptions.CloneError: Failed to clone repo 'https://0b9ad3ae_rest_of_token@github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp2vfmbeqjdvc-clone'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "dvc\commands\imp.py", line 15, in run
File "dvc\repo\imp.py", line 6, in imp
File "dvc\repo\__init__.py", line 49, in wrapper
File "dvc\repo\scm_context.py", line 152, in run
File "dvc\repo\imp_url.py", line 83, in imp_url
File "funcy\decorators.py", line 45, in wrapper
File "dvc\stage\decorators.py", line 36, in rwlocked
File "funcy\decorators.py", line 66, in __call__
File "dvc\stage\__init__.py", line 535, in run
File "funcy\decorators.py", line 45, in wrapper
File "dvc\stage\decorators.py", line 36, in rwlocked
File "funcy\decorators.py", line 66, in __call__
File "dvc\stage\__init__.py", line 559, in _sync_import
File "dvc\stage\imports.py", line 47, in sync_import
File "dvc\dependency\repo.py", line 68, in download
File "dvc\dependency\repo.py", line 97, in get_used_objs
File "dvc\dependency\repo.py", line 111, in _get_used_and_obj
File "contextlib.py", line 113, in __enter__
File "dvc\external_repo.py", line 39, in external_repo
File "dvc\external_repo.py", line 169, in _cached_clone
File "funcy\decorators.py", line 45, in wrapper
File "funcy\flow.py", line 274, in wrap_with
File "funcy\decorators.py", line 66, in __call__
File "dvc\external_repo.py", line 239, in _clone_default_branch
File "dvc\scm.py", line 150, in clone
dvc.scm.CloneError: Failed to clone repo 'https://0b9ad3ae_rest_of_token@github.build.company.com/auto-inspection/BIT_Tool_Artifacts.git' to 'C:\cygwin64\tmp\tmp2vfmbeqjdvc-clone'
------------------------------------------------------------
2022-06-16 12:37:28,104 DEBUG: Analytics is enabled.
2022-06-16 12:37:28,107 DEBUG: Trying to spawn '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp4cmxuv_z']'
2022-06-16 12:37:28,117 DEBUG: Spawned '['daemon', '-q', 'analytics', 'C:\\cygwin64\\tmp\\tmp4cmxuv_z']' |
This comment was marked as outdated.
The token does comply with the RFC. It is simply the case where no password is provided, only the user (as the token). Directly from the RFC text:
//<user>:<password>@<host>:<port>/<url-path>
password
An optional password. If present, it follows the user
name separated from it by a colon. |
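To illustrate the point about the optional password, here is a minimal sketch using only Python's standard `urllib.parse`. The helper name `split_credentials`, the host `git.example.com`, and the token value are hypothetical and not taken from DVC or dulwich. It shows that a URL carrying only a user part (the token) parses with no password at all, which is distinct from an empty password.

```python
from urllib.parse import urlsplit


def split_credentials(url):
    """Return (username, password) from the userinfo part of a URL.

    Hypothetical helper for illustration; not part of DVC or dulwich.
    """
    parts = urlsplit(url)
    return parts.username, parts.password


# Token supplied as the user, no colon and no password (hypothetical values).
token_only = split_credentials("https://example_token@git.example.com/org/repo.git")

# Explicit empty password: a colon follows the user, nothing after it.
empty_password = split_credentials("https://example_token:@git.example.com/org/repo.git")

print(token_only)      # password is None
print(empty_password)  # password is an empty string
```

Any client that consumes these values has to decide how to treat a missing password; treating it as an empty string is what the token-in-URL convention expects.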
When authenticating using basic auth and only providing username (no password) the default value of `None` was cast to string, resulting in an attempt to authenticate using `<username>:None` instead of providing an empty password. This broke authentication when providing an auth token as username. See iterative/dvc#7898 (comment)
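The failure mode described in that commit message can be sketched in a few lines of standard-library Python. This is not dulwich's actual code; the function names `basic_auth_buggy`, `basic_auth_fixed`, and `decode_basic_auth` are made up for illustration, and the sketch only assumes that the credentials string was built by formatting the username and password together.

```python
import base64


def basic_auth_buggy(username, password=None):
    # Formatting None into the string yields the literal text "None",
    # so the server receives "<username>:None" as the credentials.
    credentials = f"{username}:{password}"
    return "Basic " + base64.b64encode(credentials.encode()).decode()


def basic_auth_fixed(username, password=None):
    # Treat a missing password as an empty one, giving "<username>:".
    credentials = f"{username}:{password or ''}"
    return "Basic " + base64.b64encode(credentials.encode()).decode()


def decode_basic_auth(header):
    # Recover the "user:password" text from an Authorization header value.
    return base64.b64decode(header.split(" ", 1)[1]).decode()


# Hypothetical token used as the username, with no password.
print(decode_basic_auth(basic_auth_buggy("example_token")))
print(decode_basic_auth(basic_auth_fixed("example_token")))
```

With a token passed as the username, the buggy variant sends a non-empty password that the server never issued, which is consistent with the 403 seen earlier in the thread.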
I'm sorry, you're right. That is a bug in |
Recap:
|
@wdixon dulwich has merged the basic auth fix, so by using dulwich >= 0.20.44 you should be able to use |
@dtrifiro FWIW I tried installing
and I'm unfortunately still running into the same issue:
Downgrading to 2.9 fixes the issue. I'm on macOS. |
@d-miketa thanks! Could you post a full traceback using |
I got rid of the import but here's a similar issue when trying to do
|
And for the
|
Hey @d-miketa, thanks for the feedback. If you test out the same branch, it should be working now |
Thanks @dtrifiro, that did the trick!! :) |
Solved in |
We no longer seem to be able to import a remote dvc resource from a repository that requires authentication. This worked at some point prior to 2.10.
The output of the import is as follows:
A previous commit, which was part of the 2.11 release, appears to fix a similar issue:
issue #7670
however, updating to 2.11 did not fix the issue.
DVC version: 2.11.0 (pip)
---------------------------------
Platform: Python 3.8.0 on Linux-3.10.0-1160.66.1.el7.x86_64-x86_64-with-glibc2.27
Supports:
webhdfs (fsspec = 2022.5.0),
http (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
https (aiohttp = 3.8.1, aiohttp-retry = 2.4.6),
s3 (s3fs = 2022.5.0, boto3 = 1.21.21)
Cache types: hardlink, symlink
Cache directory: nfs on LEB1MLNAS.hpc.company.com:/leb1mlnas_projects
Caches: local
Remotes: None
Workspace directory: nfs on LEB1MLNAS.hpc.company.com:/leb1mlnas_projects
Repo: dvc, git