dvc.api.get_url: it doesn't find "md5" tag #7977
Comments
Hi Nelson, can you resolve this bug? I have the same problem.
I was able to resolve the problem. This worked for me:
I don't even think this is really a
Here's the example from the docs, but with a cloud remote:
$ dvc remote add -d -f cloud s3://dave-sandbox-versioning/test/
$ dvc add https://data.dvc.org/get-started/data.xml --to-remote
$ dvc pull -v
2023-02-28 07:46:27,683 DEBUG: v2.45.2.dev46+g35b648bbc, CPython 3.10.2 on macOS-13.1-arm64-arm-64bit
2023-02-28 07:46:27,683 DEBUG: command: /Users/dave/miniforge3/envs/dvc/bin/dvc pull -v
2023-02-28 07:46:28,162 WARNING: Output 'data.xml'(stage: 'data.xml.dvc') is missing version info. Cache for it will not be collected. Use `dvc repro` to get your pipeline up to date.
2023-02-28 07:46:28,167 WARNING: No file hash info found for '/Users/dave/repo/data.xml'. It won't be created.
1 file failed
2023-02-28 07:46:28,167 ERROR: failed to pull data from the cloud - Checkout failed for following targets:
/Users/dave/repo/data.xml
Is your cache up to date?
<https://error.dvc.org/missing-files>
Traceback (most recent call last):
File "/Users/dave/Code/dvc/dvc/commands/data_sync.py", line 31, in run
stats = self.repo.pull(
File "/Users/dave/Code/dvc/dvc/repo/__init__.py", line 58, in wrapper
return f(repo, *args, **kwargs)
File "/Users/dave/Code/dvc/dvc/repo/pull.py", line 47, in pull
stats = self.checkout(
File "/Users/dave/Code/dvc/dvc/repo/__init__.py", line 58, in wrapper
return f(repo, *args, **kwargs)
File "/Users/dave/Code/dvc/dvc/repo/checkout.py", line 109, in checkout
raise CheckoutError(stats["failed"], stats)
dvc.exceptions.CheckoutError: Checkout failed for following targets:
/Users/dave/repo/data.xml
Also, there's a question of what
For the record: looks like
Yeah, just missing a hash_name option in
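To make the hash-name point concrete, here is a minimal sketch of a lookup that does not assume "md5". It is not DVC's actual implementation: `pick_hash` and `HASH_NAMES` are hypothetical names, the output entry is assumed to be a plain dict as parsed from a .dvc file, and "checksum" is an assumed extra candidate beyond the "md5" and "etag" keys mentioned in this thread.

```python
from typing import Mapping, Sequence, Tuple

# Candidate hash field names. "md5" and "etag" appear in this issue;
# "checksum" is an assumption added for illustration.
HASH_NAMES: Tuple[str, ...] = ("md5", "etag", "checksum")


def pick_hash(out: Mapping[str, object],
              hash_names: Sequence[str] = HASH_NAMES) -> Tuple[str, str]:
    """Return (hash_name, hash_value) for the first candidate present in `out`.

    `out` is assumed to be one entry of the `outs` list of a parsed .dvc file.
    """
    for name in hash_names:
        value = out.get(name)
        if value:
            return name, str(value)
    # Fail with a message that lists what was tried instead of a bare KeyError.
    raise LookupError(
        f"no hash field found; tried {list(hash_names)}, got keys {sorted(out)}"
    )


# Placeholder entry shaped like the one described in the report.
entry = {"etag": "placeholder-etag", "path": "finantials.csv"}
print(pick_hash(entry))
```

Passing the candidate names as a parameter mirrors the idea of a hash_name option: the caller decides which field to read rather than the lookup hardcoding one.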
Bug Report
Description
I'm trying to use dvc.api.get_url to read a DataFrame from Google Cloud Storage, but the .csv.dvc file was created with an "etag" keyword instead of an "md5" keyword. The Python call api.get_url() looks for an "md5" keyword and raises an error.
Reproduce
These are the steps I followed on macOS with zsh:
python -m venv deploy_env
source deploy_env/bin/activate
python -m pip install --upgrade pip
Create these requirements in path requirements/dev.txt
pip install -r requirements/dev.txt
dvc init
dvc remote add dataset-track gs://model-data-tracker-775/dataset
dvc remote add model-track gs://model-data-tracker-775/model
dvc add dataset/finantials.csv --to-remote -r dataset-track
dvc add model/model.pkl --to-remote -r model-track
src/prepare.py
path:
rm dataset/finantials.csv
python src/prepare.py
Expected
An error about a nonexistent "md5" keyword:
Environment information
Output of dvc doctor:
Additional Information (if any):
When I run the pull command, it doesn't give me any error:
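As a quick way to confirm the situation described in this report, the sketch below scans the text of a .dvc file and lists which hash fields it records. It is only a diagnostic under stated assumptions: `hash_fields_in_dvc_text` is a hypothetical helper, the line-based matching is a stand-in for a real YAML parser, the sample content is a placeholder rather than the reporter's actual file, and "checksum" is an assumed candidate alongside "md5" and "etag".

```python
import re
from typing import List

# Field names to look for. "md5" and "etag" come from the report;
# "checksum" is an assumption.
KNOWN_HASH_FIELDS = ("md5", "etag", "checksum")

# Matches "key:" at the start of a line, optionally after a YAML list dash.
_KEY_RE = re.compile(r"^\s*(?:-\s+)?([\w-]+)\s*:")


def hash_fields_in_dvc_text(text: str) -> List[str]:
    """Return the known hash field names found in .dvc file text, in order."""
    found: List[str] = []
    for line in text.splitlines():
        match = _KEY_RE.match(line)
        if match and match.group(1) in KNOWN_HASH_FIELDS:
            if match.group(1) not in found:
                found.append(match.group(1))
    return found


# Placeholder content shaped like the file described in the report.
sample = """\
outs:
- etag: placeholder-etag
  size: 100
  path: finantials.csv
"""
print(hash_fields_in_dvc_text(sample))
```

If the result contains "etag" but not "md5", the file matches the case reported here, where a lookup that only reads "md5" fails.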