run -d hdfs://: output does not exist #7561
Labels: bug, fs: hdfs, p1-important
Bug Report
Description
I am trying to add an external dependency that is stored on HDFS. I am running DVC inside this Docker image: https://hub.docker.com/r/oneoffcoder/spark-jupyter, which has HDFS installed and configured. The command to start the container is listed on that page too.
When I run
dvc run -v --force -n download_file -d hdfs://localhost/data.csv -o data.csv hdfs dfs -copyToLocal hdfs://localhost/data.csv data.csv
I get the following error:
The file data.csv exists in HDFS:
Reproduce
docker exec -it <container_id> /bin/bash
apt-get update && apt-get install git
pip install "dvc[hdfs]"
dvc init
touch data.csv
hdfs classpath --glob
Expected
I expect a dvc.yaml file to be created that lists hdfs://localhost/data.csv as an external dependency.
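For reference, a rough sketch of what the generated dvc.yaml would presumably look like for the dvc run command above. This is an assumption based on the usual layout DVC writes for a stage with one dependency and one output; it is not output captured from an actual run.

```yaml
# Assumed result of the `dvc run -n download_file ...` command above (not captured output)
stages:
  download_file:
    cmd: hdfs dfs -copyToLocal hdfs://localhost/data.csv data.csv
    deps:
    - hdfs://localhost/data.csv   # external dependency on HDFS
    outs:
    - data.csv                    # local copy tracked as the stage output
```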
Environment information
Everything runs inside Docker. The only change is
Output of dvc doctor:
Additional Information (if any):
The issue is not present in DVC version 2.8.3