dvc add a file from data directory #3218

Closed
@dmpetrov

From a discussion with users: https://opendatascience.slack.com/archives/CGGLZJ119/p1579704007005600 (you need access to this community)

$ mkdir datadir
$ cp ~/whatever/* datadir/
$ dvc add datadir/
WARNING: Output 'datadir' of 'datadir.dvc' changed because it is 'modified'
To track the changes with git, run:

          git add datadir.dvc
$ cp ~/Downloads/newfile.csv datadir/jan2020.csv
$ dvc add datadir/jan2020.csv
ERROR: Paths for outs:                  # <-- error is terrible btw (not related to this issue).
'datadir'('datadir.dvc')
'datadir/file4'('datadir/file4.dvc')
overlap. To avoid unpredictable behaviour, rerun command with non overlapping outs paths.

The last command fails because the file is inside a tracked data directory, and you are supposed to update (dvc add) the entire directory. However, the intuition of some users is to add the single file.
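Until single-file adds work, the workaround described above can be sketched as a small shell helper. This is only an illustration, not part of DVC: the function names `find_tracked_parent` and `add_via_parent` are hypothetical, and the sketch assumes the `.dvc` file sits next to the tracked directory (as `datadir.dvc` does in the transcript) and that paths are given relative to the current directory.

```shell
# Hypothetical helper: print the nearest ancestor directory of PATH that has
# a sibling "<dir>.dvc" file, i.e. a directory that appears to be tracked as a whole.
find_tracked_parent() {
    dir=$(dirname "$1")
    while [ "$dir" != "." ] && [ "$dir" != "/" ]; do
        if [ -f "$dir.dvc" ]; then
            printf '%s\n' "$dir"
            return 0
        fi
        dir=$(dirname "$dir")
    done
    return 1
}

# Hypothetical wrapper: if the file lives inside a tracked directory,
# re-add that directory; otherwise add the file itself.
add_via_parent() {
    if parent=$(find_tracked_parent "$1"); then
        dvc add "$parent"
    else
        dvc add "$1"
    fi
}
```

With this in place, `add_via_parent datadir/jan2020.csv` would run `dvc add datadir` instead of failing with the overlap error.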

Ideally, this should work:

$ dvc add datadir/jan2020.csv
'jan2020.csv' was added to dir 'datadir' and 'datadir.dvc' changed because it is 'modified'
100% Add|██████████████████████████████████|1.00/1.00 [00:00<00:00,  2.76file/s]

To track the changes with git, run:

	git add datadir.dvc
