diff --git a/content/docs/command-reference/commit.md b/content/docs/command-reference/commit.md
index 97cb30c2a3..10604461dd 100644
--- a/content/docs/command-reference/commit.md
+++ b/content/docs/command-reference/commit.md
@@ -249,7 +249,6 @@ $ git status -s
M src/train.py
$ dvc status
-
train.dvc:
changed deps:
modified: src/train.py
@@ -275,7 +274,6 @@ dependencies ['src/train.py'] of 'train.dvc' changed.
Are you sure you commit it? [y/n] y
$ dvc status
-
Data and pipelines are up to date.
```
diff --git a/content/docs/command-reference/fetch.md b/content/docs/command-reference/fetch.md
index 7e6c843796..8a3bbd22cf 100644
--- a/content/docs/command-reference/fetch.md
+++ b/content/docs/command-reference/fetch.md
@@ -154,8 +154,8 @@ into our local cache.
```dvc
$ dvc status --cloud
...
- deleted: data/features/train.pkl
- deleted: model.pkl
+ deleted: data/features/train.pkl
+ deleted: model.pkl
$ dvc fetch
diff --git a/content/docs/command-reference/get.md b/content/docs/command-reference/get.md
index 21e682703e..8bc4c2e8b3 100644
--- a/content/docs/command-reference/get.md
+++ b/content/docs/command-reference/get.md
@@ -31,20 +31,19 @@ directory. (Analogous to `wget`, but for repos.)
> directories to download.
The `url` argument specifies the address of the DVC or Git repository containing
-the data source. Both HTTP and SSH protocols are supported for online repos
-(e.g. `[user@]server:project.git`). `url` can also be a local file system path
-to an "offline" repo (if it's a DVC repo without a default remote, instead of
-downloading, DVC will try to copy the target data from its cache).
+the data source. Both HTTP and SSH protocols are supported (e.g.
+`[user@]server:project.git`). `url` can also be a local file system path.
The `path` argument is used to specify the location of the target to download
within the source repository at `url`. `path` can specify any file or directory
-in the source repo, either tracked by DVC (including paths inside tracked
-directories) or by Git. Note that DVC-tracked targets must be found in a
-`dvc.yaml` or `.dvc` file of the repo.
-
-⚠️ The project should have a default
-[DVC remote](/doc/command-reference/remote), containing the actual data for this
-command to work.
+tracked by either Git or DVC (including paths inside tracked directories). Note
+that DVC-tracked targets must be found in a `dvc.yaml` or `.dvc` file of the
+repo.
+
+⚠️ DVC repos should have a default [DVC remote](/doc/command-reference/remote)
+containing the target data for this command to work. The only exception is for
+local repos, where DVC will try to copy the data from its cache
+first.
> See `dvc get-url` to download data from other supported locations such as S3,
> SSH, HTTP, etc.
diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md
index d4b283be2d..6eca56502b 100644
--- a/content/docs/command-reference/import-url.md
+++ b/content/docs/command-reference/import-url.md
@@ -109,8 +109,12 @@ $ dvc run -n download_data \
wget https://data.dvc.org/get-started/data.xml -O data.xml
```
-`dvc import-url` generates an import stage `.dvc` file and `dvc run` a regular
-stage (in `dvc.yaml`).
+`dvc import-url` generates an import stage `.dvc` file and
+`dvc run` a regular stage (in `dvc.yaml`).
+
+⚠️ DVC won't push or pull imported data to/from
+[remote storage](/doc/command-reference/remote); it will rely on its original
+source.
## Options
diff --git a/content/docs/command-reference/import.md b/content/docs/command-reference/import.md
index 341d4fd587..3f9d9f7a6b 100644
--- a/content/docs/command-reference/import.md
+++ b/content/docs/command-reference/import.md
@@ -34,21 +34,19 @@ updating the import later, if it has changed in its data source. (See
> directories to import.
The `url` argument specifies the address of the DVC or Git repository containing
-the data source. Both HTTP and SSH protocols are supported for online repos
-(e.g. `[user@]server:project.git`). `url` can also be a local file system path
-to an "offline" repo (if it's a DVC repo without a default remote, instead of
-downloading, DVC will try to copy the target data from its cache).
+the data source. Both HTTP and SSH protocols are supported (e.g.
+`[user@]server:project.git`). `url` can also be a local file system path.
The `path` argument is used to specify the location of the target to download
within the source repository at `url`. `path` can specify any file or directory
-in the source repo, either tracked by DVC (including paths inside tracked
-directories) or by Git. Note that DVC-tracked targets must be found in a
-`dvc.yaml` or `.dvc` file of the repo. Chained imports (importing data that was
-imported into the source repo at `url`) are not supported, however.
+tracked by either Git or DVC (including paths inside tracked directories). Note
+that DVC-tracked targets must be found in a `dvc.yaml` or `.dvc` file of the
+repo.
-⚠️ The project should have a default
-[DVC remote](/doc/command-reference/remote), containing the actual data for this
-command to work.
+⚠️ DVC repos should have a default [DVC remote](/doc/command-reference/remote)
+containing the target data for this command to work. The only exception is for
+local repos, where DVC will try to copy the data from its cache
+first.
> See `dvc import-url` to download and track data from other supported locations
> such as S3, SSH, HTTP, etc.
@@ -66,6 +64,10 @@ path in the workspace. It records enough metadata about the
imported data to enable DVC efficiently determining whether the local copy is
out of date.
+⚠️ DVC won't push or pull imported data to/from
+[remote storage](/doc/command-reference/remote); it will rely on its original
+source.
+
To actually [version the data](/doc/tutorials/get-started/data-versioning),
`git add` (and `git commit`) the import stage.
@@ -74,6 +76,9 @@ Note that import stages are considered always
they won't be updated. Use `dvc update` to update the downloaded data artifact
from the source repo.
+Also note that chained imports (importing data that was imported into the source
+repo at `url`) are not supported.
+
## Options
- `-o `, `--out ` - specify a path to the desired location in the
@@ -112,9 +117,10 @@ Importing 'data/data.xml (git@github.com:iterative/example-get-started)'
```
In contrast with `dvc get`, this command doesn't just download the data file,
-but it also creates an import stage (`.dvc` file) with a link to the data source
-(as explained in the description above). (This import stage can later be used to
-[update](/doc/command-reference/update) the import.) Check `data.xml.dvc`:
+but it also creates an import stage (`.dvc` file) with a link to
+the data source (as explained in the description above). (This import stage can
+later be used to [update](/doc/command-reference/update) the import.) Check
+`data.xml.dvc`:
```yaml
md5: 7de90e7de7b432ad972095bc1f2ec0f8
diff --git a/content/docs/command-reference/install.md b/content/docs/command-reference/install.md
index 35461f2ed5..24dac7b7d1 100644
--- a/content/docs/command-reference/install.md
+++ b/content/docs/command-reference/install.md
@@ -247,7 +247,6 @@ M model.pkl
M data/features/
$ dvc status
-
Data and pipelines are up to date.
```
diff --git a/content/docs/command-reference/list.md b/content/docs/command-reference/list.md
index 7347303170..651c5537ff 100644
--- a/content/docs/command-reference/list.md
+++ b/content/docs/command-reference/list.md
@@ -21,7 +21,7 @@ DVC, by effectively replacing data files, models, directories with `.dvc` files
files when you browse a DVC repository on Git hosting (e.g.
GitHub), you just see the `dvc.yaml` and `.dvc` files. This makes it hard to
navigate the project to find data artifacts for use with `dvc get`,
-`dvc import`, or `dvc.api`.
+`dvc import`, or `dvc.api` functions.
`dvc list` prints a virtual view of a DVC repository, as if files and
directories tracked by DVC were found directly in the remote Git repo. Only the
@@ -36,10 +36,9 @@ $ dvc pull
$ ls
```
-The `url` argument specifies the address of the Git repository containing the
-data source. Both HTTP and SSH protocols are supported for online repos (e.g.
-`[user@]server:project.git`). `url` can also be a local file system path to an
-"offline" Git repo.
+The `url` argument specifies the address of the DVC or Git repository containing
+the data source. Both HTTP and SSH protocols are supported (e.g.
+`[user@]server:project.git`). `url` can also be a local file system path.
The optional `path` argument is used to specify a directory to list within the
source repository at `url` (including paths inside tracked directories). It's
diff --git a/content/docs/command-reference/metrics/diff.md b/content/docs/command-reference/metrics/diff.md
index daec243ab7..dd381aecc3 100644
--- a/content/docs/command-reference/metrics/diff.md
+++ b/content/docs/command-reference/metrics/diff.md
@@ -41,7 +41,7 @@ lists all the current metrics without comparisons.
## Options
-- `--targets ` - limit command scope to these metric files. Using -R,
+- `--targets ` - limit command scope to these metric files. Using `-R`,
directories to search metric files in can also be given. When specifying
arguments for `--targets` before `revisions`, you should use `--` after this
option's arguments, e.g.:
diff --git a/content/docs/command-reference/move.md b/content/docs/command-reference/move.md
index f2bbe24df5..e564b45653 100644
--- a/content/docs/command-reference/move.md
+++ b/content/docs/command-reference/move.md
@@ -109,7 +109,7 @@ $ dvc commit -f
- `-v`, `--verbose` - displays detailed tracing information.
-## Example: change the file name
+## Example: Change the file name
We first use `dvc add` to track file with DVC. Then, we change its name using
`dvc move`.
@@ -130,7 +130,7 @@ $ tree
└── other.csv.dvc
```
-## Example: change the location
+## Example: Change a file location
We use `dvc add` to track a file with DVC, then we use `dvc move` to change its
location. If the target path is a directory and already exists, the data file is
@@ -166,7 +166,7 @@ $ tree
└── foo.dvc
```
-## Example: change an imported directory name and location
+## Example: Move a directory
Let's try the same with an entire directory imported from an external DVC
repository with `dvc import`. Note that, as in the previous cases, the
diff --git a/content/docs/command-reference/pull.md b/content/docs/command-reference/pull.md
index 1bcc61e8cc..4e5b65f640 100644
--- a/content/docs/command-reference/pull.md
+++ b/content/docs/command-reference/pull.md
@@ -192,6 +192,7 @@ such that the data in some of these stages should be updated in the
```dvc
$ dvc status -c
+...
deleted: data/features/test.pkl
deleted: data/features/train.pkl
deleted: model.pkl
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index e2151ccfcd..09def79221 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -149,9 +149,10 @@ Imagine the project has been modified such that the
```dvc
$ dvc status --cloud
- new: data/model.p
- new: data/matrix-test.p
- new: data/matrix-train.p
+...
+ new: data/model.p
+ new: data/matrix-test.p
+ new: data/matrix-train.p
```
One could do a simple `dvc push` to share all the data, but what if you only
@@ -258,7 +259,6 @@ $ tree ~/vault/recursive
10 directories, 10 files
$ dvc status --cloud
-
Data and pipelines are up to date.
```
diff --git a/content/docs/command-reference/status.md b/content/docs/command-reference/status.md
index f9a5766a0f..c45e63edf8 100644
--- a/content/docs/command-reference/status.md
+++ b/content/docs/command-reference/status.md
@@ -160,11 +160,11 @@ bar.dvc:
modified: bar
changed outs:
not in cache: foo
-foo.dvc
+foo.dvc:
changed outs:
deleted: foo
changed checksum
-prepare.dvc
+prepare.dvc:
changed outs:
new: bar
always changed
@@ -180,11 +180,11 @@ This shows that for stage `bar.dvc`, the dependency `foo` and the
```dvc
$ dvc status foo.dvc dobar
-foo.dvc
+foo.dvc:
changed outs:
deleted: foo
changed checksum
-dobar
+dobar:
changed deps:
modified: bar
changed outs:
@@ -220,7 +220,7 @@ $ dvc status model.p
Data and pipelines are up to date.
$ dvc status model.p --with-deps
-matrix-train.p
+matrix-train.p:
changed deps:
modified: code/featurization.py
```
@@ -243,10 +243,11 @@ remote yet:
```dvc
$ dvc status --remote storage
-new: data/model.p
-new: data/eval.txt
-new: data/matrix-train.p
-new: data/matrix-test.p
+...
+ new: data/model.p
+ new: data/eval.txt
+ new: data/matrix-train.p
+ new: data/matrix-test.p
```
The output shows where the location of the remote storage is, as well as any
diff --git a/content/docs/user-guide/dvcignore.md b/content/docs/user-guide/dvcignore.md
index cc0b0578b7..d826859e03 100644
--- a/content/docs/user-guide/dvcignore.md
+++ b/content/docs/user-guide/dvcignore.md
@@ -149,12 +149,10 @@ adding new file:
```dvc
$ dvc status
-
Data and pipelines are up to date.
$ mv data/data1 data/data3
$ dvc status
-
data.dvc:
changed outs:
modified: data
diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md
index 2a3846b616..e89283bb47 100644
--- a/content/docs/user-guide/external-dependencies.md
+++ b/content/docs/user-guide/external-dependencies.md
@@ -146,27 +146,39 @@ $ dvc run -n download_file \
-## Example: DVC remote aliases
+## Example: Using DVC remote aliases
-If instead of a URL you'd like to use an alias that can be managed
-independently, or if the external dependency location requires access
-credentials, you may use `dvc remote add` to define this location as a DVC
-Remote, and then use a special URL with format `remote://{remote_name}/{path}`
-to define an external dependency.
+You may want to encapsulate external locations as configurable entities that can
+be managed independently. This is useful if multiple dependencies (or stages)
+reuse the same location, or if it's likely to change in the future. And if the
+location requires authentication, you need a way to configure it in order to
+connect.
-For example, for an HTTPs remote/dependency:
+[DVC remotes](/doc/command-reference/remote) can do just this. You may use
+`dvc remote add` to define them, and then use a special URL with format
+`remote://{remote_name}/{path}` (remote alias) to define the external
+dependency.
+
+Let's see an example using SSH. First, register and configure the remote:
+
+```dvc
+$ dvc remote add myssh ssh://myserver.com
+$ dvc remote modify --local myssh user myuser
+$ dvc remote modify --local myssh password mypassword
+```
+
+> Please refer to `dvc remote add` for more details like setting up access
+> credentials for the different remote types.
+
+Now, use an alias to this remote when defining the stage:
```dvc
-$ dvc remote add example https://example.com
$ dvc run -n download_file \
- -d remote://example/data.txt \
+ -d remote://myssh/path/to/data.txt \
-o data.txt \
wget https://example.com/data.txt -O data.txt
```
-Please refer to `dvc remote add` for more details like setting up access
-credentials for the different remotes.
-
## Example: `import-url` command
In the previous examples, special downloading tools were used: `scp`,
@@ -205,11 +217,11 @@ determine whether the source has changed and we need to download the file again.
-## Example: Using import
+## Example: Imports
`dvc import` can download a data artifact from any DVC
-project or Git repository. It also creates an external dependency in its
-import `.dvc` file.
+project, or any file from a Git repository. It also creates an external
+dependency in its import `.dvc` file.
```dvc
$ dvc import git@github.com:iterative/example-get-started model.pkl
diff --git a/content/docs/user-guide/merge-conflicts.md b/content/docs/user-guide/merge-conflicts.md
index 33d77948a6..231cd3d9a2 100644
--- a/content/docs/user-guide/merge-conflicts.md
+++ b/content/docs/user-guide/merge-conflicts.md
@@ -103,11 +103,6 @@ To resolve conflicted `.dvc` files generated by `dvc import` or
`dvc import-url`, remove the conflicted hashes altogether:
```yaml
-< < < < < < < HEAD
-md5: 263395583f35403c8e0b1b94b30bea32
-=======
-md5: 520d2602f440d13372435d91d3bfa176
-> > > > > > > branch
frozen: true
deps:
- path: get-started/data.xml
@@ -115,15 +110,15 @@ deps:
url: https://github.com/iterative/dataset-registry
< < < < < < < HEAD
rev_lock: f31f5c4cdae787b4bdeb97a717687d44667d9e62
-=======
+= = = = = = =
rev_lock: 06be1104741f8a7c65449322a1fcc8c5f1070a1e
->>>>>>> branch
+> > > > > > > branch
outs:
< < < < < < < HEAD
- md5: a304afb96060aad90176268345e10355
-=======
+= = = = = = =
- md5: 35dd1fda9cfb4b645ae431f4621fa324
-> > > > > > >
+> > > > > > > branch
path: data.xml
```
@@ -139,4 +134,8 @@ outs:
- path: data.xml
```
-And then `dvc update` the `.dvc` file.
+And then `dvc update` the `.dvc` file to download the latest data from its
+original source.
+
+> Note that updating will bring in the latest version of the data found in its
+> source, which may not correspond with any of the hashes that were removed.
diff --git a/content/docs/user-guide/what-is-dvc.md b/content/docs/user-guide/what-is-dvc.md
index 18f86f0acb..045de098a8 100644
--- a/content/docs/user-guide/what-is-dvc.md
+++ b/content/docs/user-guide/what-is-dvc.md
@@ -1,6 +1,6 @@
# What Is DVC?
-**Data Version Control** is a new type of data versioning, workflow and
+**Data Version Control** is a new type of data versioning, workflow, and
experiment management software, that builds upon [Git](https://git-scm.com/)
(although it can work stand-alone). DVC reduces the gap between established
engineering tool sets and data science needs, allowing users to take advantage