respond to misc review comments
casperdcl committed Apr 9, 2021
1 parent 7ac3fb9 commit abd8af5
Showing 2 changed files with 14 additions and 12 deletions.
2 changes: 1 addition & 1 deletion content/docs/start/data-and-model-access.md
@@ -114,5 +114,5 @@ with dvc.api.open(
'get-started/data.xml',
repo='https://github.com/iterative/dataset-registry'
) as fd:
-# fd is a file descriptor which can be used here
+# fd is a file descriptor which can be processed normally
```
24 changes: 13 additions & 11 deletions content/docs/start/data-and-model-versioning.md
@@ -25,8 +25,8 @@ To start tracking a file or directory, use `dvc add`:

### ⚙️ Expand to get an example dataset.

-Having initialized a project in the previous section, get the data file which we
-will be using later like this:
+Having initialized a project in the previous section, we can get the data file
+(which we'll be using later) like this:

```dvc
$ dvc get https://github.com/iterative/dataset-registry \
@@ -48,16 +48,18 @@ $ dvc add data/data.xml
```

DVC stores information about the added file (or a directory) in a special `.dvc`
-file named `data/data.xml.dvc` - a small text file with a human-readable
-[format](/doc/user-guide/project-structure/dvc-files). This metadata file can be
-easily versioned like source code with Git. The original data, meanwhile, is
-listed in `.gitignore`:
+file named `data/data.xml.dvc` a small text file with a human-readable
+[format](/doc/user-guide/project-structure/dvc-files). This metadata file is a
+placeholder for the original data, and can be easily versioned like source code
+with Git:

```dvc
$ git add data/data.xml.dvc data/.gitignore
$ git commit -m "Add raw data"
```

+The original data, meanwhile, is listed in `.gitignore`.
+
<details>

### 💡 Expand to see what happens under the hood.
@@ -89,7 +91,7 @@ outs:
You can upload DVC-tracked data or model files with `dvc push`, so they're
safely stored [remotely](/doc/command-reference/remote). This also means they
can be retrieved on other environments later with `dvc pull`. First, we need to
-setup a storage provider:
+setup a storage location:

```dvc
$ dvc remote add -d storage s3://mybucket/dvcstore
@@ -103,7 +105,7 @@ $ git commit -m "Configure remote storage"

<details>

-### ⚙️ Expand to set up a remote storage provider ☁
+### ⚙️ Expand to set up a remote storage location.

DVC remotes let you store a copy of the data tracked by DVC outside of the local
cache (usually a cloud storage service). For simplicity, let's set up a _local
@@ -156,7 +158,7 @@ run it after `git clone` and `git pull`.

<details>

-### ⚙️ Expand to refresh the project ⟳
+### ⚙️ Expand to delete locally cached data.

If you've run `dvc push`, you can delete the cache (`.dvc/cache`) and
`data/data.xml` to experiment with `dvc pull`:
@@ -237,8 +239,8 @@ $ git commit data/data.xml.dvc -m "Revert dataset updates"

</details>

-Yes, DVC is technically not even a version control system! `.dvc` files' content
-defines data file versions. Git itself provides the version control. DVC in turn
+Yes, DVC is technically not even a version control system! `.dvc` file contents
+define data file versions. Git itself provides the version control. DVC in turn
creates these `.dvc` files, updates them, and synchronizes DVC-tracked data in
the <abbr>workspace</abbr> efficiently to match them.
