bulleted points end with : or . #495

Merged
merged 10 commits into from
Aug 12, 2019
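
The convention named in the PR title (every bulleted point ends with `.` or `:`) could also be checked mechanically. The following is a minimal sketch, not part of the DVC docs tooling: the function name `find_unterminated_bullets` is hypothetical, and it assumes top-level bullets start with `- ` at column 0 and wrapped lines are indented by two spaces. It does not skip fenced code blocks.

```python
import re

BULLET = re.compile(r"^- ")       # assumption: top-level bullets start with "- "
ALLOWED_ENDINGS = (".", ":")      # the endings required by the PR title


def find_unterminated_bullets(markdown_text):
    """Return (line_number, text) for the final line of every bullet item
    that does not end with one of ALLOWED_ENDINGS."""
    problems = []
    last = None  # most recent line of the bullet item currently open
    # The trailing "" is a sentinel that closes a bullet at end of input.
    for number, line in enumerate(markdown_text.splitlines() + [""], start=1):
        starts_new = bool(BULLET.match(line))
        continues = (
            not starts_new
            and last is not None
            and line.startswith("  ")
            and bool(line.strip())
        )
        if last is not None and not continues:
            # The open bullet has just ended, so check its final line.
            if not last[1].rstrip().endswith(ALLOWED_ENDINGS):
                problems.append(last)
            last = None
        if starts_new or continues:
            last = (number, line)
    return problems


if __name__ == "__main__":
    sample = "- first item.\n- second item wraps\n  and lacks punctuation\n- intro:\n"
    for number, text in find_unterminated_bullets(sample):
        print(number, text)
```

A check like this only flags candidates; bullets that deliberately end without punctuation would still need a human decision.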
Changes from 1 commit
2 changes: 1 addition & 1 deletion static/docs/commands-reference/fetch.md
@@ -100,7 +100,7 @@ specified in DVC-files currently in the workspace are considered by `dvc fetch`
 of a DVC-file ([experiments](/doc/get-started/experiments)), not just the
 current one.

-- `-T`, `--all-tags` - fetch cache for all tags. Similar to `-a` above
+- `-T`, `--all-tags` - fetch cache for all tags. Similar to `-a` above.

 - `--show-checksums` - show checksums instead of file names when printing the
 download progress.
8 changes: 4 additions & 4 deletions static/docs/commands-reference/import-url.md
@@ -23,10 +23,10 @@ In some cases it's convenient to add a data file or directory from a remote
 location into the workspace, such that it will be automatically updated (by
 `dvc repro`) when the external data source changes. Examples:

-- a remote system may produce occasional data files that are used in other
-projects;
-- a batch process running regularly updates a data file to import; and
-- a shared dataset on a remote storage that is managed and updated outside DVC.
+- A remote system may produce occasional data files that are used in other
+projects.
+- A batch process running regularly updates a data file to import.
+- A shared dataset on a remote storage that is managed and updated outside DVC.

 The `dvc import-url` command helps the user create such an external data
 dependency. The `url` argument specifies the external location of the data to be
12 changes: 6 additions & 6 deletions static/docs/commands-reference/index.md
@@ -1,17 +1,17 @@
 # Using DVC Commands

-DVC is a command-line tool. The typical use case for DVC goes as follows
+DVC is a command-line tool. The typical use case for DVC goes as follows:

-- In an existing Git repository, initialize a DVC repository with `dvc init`,
+- In an existing Git repository, initialize a DVC repository with `dvc init`.
 - Copy source code files for modeling into the repository and convert the files
-into DVC data files with `dvc add` command;
+into DVC data files with `dvc add` command.
 - Process raw data files through your data processing and modeling code using
-the `dvc run` command;
+the `dvc run` command.
 - Use `--outs` option to specify `dvc run` command outputs which will be
-converted to DVC data files after the code runs;
+converted to DVC data files after the code runs.
 - Clone a git repo with the code of your ML application pipeline. However, this
 will not copy your DVC cache. Use
 [data remotes](/doc/commands-reference/remote) and `dvc push` to share the
-cache (data);
+cache (data).
 - Use `dvc repro` to quickly reproduce your pipeline on a new iteration, after
 your data item files or source code of your ML application are modified.
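
For readers skimming this hunk, the workflow in the list above can be sketched as a sequence of command lines. This is only an illustration: the helper `typical_workflow` and the file names (`data.csv`, `process.py`, `model.p`) are hypothetical, the commands are built but never executed, and the exact flags accepted depend on the installed DVC version. Only `dvc init`, `dvc add`, `dvc run` with `-d` and `--outs`, `dvc push`, and `dvc repro` are taken from the documentation text in this PR.

```python
import shlex


def typical_workflow(data_file, deps, out, command):
    """Build (but do not run) the command lines for the workflow in the list.

    All arguments are illustrative placeholders supplied by the caller.
    """
    run = ["dvc", "run"]
    for dep in deps:
        run += ["-d", dep]            # one -d per dependency
    run += ["--outs", out]            # output to be tracked by DVC
    run += list(command)              # the actual processing command
    return [
        ["dvc", "init"],              # inside an existing Git repository
        ["dvc", "add", data_file],    # put a data file under DVC control
        run,
        ["dvc", "push"],              # assumes a data remote is configured
        ["dvc", "repro"],             # reproduce after data or code changes
    ]


if __name__ == "__main__":
    steps = typical_workflow(
        "data.csv", ["data.csv", "process.py"], "model.p",
        ["python", "process.py"],
    )
    for argv in steps:
        print(" ".join(shlex.quote(arg) for arg in argv))
```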
4 changes: 2 additions & 2 deletions static/docs/commands-reference/install.md
@@ -46,9 +46,9 @@ The installed Git hook automates executing `dvc push`.
 ## Installed Git hooks

 - Git `pre-commit` hook executes `dvc status` before `git commit` to inform the
-user about the workspace status;
+user about the workspace status.
 - Git `post-checkout` hook executes `dvc checkout` after `git checkout` to
-automatically synchronize the data files with the new workspace state;
+automatically synchronize the data files with the new workspace state.
 - Git `pre-push` hook executes `dvc push` before `git push` to upload files and
 directories under DVC control to remote.
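
The hook-to-command mapping described in this list can be summarized as data. The sketch below is an assumption-laden illustration, not what `dvc install` actually writes: the `hook_script` helper and the generated script text are hypothetical, and only the three hook names and the commands they run come from the documentation above.

```python
# Hook name -> DVC command, as listed in the documentation text above.
HOOKS = {
    "pre-commit": "dvc status",
    "post-checkout": "dvc checkout",
    "pre-push": "dvc push",
}


def hook_script(hook_name):
    """Return a minimal POSIX shell script for the given hook.

    The script body is a hypothetical example, not DVC's real hook content.
    """
    return "#!/bin/sh\nexec {}\n".format(HOOKS[hook_name])


if __name__ == "__main__":
    for name in HOOKS:
        print("# .git/hooks/{}".format(name))
        print(hook_script(name))
```

Git only runs a hook if the file in `.git/hooks/` carries the hook's exact name and is executable, so a script generated this way would still need to be written there with the right permissions.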

1 change: 0 additions & 1 deletion static/docs/commands-reference/pull.md
@@ -200,4 +200,3 @@ the `model.p.dvc` stage occurs later, its data was not pulled.
 Then we ran `dvc pull` specifying the last stage, `model.p.dvc`, and its data
 was downloaded. Finally, we ran `dvc pull` with no options to make sure that all
 data was already pulled with the previous commands.
-
dnabanita7 marked this conversation as resolved.
1 change: 0 additions & 1 deletion static/docs/commands-reference/push.md
@@ -339,4 +339,3 @@ Pipelines are up to date. Nothing to reproduce.

 And running `dvc status --cloud` verifies that indeed there are no more files to
 upload to the remote cache.
-
2 changes: 1 addition & 1 deletion static/docs/commands-reference/run.md
@@ -58,7 +58,7 @@ pipeline.
 dependencies can be specified like this: `-d data.csv -d process.py`. Usually,
 each dependency is a file or a directory with data, or a code file, or a
 configuration file. DVC also supports certain
-[external dependencies](/doc/user-guide/external-dependencies)
+[external dependencies](/doc/user-guide/external-dependencies).

 DVC builds a computation graph and this list of dependencies is a way to
 connect different stages with each other. When you run `dvc repro` to
6 changes: 3 additions & 3 deletions static/docs/commands-reference/unprotect.md
@@ -29,10 +29,10 @@ on this process.

 `dvc unprotect` can be an expensive operation (involves copying data), check
 first whether your task matches one of the cases that are considered safe, even
-when cache protected mode is enabled:
+when cache protected mode is enabled by:

-- Adding more files to a directory input data set (say, images or videos)
-- Deleting files from a directory data set
+- Adding more files to a directory input data set (say, images or videos).
+- Deleting files from a directory data set.

 ## Options
1 change: 1 addition & 0 deletions static/docs/commands-reference/version.md
@@ -125,3 +125,4 @@ Platform: Linux-4.15.0-50-generic-x86_64-with-debian-buster-sid
 Binary: False
 Filesystem type (workspace): ('ext4', '/dev/sdb3')
 ```
+
4 changes: 2 additions & 2 deletions static/docs/get-started/agenda.md
@@ -29,9 +29,9 @@ datasets, etc., then you may want to:
 - Capture and save those <abbr>data artifacts</abbr> the same way we capture
 code
 - Track and switch between different versions of the data easily
-- Being able to answer the question of how data artifacts (e.g. ML models) were
+- Be able to answer the question of how data artifacts (e.g. ML models) were
 built in the first place
-- Being able to compare them
+- Be able to compare them
 - Bring best practices to your team and get everyone on the same page

 Then you are in a good place! Click the `Next` button below to start ↘
32 changes: 19 additions & 13 deletions static/docs/user-guide/dvc-files-and-directories.md
@@ -5,8 +5,14 @@ Once initialized in a project, DVC populates its installation directory

 ### Special DVC internal files and directories

-`.dvc/config` - this is a configuration file. The config file can be edited by
-hand or with a special command: `dvc config`.
+- `.dvc/config` - this is a configuration file. The config file can be edited by
+hand or with a special command: `dvc config`.
+
+- `.dvc/config.local` - this is a local configuration file, that will overwrite
+options in `.dvc/config`. This is useful when you need to specify private
+options in your config that you don't want to track and share through Git
+(credentials, private locations, etc). The local config file can be edited by
+hand or with a special command: `dvc config --local`.

 - `.dvc/cache` - the [cache directory](#structure-of-cache-directory) will
 contain your data files. (The data directories of DVC repositories will only
@@ -19,22 +25,22 @@ hand or with a special command: `dvc config`.
 > the Git repository, only [DVC-files](/doc/user-guide/dvc-file-format) that
 > are needed to reproduce them.

-`.dvc/state` - this file is used for optimization. It is a SQLite db, that
-contains checksums for files in a project with respective timestamps and
-inodes to avoid unnecessary checksum computations. It also contains a list of
-links (from cache to workspace) created by dvc and is used to cleanup your
-workspace when calling `dvc checkout`.
+- `.dvc/state` - this file is used for optimization. It is a SQLite db, that
+contains checksums for files in a project with respective timestamps and
+inodes to avoid unnecessary checksum computations. It also contains a list of
+links (from cache to workspace) created by dvc and is used to cleanup your
+workspace when calling `dvc checkout`.

-`.dvc/state-journal` - temporary file for SQLite operations
+- `.dvc/state-journal` - temporary file for SQLite operations

-`.dvc/state-wal` - another SQLite temporary file
+- `.dvc/state-wal` - another SQLite temporary file

-`.dvc/updater` - this file is used store latest available version of dvc,
-which is used to remind user to upgrade.
+- `.dvc/updater` - this file is used store latest available version of dvc,
+which is used to remind user to upgrade.

-`.dvc/updater.lock` - a lock file for `.dvc/updater`.
+- `.dvc/updater.lock` - a lock file for `.dvc/updater`.

-`.dvc/lock` - a lock file for the whole dvc project.
+- `.dvc/lock` - a lock file for the whole dvc project.
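
The relationship between `.dvc/config` and `.dvc/config.local` described in the list above (local options take precedence) can be illustrated with Python's standard `configparser`. This is a rough sketch under assumptions, not DVC's own config loader: it assumes INI-style syntax, and the section names, option names, and values are illustrative only.

```python
import configparser

# Illustrative stand-in for .dvc/config (shared through Git).
SHARED_CONFIG = """\
[core]
remote = shared-storage

[cache]
type = hardlink
"""

# Illustrative stand-in for .dvc/config.local (private, not tracked by Git).
LOCAL_CONFIG = """\
[core]
remote = private-storage
"""


def effective_config(shared_text, local_text):
    """Merge two INI-style configs; options in the local one win."""
    parser = configparser.ConfigParser()
    parser.read_string(shared_text)  # read the shared config first
    parser.read_string(local_text)   # read last, so its options override
    return {section: dict(parser[section]) for section in parser.sections()}


if __name__ == "__main__":
    for section, options in effective_config(SHARED_CONFIG, LOCAL_CONFIG).items():
        print(section, options)
```

Options that appear only in the shared file survive the merge, which matches the description of the local file as overriding rather than replacing the shared one.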

 ## Structure of cache directory