docs: regular update (early August) #522

Merged: 16 commits, Aug 10, 2019
50212bd
import-url: try and update example; review term "Get Started" through…
jorgeorpinel Jul 30, 2019
c0cb7ed
update: change references to import stages being unlockable
jorgeorpinel Aug 4, 2019
be18aad
cmd ref: standardize example headers and expandables up to `dvc import`
jorgeorpinel Aug 4, 2019
a388712
cmd ref: standardize example headers and expandables up to `dvc update`
jorgeorpinel Aug 7, 2019
6946023
cmd ref: standardize example headers and expandables (finished)
jorgeorpinel Aug 7, 2019
e641349
term: update "workspace" glossary entry title and desc
jorgeorpinel Aug 8, 2019
f1dc617
cmd ref: updated shared example intro in checkout, commit, and fetch
jorgeorpinel Aug 8, 2019
2536b99
lint: Make YAML blocks valid
jorgeorpinel Aug 8, 2019
686561b
term: update "workspace" glossary entry and
jorgeorpinel Aug 9, 2019
fc28eb0
get-started: updates for more and better use of term "workspace"
jorgeorpinel Aug 9, 2019
8781a71
glossary: use new format with ES6 template literals
jorgeorpinel Aug 10, 2019
328b2ec
get-started: make noted text about git commit in add-files into a reg…
jorgeorpinel Aug 10, 2019
b949520
pipeline: small improvement in cmd ref short desc.
jorgeorpinel Aug 10, 2019
796e799
Merge remote-tracking branch 'upstream/master'
jorgeorpinel Aug 10, 2019
ee1cf35
tutorial: change `<<<<<<<` etc Git conflict markers to HTML symbols
jorgeorpinel Aug 10, 2019
6d1f6c1
term: Review "cache directory" usage and add <abbr> tags for glossary…
jorgeorpinel Aug 10, 2019
39 changes: 22 additions & 17 deletions static/docs/get-started/add-files.md
@@ -4,6 +4,11 @@ DVC allows storing and versioning data files, ML models, directories,
intermediate results with Git, without checking the file contents into Git.
Let's get a sample dataset to play with:

```dvc
$ mkdir data
$ wget https://dvc.org/s3/get-started/data.xml -O data/data.xml
```

<details>

### Expand if you're on Windows or having problems downloading from command line
@@ -16,47 +21,42 @@ into `data` subdirectory. To download, right-click

</details>

```dvc
$ mkdir data
$ wget https://dvc.org/s3/get-started/data.xml -O data/data.xml
```

To take a file (or a directory) under DVC control just run `dvc add`, it accepts
any file or directory:
To take a file (or a directory) under DVC control just run `dvc add` on it. For
example:

```dvc
$ dvc add data/data.xml
```

DVC stores information about your data file in a special DVC-file, that has a
human-readable [format](/doc/user-guide/dvc-file-format) and can be committed to
Git to track versions of your file:
DVC stores information about the added data in a special **DVC-file** that has a
human-readable [format](/doc/user-guide/dvc-file-format). It can be committed to
Git:

```dvc
$ git add data/.gitignore data/data.xml.dvc
$ git commit -m "add raw data to DVC"
```

> Committing these special files to Git allows us to track different versions of
> the data as it evolves with the source code under Git control.

<details>

### Expand to learn about DVC internals

You can see that the actual data file has been moved to the `.dvc/cache` directory,
while the entries in the workspace may be links to the actual files in the DVC
cache. (See
[File link types](/docs/user-guide/large-dataset-optimization#file-link-types-for-the-dvc-cache)
to learn about the supported file linking options, their tradeoffs, and how to
enable them).
cache.

```dvc
$ ls -R .dvc/cache
.dvc/cache/a3:
04afb96060aad90176268345e10355
```

where `a304afb96060aad90176268345e10355` is an MD5 hash of the `data.xml` file.
And if you check the `data/data.xml.dvc` DVC-file you will see that it has this
hash inside.
`a304afb96060aad90176268345e10355` from above is an MD5 hash of the `data.xml`
file we just added to DVC. And if you check the `data/data.xml.dvc` DVC-file you
will see that it has this hash inside.
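
As a rough illustration of the layout in the listing above, the cache path can be derived from the hash by splitting off its first two characters. This is only a sketch: `cache_path` is a hypothetical helper, not a DVC command, and it assumes the two-character subdirectory layout shown by `ls -R .dvc/cache`.

```shell
# Hypothetical helper (not part of DVC): build the cache path for an MD5 hash,
# assuming the layout shown in the `ls -R .dvc/cache` output above.
cache_path() {
  hash="$1"
  dir=$(printf '%s' "$hash" | cut -c1-2)   # first two characters name the subdirectory
  file=$(printf '%s' "$hash" | cut -c3-)   # remaining characters name the cached file
  printf '.dvc/cache/%s/%s\n' "$dir" "$file"
}

cache_path a304afb96060aad90176268345e10355
```

For the hash above this prints `.dvc/cache/a3/04afb96060aad90176268345e10355`, matching the directory and file name in the listing.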

</details>

@@ -85,6 +85,11 @@ and `dvc config cache` for more information.

</details>

If your <abbr>workspace</abbr> uses Git, without DVC you would have to manually
put each data file or directory into `.gitignore`. DVC commands that take or
make files that will go under its control automatically take care of this for
you! (You just have to add the changes to Git.)
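
To make the manual alternative concrete, here is a minimal sketch of what ignoring a data file by hand could look like. `ignore_data_file` is a hypothetical helper, not a DVC command, and the `/data.xml`-style pattern written next to the file is an assumption about a reasonable ignore rule, not a statement of what DVC writes.

```shell
# Hypothetical helper (not a DVC command): add an ignore pattern for a data
# file to the .gitignore in the same directory, skipping duplicates.
ignore_data_file() {
  target="$1"
  dir=$(dirname "$target")
  pattern="/$(basename "$target")"   # assumed pattern style, anchored to that directory
  # -x matches whole lines only, -F treats the pattern as a fixed string
  grep -qxF -- "$pattern" "$dir/.gitignore" 2>/dev/null ||
    printf '%s\n' "$pattern" >> "$dir/.gitignore"
}
```

Running `ignore_data_file data/data.xml` and then `git add data/.gitignore` would approximate by hand the step that the DVC commands take care of.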

Refer to
[Data and Model Files Versioning](/doc/use-cases/data-and-model-files-versioning),
`dvc add`, and `dvc run` for more information on storing and versioning data
10 changes: 5 additions & 5 deletions static/docs/get-started/initialize.md
@@ -1,11 +1,11 @@
# Initialize

In order to start using DVC, you need first to initialize it in your project's
directory. DVC doesn't require Git and can work without any source control
management system, but for the best experience we recommend using DVC on top of
Git repositories.
In order to start using DVC, you need first to initialize it in your
<abbr>workspace</abbr>. DVC doesn't require Git and can work without any source
control management system, but for the best experience we recommend using DVC on
top of Git repositories.

If you don't have a directory for your project already, create it now with these
If you don't have a directory for this project already, create it now with these
commands:

```dvc