From 1b523f420336c223dad32ac0c4e4318d840fc0d7 Mon Sep 17 00:00:00 2001
From: Mihir Joshi <mihir67mj@gmail.com>
Date: Fri, 3 Jul 2020 19:48:48 +0530
Subject: [PATCH] Removed condescending language from Command Reference

---
 content/docs/command-reference/add.md        |  4 ++--
 content/docs/command-reference/checkout.md   |  2 +-
 content/docs/command-reference/commit.md     |  4 ++--
 content/docs/command-reference/fetch.md      | 12 ++++++------
 content/docs/command-reference/import-url.md |  2 +-
 content/docs/command-reference/import.md     | 17 ++++++++---------
 content/docs/command-reference/init.md       |  2 +-
 content/docs/command-reference/install.md    |  8 ++++----
 content/docs/command-reference/pull.md       | 18 +++++++++---------
 content/docs/command-reference/push.md       | 10 +++++-----
 content/docs/command-reference/repro.md      |  5 ++---
 content/docs/command-reference/run.md        | 12 ++++++------
 content/docs/command-reference/update.md     |  4 ++--
 13 files changed, 49 insertions(+), 51 deletions(-)

diff --git a/content/docs/command-reference/add.md b/content/docs/command-reference/add.md
index 7b923887b8..14e2bcd3ba 100644
--- a/content/docs/command-reference/add.md
+++ b/content/docs/command-reference/add.md
@@ -93,7 +93,7 @@ files, even when the directory is added as a whole. Examples: `dvc push`,
 
 As a rarely needed alternative, the `--recursive` option causes every file in
 the hierarchy to be added individually. A corresponding `.dvc` file will be
-generated for each file in he same location. This may be helpful to save time
+generated for each file in the same location. This may be helpful to save time
 adding several data files grouped in a structural directory, but it's
 undesirable for data directories with a large number of files.
 
@@ -186,7 +186,7 @@ pics
 └── dogs [more image files]
 ```
 
-Tracking a directory with DVC as simple as with a single file:
+Tracking a directory with DVC is as simple as it is with a single file:
 
 ```dvc
 $ dvc add pics
diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index baafff3655..e215d741e9 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -158,7 +158,7 @@ baseline-experiment <- First simple version of the model
 bigrams-experiment <- Uses bigrams to improve the model
 ```
 
-We can now just run `dvc checkout` that will update the most recent `model.pkl`,
+We can now run `dvc checkout` that will update the most recent `model.pkl`,
 `data.xml`, and other files that are tracked by DVC. The model file hash is
 defined in the `dvc.lock` file, and in the `data.xml.dvc` file for the
 `data.xml`:
diff --git a/content/docs/command-reference/commit.md b/content/docs/command-reference/commit.md
index 7b238473b7..22f93d1bb5 100644
--- a/content/docs/command-reference/commit.md
+++ b/content/docs/command-reference/commit.md
@@ -96,8 +96,8 @@ reproducibility in those cases.
 
 ## Examples
 
-Let's employ a simple <abbr>workspace</abbr> with some data, code, ML models,
-pipeline stages, such as the <abbr>DVC project</abbr> created for the
+Let's employ a <abbr>workspace</abbr> with some data, code, ML models, pipeline
+stages, such as the <abbr>DVC project</abbr> created for the
 [Get Started](/doc/tutorials/get-started). Then we can see what happens with
 `git commit` and `dvc commit` in different situations.
 
diff --git a/content/docs/command-reference/fetch.md b/content/docs/command-reference/fetch.md
index ac5ea9b2b5..7de1a9f55a 100644
--- a/content/docs/command-reference/fetch.md
+++ b/content/docs/command-reference/fetch.md
@@ -120,8 +120,8 @@ or `-T` options are used).
 
 ## Examples
 
-Let's employ a simple <abbr>workspace</abbr> with some data, code, ML models,
-pipeline stages, such as the <abbr>DVC project</abbr> created for the
+Let's employ a <abbr>workspace</abbr> with some data, code, ML models, pipeline
+stages, such as the <abbr>DVC project</abbr> created for the
 [Get Started](/doc/tutorials/get-started). Then we can see what happens with
 `dvc fetch` as we switch from tag to tag.
 
@@ -166,8 +166,8 @@ bigrams-experiment <- use bigrams to improve the model
 ## Example: Default behavior
 
 This project comes with a predefined HTTP
-[remote storage](/doc/command-reference/remote). We can now just run `dvc fetch`
-to download the most recent `model.pkl`, `data.xml`, and other DVC-tracked files
+[remote storage](/doc/command-reference/remote). We can now run `dvc fetch` to
+download the most recent `model.pkl`, `data.xml`, and other DVC-tracked files
 into our local <abbr>cache</abbr>.
 
 ```dvc
@@ -256,8 +256,8 @@ $ dvc status -c
 deleted: data/data.xml
 ```
 
-One could do a simple `dvc fetch` to get all the data, but what if you only want
-to retrieve the data up to our third stage, `train.dvc`? We can use the
+One could do a `dvc fetch` to get all the data, but what if you only want to
+retrieve the data up to our third stage, `train.dvc`? We can use the
 `--with-deps` (or `-d`) option:
 
 ```dvc
diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md
index 8380f9508c..74844f9c03 100644
--- a/content/docs/command-reference/import-url.md
+++ b/content/docs/command-reference/import-url.md
@@ -31,7 +31,7 @@ external data source changes. Example scenarios:
 - A shared dataset on a remote storage that is managed and updated outside DVC.
 
 > Note that `dvc get-url` corresponds to the first step this command performs
-> (just download the file or directory).
+> (just downloads the file or directory).
 
 The `dvc import-url` command helps the user create such an external data
 dependency without having to manually copying files from the supported remote
diff --git a/content/docs/command-reference/import.md b/content/docs/command-reference/import.md
index 8ad0f90cf4..5cc4803747 100644
--- a/content/docs/command-reference/import.md
+++ b/content/docs/command-reference/import.md
@@ -29,7 +29,7 @@ updating the import later, if it has changed in its data source. (See
 `dvc update`.)
 
 > Note that `dvc get` corresponds to the first step this command performs (just
-> download the data).
+> downloads the data).
 
 > See `dvc list` for a way to browse repository contents to find files or
 > directories to import.
@@ -102,7 +102,7 @@ from the source repo.
 
 ## Examples
 
-A simple case for this command is to import a dataset from an external <abbr>DVC
+A case for this command is to import a dataset from an external <abbr>DVC
 repository</abbr>, such as our
 [get started example repo](https://github.com/iterative/example-get-started).
 
@@ -170,10 +170,10 @@ deps:
 
 If `rev` is a Git branch or tag (where the underlying commit changes), the data
 source may have updates at a later time. To bring it up to date if so (and
-update `rev_lock` in the `.dvc` file), simply use `dvc update <stage>.dvc`. If
-`rev` is a specific commit hash (does not change), `dvc update` without options
-will not have an effect on the import stage. You may force-update it to a
-different commit with `dvc update --rev`:
+update `rev_lock` in the `.dvc` file), use `dvc update <stage>.dvc`. If `rev` is
+a specific commit hash (does not change), `dvc update` without options will not
+have an effect on the import stage. You may force-update it to a different
+commit with `dvc update --rev`:
 
 ```dvc
 $ dvc update --rev cats-dogs-v2
@@ -189,9 +189,8 @@ If you take a look at our <abbr>project</abbr>, you'll see that it's organized
 into different directories such as `tutorial/ver` and `use-cases/`, and these
 contain
 [`.dvc` files](/doc/user-guide/dvc-files-and-directories#dvc-files) that track
-different datasets. Given this simple structure, its data files can be easily
-shared among several other projects using `dvc get` and `dvc import`. For
-example:
+different datasets. Given this structure, its data files can be easily shared
+among several other projects using `dvc get` and `dvc import`. For example:
 
 ```dvc
 $ dvc get https://github.com/iterative/dataset-registry \
diff --git a/content/docs/command-reference/init.md b/content/docs/command-reference/init.md
index 54cbf603b4..7e9b8c95d4 100644
--- a/content/docs/command-reference/init.md
+++ b/content/docs/command-reference/init.md
@@ -125,7 +125,7 @@ include:
 
 - SCM other than Git is being used. Even though there are DVC features that
   require DVC to be run in the Git repo, DVC can work well with other version
-  control systems. Since DVC relies on simple
+  control systems. Since DVC relies on
   [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file) files to
   manage <abbr>pipelines</abbr>, data, etc, they can be added into any SCM thus
   providing large data files and directories versioning.
diff --git a/content/docs/command-reference/install.md b/content/docs/command-reference/install.md
index f815719b50..cd0a52782d 100644
--- a/content/docs/command-reference/install.md
+++ b/content/docs/command-reference/install.md
@@ -39,7 +39,7 @@ project's results (which implicitly commits them to DVC as well). This hook
 automates `dvc status` before `git commit` when needed, to remind the user to
 employ either `dvc commit` or `dvc repro`.
 
-**Push**: While publishing changes to the Git remote with `git push`, its easy
+**Push**: While publishing changes to the Git remote with `git push`, it's easy
 to forget that the `dvc push` command is necessary to upload new or updated data
 files and directories tracked by DVC to
 [remote storage](/doc/command-reference/remote).
@@ -117,8 +117,8 @@ repos:
 
 ## Examples
 
-Let's employ a simple <abbr>workspace</abbr> with some data, code, ML models,
-pipeline stages, such as the <abbr>DVC project</abbr> created in our
+Let's employ a <abbr>workspace</abbr> with some data, code, ML models, pipeline
+stages, such as the <abbr>DVC project</abbr> created in our
 [Get Started](/doc/tutorials/get-started) section. Then we can see what happens
 with `dvc install` in different situations.
 
@@ -261,7 +261,7 @@ matching what is referenced by the DVC files.
 To follow this example, start with the same workspace as before, making sure it
 is not in a _detached HEAD_ state by running `git checkout master`.
 
-If we simply edit one of the code files:
+If we edit one of the code files:
 
 ```dvc
 $ vi src/featurization.py
diff --git a/content/docs/command-reference/pull.md b/content/docs/command-reference/pull.md
index 17a6d66f20..eea998d4cb 100644
--- a/content/docs/command-reference/pull.md
+++ b/content/docs/command-reference/pull.md
@@ -37,11 +37,11 @@ The default remote is used (see `dvc config core.remote`) unless the `--remote`
 option is used. See `dvc remote` for more information on how to configure a
 remote.
 
-With no arguments, just `dvc pull` or `dvc pull --remote <name>`, it downloads
-only the files (or directories) missing from the workspace by searching all
-stages in [`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file)
-or [`.dvc`](/doc/user-guide/dvc-files-and-directories#dvc-files) files currently
-in the <abbr>project</abbr>. It will not download files associated with earlier
+With no arguments, `dvc pull` or `dvc pull --remote <name>`, it downloads only
+the files (or directories) missing from the workspace by searching all stages in
+[`dvc.yaml`](/doc/user-guide/dvc-files-and-directories#dvcyaml-file) or
+[`.dvc`](/doc/user-guide/dvc-files-and-directories#dvc-files) files currently in
+the <abbr>project</abbr>. It will not download files associated with earlier
 commits in the <abbr>repository</abbr> (if using Git), nor will it download
 files that have not changed.
 
@@ -113,8 +113,8 @@ reflinks or hardlinks to put it in the workspace without copying. See
 
 ## Examples
 
-Let's employ a simple <abbr>workspace</abbr> with some data, code, ML models,
-pipeline stages, such as the <abbr>DVC project</abbr> created for the
+Let's employ a <abbr>workspace</abbr> with some data, code, ML models, pipeline
+stages, such as the <abbr>DVC project</abbr> created for the
 [Get Started](/doc/tutorials/get-started). Then we can see what happens with
 `dvc pull`.
 
@@ -193,8 +193,8 @@ $ dvc status -c
 ...
 ```
 
-One could do a simple `dvc pull` to get all the data, but what if you only want
-to retrieve part of the data?
+One could do a `dvc pull` to get all the data, but what if you only want to
+retrieve part of the data?
 
 ```dvc
 $ dvc pull --with-deps featurize.dvc
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index 18e0d33fae..5dc57c8992 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -56,9 +56,9 @@ none are specified on the command line nor in the configuration. The default
 remote is used (see `dvc config core.remote`) unless the `--remote` option is
 used. See `dvc remote` for more information on how to configure a remote.
 
-With no arguments, just `dvc push` or `dvc push --remote REMOTE`, it uploads
-only the files (or directories) that are new in the local repository to remote
-storage. It will not upload files associated with earlier commits in the
+With no arguments, `dvc push` or `dvc push --remote REMOTE`, it uploads only the
+files (or directories) that are new in the local repository to remote storage.
+It will not upload files associated with earlier commits in the
 <abbr>repository</abbr> (if using Git), nor will it upload files that have not
 changed.
 
@@ -179,8 +179,8 @@ $ dvc status --cloud
 new: data/matrix-train.p
 ```
 
-One could do a simple `dvc push` to share all the data, but what if you only
-want to upload part of the data?
+One could do a `dvc push` to share all the data, but what if you only want to
+upload part of the data?
 
 ```dvc
 $ dvc push --with-deps matrix-train.p.dvc
diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md
index ea924c7a82..93ba194c6c 100644
--- a/content/docs/command-reference/repro.md
+++ b/content/docs/command-reference/repro.md
@@ -181,7 +181,7 @@ the
 best
 ```
 
-And runs a few simple transformations to filter and count numbers:
+And runs a few transformations to filter and count numbers:
 
 ```dvc
 $ dvc run -f filter.dvc -d text.txt -o numbers.txt \
@@ -196,8 +196,7 @@ $ dvc run -f Dvcfile -d numbers.txt -d process.py -M count.txt \
 > example because that's the default stage file name `dvc repro` will read
 > without having to provide any `targets`.
 
-Where `process.py` is a script that, for simplicity, just prints the number of
-lines:
+Where `process.py` is a script that, for simplicity, prints the number of lines:
 
 ```python
 import sys
diff --git a/content/docs/command-reference/run.md b/content/docs/command-reference/run.md
index f81780bd81..830ee532dc 100644
--- a/content/docs/command-reference/run.md
+++ b/content/docs/command-reference/run.md
@@ -175,9 +175,9 @@ $ dvc run -n my_stage './my_script.sh $MYENVVAR'
 - `-m <path>`, `--metrics <path>` - specify a metrics file produces by this
   stage. This option behaves like `-o` but registers the file in a `metrics`
   field inside the `dvc.yaml` stage. Metrics are usually small, human readable
-  files (JSON or YAML) with scalar numbers or other simple information that
-  describes a model (or any other data artifact). See `dvc metrics` to learn
-  more about _metrics_.
+  files (JSON or YAML) with scalar numbers or other information that describes a
+  model (or any other data artifact). See `dvc metrics` to learn more about
+  _metrics_.
 
 - `-M <path>`, `--metrics-no-cache <path>` - the same as `-m` except that DVC
   does not track the metrics file. This means that the file is not cached, so
@@ -376,9 +376,9 @@ $ dvc dag
 
 ## Example: Using parameter dependencies
 
-To use specific values inside a parameters file as dependencies, create a simple
-YAML file named `params.yaml` (default params file name, see `dvc params` to
-learn more):
+To use specific values inside a parameters file as dependencies, create a YAML
+file named `params.yaml` (default params file name, see `dvc params` to learn
+more):
 
 ```yaml
 seed: 20180226
diff --git a/content/docs/command-reference/update.md b/content/docs/command-reference/update.md
index ee8700301a..23929fa38d 100644
--- a/content/docs/command-reference/update.md
+++ b/content/docs/command-reference/update.md
@@ -73,8 +73,8 @@ Importing 'model.pkl (git@github.com:iterative/example-get-started)'
 As DVC mentions, the import stage (`.dvc` file) `model.pkl.dvc` is created. This
 [stage file](/doc/command-reference/run) is frozen by default though, so to
 [reproduce](/doc/command-reference/repro) it, we would need to run
-`dvc unfreeze` on it first, then `dvc repro` (and `dvc freeze` again). Let's
-just run `dvc update` on it instead:
+`dvc unfreeze` on it first, then `dvc repro` (and `dvc freeze` again). Let's run
+`dvc update` on it instead:
 
 ```dvc
 $ dvc update model.pkl.dvc