update tag names and descriptions in example-get-started/code/README.md

jorgeorpinel · jorgeorpinel · commit 2bd28e99bc0d · 2019-08-23T14:38:48.000-07:00
per c37bf05#r34812136
diff --git a/example-get-started/code/README.md b/example-get-started/code/README.md
@@ -1,7 +1,7 @@
 # DVC Get Started
 
 This is an auto-generated repository for use in https://dvc.org/doc/get-started.
-Please report any issues in
+Please report any issues in its source project,
 [example-repos-dev](https://github.com/iterative/example-repos-dev).
 
 ![](https://dvc.org/static/img/example-flow-2x.png)
@@ -34,8 +34,10 @@ $ source .env/bin/activate
 $ pip install -r src/requirements.txt
 ```
 
-This DVC project comes with a preconfigured remote DVC storage that has raw data
-(input), intermediate, and final results that are produced.
+This DVC project comes with a preconfigured DVC
+[remote storage](https://dvc.org/doc/commands-reference/remote) that holds raw
+data (input), intermediate, and final results that are produced. This is a
+read-only HTTP remote.
 
 ```console
 $ dvc remote list
@@ -87,38 +89,41 @@ are run in the DVC [get started](https://dvc.org/doc/get-started) guide. Feel
 free to checkout one of them and play with the DVC commands having the
 playground ready.
 
-- `0-empty` - empty Git repository.
-- `1-initialize` - DVC has been initialized. The `.dvc` with the cache directory
+- `0-empty`: Empty Git repository initialized.
+- `1-initialize`: DVC has been initialized. `.dvc/` with the cache directory
   created.
-- `2-remote` - remote HTTP storage initialized. It is a shared read only storage
+- `2-remote`: Remote HTTP storage initialized. It's a shared read-only storage
   that contains all data artifacts produced during next steps.
-- `3-add-file` - raw data file `data.xml` downloaded and put under DVC
-  control with [`dvc add`](https://man.dvc.org/add). First `.dvc` meta-file
-  created.
-- `4-source` - source code downloaded and put under Git control.
-- `5-preparation` - first DVC stage created using
+- `3-add-file`: Raw data file `data.xml` downloaded and put under DVC control
+  with [`dvc add`](https://man.dvc.org/add). First DVC-file (`.dvc` file
+  extension) created.
+- `4-source`: Source code downloaded and put under Git control.
+- `5-preparation`: First stage file (DVC-file) created using
   [`dvc run`](https://man.dvc.org/run). It transforms XML data into TSV.
-- `6-featurization` - feature extraction step added. It also includes the split
-  step for simplicity. It takes data in TSV format and produces two `.pkl` files
-  that contain serialized feature matrices.
-- `7-train` - the model training stage added. It produces `model.pkl` file - the
-  actual result that can be then deployed somewhere and classify questions.
-- `8-evaluate` - evaluate stage, we run it on a test dataset to see the AUC
-  value for the model. The result is dumped into a DVC metric file so that we
-  can compare it with other experiments later.
-- `9-bigrams` - bigrams experiment, code has been modified to extract more
+- `6-featurization`: Feature extraction stage created. It takes data in TSV
+  format and produces two `.pkl` files that contain serialized feature matrices.
+- `7-train`: Model training stage created. It produces the `model.pkl` file, the
+  actual result that can then get deployed to an app that implements NLP
+  classification.
+- `8-evaluate`: Evaluation stage. Runs the model on a test dataset to compute
+  its AUC value. The result is dumped into a DVC metric file so that
+  we can compare it with other experiments later.
+- `9-bigrams-model`: Bigrams experiment, code has been modified to extract more
   features. We run [`dvc repro`](https://man.dvc.org/repro) for the first time
   to illustrate how DVC can reuse cached files and detect changes along the
-  computational graph.
+  computational graph, regenerating the model with the updated data.
+- `10-bigrams-experiment`: Reproduce the evaluation stage with the bigrams-based
+  model.
 
 There are two additional tags:
 
-- `baseline-experiment` - the first end-to-end result that we performance metric
+- `baseline-experiment`: First end-to-end result that we have a performance metric
   for.
-- `bigrams-experiment` - second version of the experiment.
+- `bigrams-experiment`: Second experiment (model trained using bigrams
+  features).
 
-Both these tags could be used to illustrate `-a` or `-T` options across
-different [DVC commands](https://man.dvc.org/).
+These tags can be used to illustrate `-a` or `-T` options across different
+[DVC commands](https://man.dvc.org/).
 
 ## Project Structure