Commit 2bd28e9

update tag names and descriptions in example-get-started/code/README.md
per c37bf05#r34812136
1 parent a205fdf commit 2bd28e9

1 file changed: +30 -25 lines changed

example-get-started/code/README.md

Lines changed: 30 additions & 25 deletions
@@ -1,7 +1,7 @@
 # DVC Get Started
 
 This is an auto-generated repository for use in https://dvc.org/doc/get-started.
-Please report any issues in
+Please report any issues in its source project,
 [example-repos-dev](https://github.com/iterative/example-repos-dev).
 
 ![](https://dvc.org/static/img/example-flow-2x.png)
@@ -34,8 +34,10 @@ $ source .env/bin/activate
 $ pip install -r src/requirements.txt
 ```
 
-This DVC project comes with a preconfigured remote DVC storage that has raw data
-(input), intermediate, and final results that are produced.
+This DVC project comes with a preconfigured DVC
+[remote storage](https://dvc.org/doc/commands-reference/remote) that holds raw
+data (input), intermediate, and final results that are produced. This is a
+read-only HTTP remote.
 
 ```console
 $ dvc remote list
@@ -87,38 +89,41 @@ are run in the DVC [get started](https://dvc.org/doc/get-started) guide. Feel
 free to checkout one of them and play with the DVC commands having the
 playground ready.
 
-- `0-empty` - empty Git repository.
-- `1-initialize` - DVC has been initialized. The `.dvc` with the cache directory
+- `0-empty`: Empty Git repository initialized.
+- `1-initialize`: DVC has been initialized. `.dvc/` with the cache directory
 created.
-- `2-remote` - remote HTTP storage initialized. It is a shared read only storage
+- `2-remote`: Remote HTTP storage initialized. It's a shared read only storage
 that contains all data artifacts produced during next steps.
-- `3-add-file` - raw data file `data.xml` downloaded and put under DVC
-control with [`dvc add`](https://man.dvc.org/add). First `.dvc` meta-file
-created.
-- `4-source` - source code downloaded and put under Git control.
-- `5-preparation` - first DVC stage created using
+- `3-add-file`: Raw data file `data.xml` downloaded and put under DVC control
+with [`dvc add`](https://man.dvc.org/add). First DVC-file (`.dvc` file
+extension) created.
+- `4-source`: Source code downloaded and put under Git control.
+- `5-preparation`: First stage file (DVC-file) created using
 [`dvc run`](https://man.dvc.org/run). It transforms XML data into TSV.
-- `6-featurization` - feature extraction step added. It also includes the split
-step for simplicity. It takes data in TSV format and produces two `.pkl` files
-that contain serialized feature matrices.
-- `7-train` - the model training stage added. It produces `model.pkl` file - the
-actual result that can be then deployed somewhere and classify questions.
-- `8-evaluate` - evaluate stage, we run it on a test dataset to see the AUC
-value for the model. The result is dumped into a DVC metric file so that we
-can compare it with other experiments later.
-- `9-bigrams` - bigrams experiment, code has been modified to extract more
+- `6-featurization`: Feature extraction stage created. It takes data in TSV
+format and produces two `.pkl` files that contain serialized feature matrices.
+- `7-train`: Model training stage created. It produces `model.pkl` file – the
+actual result that can then get deployed to an app that implements NLP
+classification.
+- `8-evaluate`: Evaluation stage. Runs the model on a test dataset to produce
+its performance AUC value. The result is dumped into a DVC metric file so that
+we can compare it with other experiments later.
+- `9-bigrams-model`: Bigrams experiment, code has been modified to extract more
 features. We run [`dvc repro`](https://man.dvc.org/repro) for the first time
 to illustrate how DVC can reuse cached files and detect changes along the
-computational graph.
+computational graph, regenerating the model with the updated data.
+- `10-bigrams-experiment`: Reproduce the evaluation stage with the bigrams based
+model.
 
 There are two additional tags:
 
-- `baseline-experiment` - the first end-to-end result that we performance metric
+- `baseline-experiment`: First end-to-end result that we have performance metric
 for.
-- `bigrams-experiment` - second version of the experiment.
+- `bigrams-experiment`: Second experiment (model trained using bigrams
+features).
 
-Both these tags could be used to illustrate `-a` or `-T` options across
-different [DVC commands](https://man.dvc.org/).
+These tags can be used to illustrate `-a` or `-T` options across different
+[DVC commands](https://man.dvc.org/).
 
 ## Project Structure

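For readers who want to try the tags described in the updated README, here is a minimal sketch of how one of them might be explored from a clone of the generated repository. The helper name `explore_tag` and the `DRY_RUN` switch are assumptions made for illustration, not part of DVC or of this commit. By default the helper only prints the commands it would run.

```shell
#!/bin/sh
# Sketch only: `explore_tag` and the DRY_RUN switch are hypothetical names,
# not part of DVC or of this repository.
explore_tag() {
  tag="$1"
  run="echo"                              # default: only print the commands
  if [ "${DRY_RUN:-1}" = "0" ]; then run=""; fi
  $run git checkout "$tag"                # switch the workspace to that step
  $run dvc pull                           # download its data from the remote
  $run dvc metrics show -T                # -T: show metrics across all tags
}

explore_tag baseline-experiment
```

Setting `DRY_RUN=0` would run the commands for real, which assumes both `git` and `dvc` are installed and that the current directory is a clone of the example repository.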