iterative · jorgeorpinel · May 12, 2021 · Apr 6, 2021 · Apr 6, 2021 · Apr 7, 2021
diff --git a/content/docs/start/data-and-model-access.md b/content/docs/start/data-and-model-access.md
@@ -4,27 +4,27 @@ title: 'Get Started: Data and Model Access'
 
 # Get Started: Data and Model Access
 
-Okay, we've learned how to _track_ data and models with DVC, and how to commit
-their versions to Git. The next questions are: How can we _use_ these artifacts
-outside of the project? How do I download a model to deploy it? How to download
+We've learned how to _track_ data and models with DVC, and how to commit their
+versions to Git. The next questions are: How can we _use_ these artifacts
+outside of the project? How do we download a model to deploy it? How do we download
 a specific version of a model? Or reuse datasets across different projects?
 
 > These questions tend to come up when you browse the files that DVC saves to
-> remote storage, e.g.
+> remote storage (e.g.
 > `s3://dvc-public/remote/get-started/fb/89904ef053f04d64eafcc3d70db673` 😱
-> instead of the original files, name such as `model.pkl` or `data.xml`.
+> instead of the original file name such as `model.pkl` or `data.xml`).
 
 Read on or watch our video to see how to find and access models and datasets
 with DVC.
 
 https://youtu.be/EE7Gk84OZY8
 
-Remember those `.dvc` files `dvc add` generates? Those files (and `dvc.lock`
-that we'll cover later), have their history in Git, DVC remote storage config
-saved in Git contain all the information needed to access and download any
-version of datasets, files, and models. It means that a Git repository with
-<abbr>DVC files</abbr> becomes an entry point, and can be used instead of
-accessing files directly.
+Remember those `.dvc` files `dvc add` generates? Those files (and `dvc.lock`,
+which we'll cover later) have their history in Git. DVC's remote storage config
+is also saved in Git, and contains all the information needed to access and
+download any version of datasets, files, and models. It means that a Git
+repository with <abbr>DVC files</abbr> becomes an entry point, and can be used
+instead of accessing files directly.
 
 ## Find a file or directory
 
@@ -62,7 +62,7 @@ the data came from or whether new versions are available.
 ## Import file or directory
 
 `dvc import` also downloads any file or directory, while also creating a `.dvc`
-file that can be saved in the project:
+file (which can be saved in the project):
 
 ```dvc
 $ dvc import https://github.com/iterative/dataset-registry \
@@ -71,7 +71,7 @@ $ dvc import https://github.com/iterative/dataset-registry \
 
 This is similar to `dvc get` + `dvc add`, but the resulting `.dvc` files
 includes metadata to track changes in the source repository. This allows you to
-bring in changes from the data source later, using `dvc update`.
+bring in changes from the data source later using `dvc update`.
 
 <details>
 
@@ -83,7 +83,7 @@ bring in changes from the data source later, using `dvc update`.
 > `dvc import` downloads from [remote storage](/doc/command-reference/remote).
 
 `.dvc` files created by `dvc import` have special fields, such as the data
-source `repo`, and `path` (under `deps`):
+source `repo` and `path` (under `deps`):
 
 ```git
 +deps:
@@ -111,8 +111,8 @@ directly from within an application at runtime. For example:
 import dvc.api
 
 with dvc.api.open(
-        'get-started/data.xml',
-        repo='https://github.com/iterative/dataset-registry'
-        ) as fd:
-    # ... fd is a file descriptor that can be processed normally.
+    'get-started/data.xml',
+    repo='https://github.com/iterative/dataset-registry'
+) as fd:
+    contents = fd.read()  # fd is a regular file object that can be processed normally
 ```
diff --git a/content/docs/start/data-and-model-versioning.md b/content/docs/start/data-and-model-versioning.md
@@ -13,9 +13,9 @@ and seeing data files and machine learning models in the workspace. Or switching
 to a different version of a 100Gb file in less than a second with a
 `git checkout`.
 
-The foundation of DVC consists of a few commands that you can run along with
-`git` to track large files, directories, or ML model files. Think "Git for
-data". Read on or watch our video to learn about versioning data with DVC!
+The foundation of DVC consists of a few commands you can run along with `git` to
+track large files, directories, or ML model files. Think "Git for data". Read on
+or watch our video to learn about versioning data with DVC!
 
 https://youtu.be/kLKBcPonMYw
 
@@ -25,16 +25,16 @@ To start tracking a file or directory, use `dvc add`:
 
 ### ⚙️ Expand to get an example dataset.
 
-Having initialized a project in the previous section, get the data file we will
-be using later like this:
+Having initialized a project in the previous section, we can get the data file
+(which we'll be using later) like this:
 
 ```dvc
 $ dvc get https://github.com/iterative/dataset-registry \
           get-started/data.xml -o data/data.xml
 ```
 
-We use the fancy `dvc get` command to jump ahead a bit and show how Git repo
-becomes a source for datasets or models - what we call "data/model registry".
+We use the fancy `dvc get` command to jump ahead a bit and show how a Git repo
+becomes a source for datasets or models — what we call a "data/model registry".
 `dvc get` can download any file or directory tracked in a <abbr>DVC
 repository</abbr>. It's like `wget`, but for DVC or Git repos. In this case we
 download the latest version of the `data.xml` file from the
@@ -48,22 +48,24 @@ $ dvc add data/data.xml
 ```
 
 DVC stores information about the added file (or a directory) in a special `.dvc`
-file named `data/data.xml.dvc`, a small text file with a human-readable
-[format](/doc/user-guide/project-structure/dvc-files). This file can be easily
-versioned like source code with Git, as a placeholder for the original data
-(which gets listed in `.gitignore`):
+file named `data/data.xml.dvc` — a small text file with a human-readable
+[format](/doc/user-guide/project-structure/dvc-files). This metadata file is a
+placeholder for the original data, and can be easily versioned like source code
+with Git:
 
 ```dvc
 $ git add data/data.xml.dvc data/.gitignore
 $ git commit -m "Add raw data"
 ```
 
+The original data, meanwhile, is listed in `.gitignore`.
+
 <details>
 
 ### 💡 Expand to see what happens under the hood.
 
-`dvc add` moved the data to the project's <abbr>cache</abbr>, and linked\* it
-back to the <abbr>workspace</abbr>.
+`dvc add` moved the data to the project's <abbr>cache</abbr>, and
+<abbr>linked</abbr> it back to the <abbr>workspace</abbr>.
 
 ```dvc
 $ tree .dvc/cache
@@ -82,35 +84,31 @@ outs:
     path: data.xml
 ```
 
-> \* See
-> [Large Dataset Optimization](/doc/user-guide/large-dataset-optimization) and
-> `dvc config cache` for more info. on file linking.
-
 </details>
 
 ## Storing and sharing
 
 You can upload DVC-tracked data or model files with `dvc push`, so they're
 safely stored [remotely](/doc/command-reference/remote). This also means they
 can be retrieved on other environments later with `dvc pull`. First, we need to
-setup a storage:
+set up a remote storage location:
 
 ```dvc
 $ dvc remote add -d storage s3://mybucket/dvcstore
 $ git add .dvc/config
 $ git commit -m "Configure remote storage"
 ```
 
-> DVC supports the following remote storage types: Google Drive, Amazon S3,
-> Azure Blob Storage, Google Cloud Storage, Aliyun OSS, SSH, HDFS, and HTTP.
-> Please refer to `dvc remote add` for more details and examples.
+> DVC supports many remote storage types, including Amazon S3, SSH, Google
+> Drive, Azure Blob Storage, and HDFS. See `dvc remote add` for more details and
+> examples.
 
 <details>
 
-### ⚙️ Set up a remote storage
+### ⚙️ Expand to set up remote storage.
 
 DVC remotes let you store a copy of the data tracked by DVC outside of the local
-cache, usually a cloud storage service. For simplicity, let's set up a _local
+cache (usually a cloud storage service). For simplicity, let's set up a _local
 remote_:
 
 ```dvc
@@ -121,7 +119,7 @@ $ git commit .dvc/config -m "Configure local remote"
 
 > While the term "local remote" may seem contradictory, it doesn't have to be.
 > The "local" part refers to the type of location: another directory in the file
-> system. "Remote" is how we call storage for <abbr>DVC projects</abbr>. It's
+> system. "Remote" is what we call storage for <abbr>DVC projects</abbr>. It's
 > essentially a local data backup.
 
 </details>
@@ -160,7 +158,7 @@ run it after `git clone` and `git pull`.
 
 <details>
 
-### ⚙️ Expand to explode the project 💣
+### ⚙️ Expand to delete locally cached data.
 
 If you've run `dvc push`, you can delete the cache (`.dvc/cache`) and
 `data/data.xml` to experiment with `dvc pull`:
@@ -189,8 +187,8 @@ latest version:
 
 ### ⚙️ Expand to make some changes.
 
-For the sake of simplicity let's just double the dataset artificially (and
-pretend that we got more data from some external source):
+Let's say we obtained more data from some external source. We can pretend this
+is the case by doubling the dataset:
 
 ```dvc
 $ cp data/data.xml /tmp/data.xml
@@ -212,9 +210,8 @@ $ dvc push
 
 ## Switching between versions
 
-The regular workflow is to use `git checkout` first to switch a branch, checkout
-a commit, or a revision of a `.dvc` file, and then run `dvc checkout` to sync
-data:
+The regular workflow is to use `git checkout` first (to switch a branch or
+check out a `.dvc` file version) and then run `dvc checkout` to sync data:
 
 ```dvc
 $ git checkout <...>
@@ -225,41 +222,38 @@ $ dvc checkout
 
 ### ⚙️ Expand to get the previous version of the dataset.
 
-Let's cleanup the previous artificial changes we made and get the previous :
+Let's go back to the original version of the data:
 
 ```dvc
-$ git checkout HEAD^1 data/data.xml.dvc
+$ git checkout HEAD~1 data/data.xml.dvc
 $ dvc checkout
 ```
 
-Let's commit it (no need to do `dvc push` this time since the previous version
-of this dataset was saved before):
+Let's commit it (no need to do `dvc push` this time since this original version
+of the dataset was already saved):
 
 ```dvc
 $ git commit data/data.xml.dvc -m "Revert dataset updates"
 ```
 
 </details>
 
-Yes, DVC is technically not even a version control system! `.dvc` files content
-defines data file versions. Git itself provides the version control. DVC in turn
+Yes, DVC is technically not even a version control system! `.dvc` file contents
+define data file versions. Git itself provides the version control. DVC in turn
 creates these `.dvc` files, updates them, and synchronizes DVC-tracked data in
 the <abbr>workspace</abbr> efficiently to match them.
 
 ## Large datasets versioning
 
 In cases where you process very large datasets, you need an efficient mechanism
 (in terms of space and performance) to share a lot of data, including different
-versions of itself. Do you use a network attached storage? Or a large external
-volume?
-
-While these cases are not covered in the Get Started, we recommend reading the
-following sections next to learn more about advanced workflows:
+versions. Do you use network-attached storage (NAS)? Or a large external volume?
+You can learn more about advanced workflows using these links:
 
 - A shared [external cache](/doc/use-cases/shared-development-server) can be set
   up to store, version and access a lot of data on a large shared volume
   efficiently.
 - A quite advanced scenario is to track and version data directly on the remote
-  storage (e.g. S3). Check out
+  storage (e.g. S3). See
   [Managing External Data](https://dvc.org/doc/user-guide/managing-external-data)
   to learn more.