forked from hadley/r4ds
-
Notifications
You must be signed in to change notification settings - Fork 0
/
whole-game.qmd
44 lines (31 loc) · 2.89 KB
/
whole-game.qmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
# Whole game {#sec-whole-game-intro .unnumbered}
```{r}
#| results: "asis"
#| echo: false
source("_common.R")
```
The goal of the first part of this book is to introduce you the data science workflow including data **importing**, **tidying**, and data **exploration** as quickly as possible.
Data exploration is the art of looking at your data, rapidly generating hypotheses, quickly testing them, then repeating again and again and again.
The goal of data exploration is to generate many promising leads that you can later explore in more depth.
```{r}
#| echo: false
#| fig-alt: >
#| A diagram displaying the data science cycle: Import -> Tidy -> Explore
#| (which has the phases Transform -> Visualize -> Model in a cycle) ->
#| Communicate. Surrounding all of these is Communicate. Explore is highlighted.
knitr::include_graphics("diagrams/data-science-explore.png")
```
<!--# TO DO: Update figure to include import and tidy as well. -->
In this part of the book, you will learn several useful tools that have an immediate payoff:
- Visualisation is a great place to start with R programming, because the payoff is so clear: you get to make elegant and informative plots that help you understand data.
In [Chapter -@sec-data-visualisation] you'll dive into visualization, learning the basic structure of a ggplot2 plot, and powerful techniques for turning data into plots.
- Visualisation alone is typically not enough, so in [Chapter -@sec-data-transform], you'll learn the key verbs that allow you to select important variables, filter out key observations, create new variables, and compute summaries.
- In [Chapter -@sec-data-tidy], you'll learn about tidy data, a consistent way of storing your data that makes transformation, visualization, and modelling easier.
You'll learn the underlying principles, and how to get your data into a tidy form.
- Before you can transform and visualize your data, you need to first get your data into R.
In [Chapter -@sec-data-import] you'll learn the basics of getting plain-text, rectangular data into R.
- Finally, in [Chapter -@sec-exploratory-data-analysis], you'll combine visualization and transformation with your curiosity and skepticism to ask and answer interesting questions about data.
Modelling is an important part of the exploratory process, but you don't have the skills to effectively learn or apply it yet and details of modeling fall outside the scope of this book.
Nestled among these five chapters that teach you the tools for doing data science are three chapters that focus on your R workflow.
In [Chapter -@sec-workflow-basics], [Chapter -@sec-workflow-pipes], [Chapter -@sec-workflow-style], and [Chapter -@sec-workflow-scripts-projects], you'll learn good workflow practices for writing and organizing your R code.
These will set you up for success in the long run, as they'll give you the tools to stay organised when you tackle real projects.