diff --git a/source/wrangling.Rmd b/source/wrangling.Rmd
index 974449cfd..e116095ad 100644
--- a/source/wrangling.Rmd
+++ b/source/wrangling.Rmd
@@ -101,8 +101,8 @@ example, to create the vector `region` as shown in Figure \@ref(fig:02-vector),
 you would write:
 
 ``` {r}
-year <- c("Toronto", "Montreal", "Vancouver", "Calgary", "Ottawa")
-year
+region <- c("Toronto", "Montreal", "Vancouver", "Calgary", "Ottawa")
+region
 ```
 
 > **Note:** Technically, these objects are called "atomic vectors." In this book
@@ -198,7 +198,7 @@ tibbles as *data frames* in this book.
 > **Note:** You can use the function `class` \index{class} on a data object to assess whether a data
 > frame is a built-in R data frame or a tibble. If the data object is a data
 > frame, `class` will return `"data.frame"`. If the data object is a
-> tibble it will return `"tbl_df" "tbl" "data.frame"`. You can easily convert
+> tibble it will return `"tbl_df" "tbl" "data.frame"`. You can easily convert
 > built-in R data frames to tibbles using the `tidyverse` `as_tibble` function.
 > For example we can check the class of the Canadian languages data set,
 > `can_lang`, we worked with in the previous chapters and we see it is a tibble.
@@ -1260,7 +1260,7 @@ For example, we can use `group_by` to group the regions of the `region_lang` dat
 reporting the language as the primary language at home for each of the regions
 in the data set.
 
-(ref:summarize-groupby) `summarize` and `group_by` is useful for calculating summary statistics on one or more column(s) for each group. It creates a new data frame—with one row for each group—containing the summary statistic(s) for each column being summarized. It also creates a column listing the value of the grouping variable. The darker, top row of each table represents the column headers. The gray, blue, and green colored rows correspond to the rows that belong to each of the three groups being represented in this cartoon example.
+(ref:summarize-groupby) `summarize` and `group_by` is useful for calculating summary statistics on one or more column(s) for each group. It creates a new data frame—with one row for each group—containing the summary statistic(s) for each column being summarized. It also creates a column listing the value of the grouping variable. The darker, top row of each table represents the column headers. The orange, blue, and green colored rows correspond to the rows that belong to each of the three groups being represented in this cartoon example.
 
 ```{r summarize-groupby, echo = FALSE, message = FALSE, warning = FALSE, fig.align = "center", fig.cap = "(ref:summarize-groupby)", fig.retina = 2, out.width = "85%"}
 image_read("img/wrangling/summarize.002.png")