Fix dataset -> data set #555

Merged (1 commit) on Nov 10, 2023
source/clustering.Rmd (4 changes: 2 additions & 2 deletions)

@@ -902,7 +902,7 @@ to those in the earlier classification and regression chapters.
 We will begin by loading the `tidyclust`\index{tidyclust} library, which contains the necessary
 functionality, and then read in the
 original (i.e., unstandardized) subset of 18 observations
-from the penguins dataset.
+from the penguins data set.

 ```{r 10-get-unscaled-data, echo = FALSE, message = FALSE, warning = FALSE}
 unstandardized_data <- read_csv("data/penguins_toy.csv") |>

@@ -1188,5 +1188,5 @@ and guidance that the worksheets provide will function as intended.
 clustering for when you expect there to be subgroups, and then subgroups within
 subgroups, etc., in your data. In the realm of more general unsupervised
 learning, it covers *principal components analysis (PCA)*, which is a very
-popular technique for reducing the number of predictors in a dataset.
+popular technique for reducing the number of predictors in a data set.

source/regression1.Rmd (2 changes: 1 addition & 1 deletion)

@@ -66,7 +66,7 @@ By the end of the chapter, readers will be able to do the following:
 * Recognize situations where a simple regression analysis would be appropriate for making predictions.
 * Explain the K-nearest neighbor (KNN) regression algorithm and describe how it differs from KNN classification.
 * Interpret the output of a KNN regression.
-* In a dataset with two or more variables, perform K-nearest neighbor regression in R using a `tidymodels` workflow.
+* In a data set with two or more variables, perform K-nearest neighbor regression in R using a `tidymodels` workflow.
 * Execute cross-validation in R to choose the number of neighbors.
 * Evaluate KNN regression prediction accuracy in R using a test data set and the root mean squared prediction error (RMSPE).
 * In the context of KNN regression, compare and contrast goodness of fit and prediction properties (namely RMSE vs RMSPE).