From b0bebda63d5cf9fca0c2012370d90ea2252ef3ca Mon Sep 17 00:00:00 2001
From: Trevor Campbell
Date: Fri, 10 Nov 2023 15:39:36 -0800
Subject: [PATCH] fix dataset

---
 source/clustering.Rmd  | 4 ++--
 source/regression1.Rmd | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/source/clustering.Rmd b/source/clustering.Rmd
index 84097c3a3..68f16740e 100644
--- a/source/clustering.Rmd
+++ b/source/clustering.Rmd
@@ -902,7 +902,7 @@ to those in the earlier classification and regression chapters.
 We will begin by loading the `tidyclust`\index{tidyclust} library, which contains
 the necessary functionality, and then read in the original (i.e., unstandardized)
 subset of 18 observations
-from the penguins dataset.
+from the penguins data set.
 
 ```{r 10-get-unscaled-data, echo = FALSE, message = FALSE, warning = FALSE}
 unstandardized_data <- read_csv("data/penguins_toy.csv") |>
@@ -1188,5 +1188,5 @@ and guidance that the worksheets provide will function as intended.
   clustering for when you expect there to be subgroups, and then subgroups within
   subgroups, etc., in your data. In the realm of more general unsupervised
   learning, it covers *principal components analysis (PCA)*, which is a very
-  popular technique for reducing the number of predictors in a dataset.
+  popular technique for reducing the number of predictors in a data set.
diff --git a/source/regression1.Rmd b/source/regression1.Rmd
index ecf51e05b..432a14ecc 100644
--- a/source/regression1.Rmd
+++ b/source/regression1.Rmd
@@ -66,7 +66,7 @@ By the end of the chapter, readers will be able to do the following:
 * Recognize situations where a simple regression analysis would be appropriate for making predictions.
 * Explain the K-nearest neighbor (KNN) regression algorithm and describe how it differs from KNN classification.
 * Interpret the output of a KNN regression.
-* In a dataset with two or more variables, perform K-nearest neighbor regression in R using a `tidymodels` workflow.
+* In a data set with two or more variables, perform K-nearest neighbor regression in R using a `tidymodels` workflow.
 * Execute cross-validation in R to choose the number of neighbors.
 * Evaluate KNN regression prediction accuracy in R using a test data set and the root mean squared prediction error (RMSPE).
 * In the context of KNN regression, compare and contrast goodness of fit and prediction properties (namely RMSE vs RMSPE).