11-comparing-models.Rmd (0 additions & 2 deletions)
@@ -257,13 +257,11 @@ where the residuals $\epsilon_{ij}$ are assumed to be independent and follow a G
A Bayesian linear model makes additional assumptions. In addition to specifying a distribution for the residuals, we require _prior distribution_ specifications for the model parameters ( $\beta_j$ and $\sigma$ ). These are distributions for the parameters that the model assumes before being exposed to the observed data. For example, a simple set of prior distributions for our model might be:
\begin{align}
\epsilon_{ij} &\sim N(0, \sigma) \notag \\
\beta_j &\sim N(0, 10) \notag \\
\sigma &\sim \text{exponential}(1) \notag
\end{align}
These priors set the possible/probable ranges of the model parameters and contain no unknown parameters themselves. For example, the prior on $\sigma$ indicates that values must be larger than zero, that the distribution is very right-skewed, and that values are usually less than 3 or 4.
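As an illustration of how priors like these could be declared in code, here is a sketch using the rstanarm package; the formula, predictor names, and fitted-object name are placeholders, not the chapter's own model.

```r
library(rstanarm)

# Bayesian linear model using the priors listed above.
# `outcome`, `predictor_1`, `predictor_2`, and `bayes_fit` are placeholder names.
bayes_fit <- stan_glm(
  outcome ~ predictor_1 + predictor_2,
  data = training_set,
  family = gaussian(),
  prior_intercept = normal(location = 0, scale = 10),  # intercept ~ N(0, 10)
  prior = normal(location = 0, scale = 10),            # beta_j ~ N(0, 10)
  prior_aux = exponential(rate = 1),                    # sigma ~ exponential(1)
  seed = 1234
)
```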
Each of these models results in linear class boundaries. Which one should we use? Since, for these data, the number of model parameters does not vary, the statistical approach is to compute the (log) likelihood for each model and determine the model with the largest value. Traditionally, the likelihood is computed using the same data that were used to estimate the parameters, not using approaches like data splitting or resampling from Chapters \@ref(splitting) and \@ref(resampling).
For a data frame `training_set`, let's create a function to compute the different models and extract the likelihood statistics for the training set (using `broom::glance()`):
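A sketch of what such a helper might look like is below; the binary outcome `class`, the formula, and the candidate link functions are illustrative assumptions rather than the chapter's own code.

```r
library(broom)
library(purrr)
library(dplyr)

# Fit one binomial GLM per candidate link function and collect the
# training-set likelihood statistics reported by broom::glance().
fit_and_glance <- function(link, data = training_set) {
  fit <- glm(class ~ ., data = data, family = binomial(link = link))
  glance(fit) %>% mutate(link = link, .before = 1)
}

map_dfr(c("logit", "probit", "cloglog"), fit_and_glance)
```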
In this output, the p-values correspond to separate hypothesis tests for each parameter:
\begin{align}
H_0&: \beta_j = 0 \notag \\
H_a&: \beta_j \ne 0 \notag
\end{align}
for each of the model parameters. Looking at these results, `phd` (the prestige of their department) may not have any relationship with the outcome.
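As a quick way to see these per-parameter Wald results, the coefficient table can be tidied and sorted by p-value; this is only a sketch, assuming `log_lin_fit` (referenced later in the chapter) is the fitted model object.

```r
library(broom)
library(dplyr)

# Wald test results for each model parameter, smallest p-values first.
tidy(log_lin_fit) %>%
  arrange(p.value)
```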
@@ -238,12 +236,10 @@ glm_boot %>%
Determining which predictors to include in the model is a difficult problem. One approach is to conduct likelihood ratio tests (LRT) [@McCullaghNelder89] between nested models. Based on the confidence intervals, we have evidence that a simpler model without `phd` may be sufficient. Let's fit a smaller model, then conduct a statistical test:
\begin{align}
H_0&: \beta_{phd} = 0 \notag \\
H_a&: \beta_{phd} \ne 0 \notag
\end{align}
This hypothesis was previously tested when we showed the tidied results for `log_lin_fit`. That particular approach used results from a single model fit via a Wald statistic (i.e., the parameter estimate divided by its standard error). For that approach, the p-value was `r tidy(log_lin_fit) %>% filter(term == "phd") %>% pluck("p.value") %>% format.pval()`. We can tidy the results for the LRT to get the p-value:
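A sketch of one way to run this comparison, assuming `log_lin_fit` is an ordinary Poisson `glm()` object and using placeholder names for the outcome and the reduced fit:

```r
library(broom)
library(dplyr)

# Refit without `phd`, then compare the nested models with a likelihood
# ratio test; `outcome` and `log_lin_reduced` are placeholder names.
log_lin_reduced <- glm(outcome ~ . - phd, data = training_set, family = poisson)

anova(log_lin_reduced, log_lin_fit, test = "LRT") %>%
  tidy()
```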
where the $x$ covariates affect the non-zero count values and the $z$ covariates influence the probability of a zero count. The two sets of predictors do not need to be mutually exclusive.
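As a sketch of how such a model is often specified, the pscl package uses a two-part formula in which terms before `|` model the counts and terms after `|` model the probability of a zero; the variable names below are placeholders, and the chapter's own fitting code may differ.

```r
library(pscl)

# Zero-inflated Poisson regression: x1 and x2 drive the non-zero counts,
# z1 drives the probability of a zero. All names are placeholders.
zip_fit <- zeroinfl(outcome ~ x1 + x2 | z1,
                    data = training_set,
                    dist = "poisson")
summary(zip_fit)
```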
@@ -295,13 +289,10 @@ zero_inflated_fit
Since the coefficients for this model are also estimated using maximum likelihood, let's try to use another likelihood ratio test to understand if the new model terms are helpful. We will _simultaneously_ test whether the new model terms are needed.
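If we denote the coefficients that the zero-probability part of the model adds for the $z$ covariates as $\gamma_1, \ldots, \gamma_q$ (a notational assumption for this sketch), the joint hypothesis can be written as:

\begin{align}
H_0&: \gamma_1 = \gamma_2 = \cdots = \gamma_q = 0 \notag \\
H_a&: \text{at least one } \gamma_j \ne 0 \notag
\end{align}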