statistics readme

Patrick Cherry

emmeans: Estimated marginal means
DoE: Design of Experiment
Random forest classification
Anscombe’s Quartet

emmeans: Estimated marginal means

Estimated marginal means (EMMs, previously known as least-squares means in the context of traditional regression models) are derived by using a model to make predictions over a regular grid of predictor combinations (called a reference grid).

I use estimated marginal means to estimate the effect sizes when:

interaction effects are present,
when multiple effects are present—but are not scaled the same way (e.g. one effect is linear, one effect is reciprocal (1/x) ),
when variability is not homoscedastic (e.g. as in a glm) and I could use the confidence intervals offered by emmeans that I don’t get with ordinary means,
or when the experiment’s sampling is not balanced, causing some conditions to be over-weighted in the ordinary means.

DoE: Design of Experiment

Link to output

Design of Experiment principles seek to configure the samples, variables, and controls in a scientific experimental plan to answer the question posed or test the hypothesis while controlling for known sources of variability and confounding due to the methods and materials used to carry out the experiment.

Here, I use the and the standard R function gen.factorial to make a full factorial design and then use the package AlgDesign to add blocking for two operators who will be carrying out the experiment (with n = 3 replicates for each unique sample condition).

Random forest classification

Link to output

Random forest model classification of legal status of trees in San Francisco Department of Public Works data from a Tidy Tuesday project.

Here, I used ranger engine to train a random forest model to classify the legal status of the trees using all relevant observations, use the tune package to run hyperperamater optimization on mtry and min_n, and then evaluate the accuracy of the model using AUC and by plotting the correct and incorrect predictions in a map-like format.

Anscombe’s Quartet

Link to output

Anscombe’s quartet is a set of four x : y value pairs published by F J Anscombe in American Statistician in 1793. The sets have nearly identical descriptive statistics, like mean, standard deviation, R^2 correlation, and least-squares regression slopes, (to ~ 3 decimal places), but are clearly very different data sets when visualized by plotting.

I use unpivotr to tidy the data upon import, ggplot to make plots, and purrr’s map for functional programming on nested dataframes with broom for model object manipulation, included nested in dataframes.

Name		Name	Last commit message	Last commit date
Latest commit History 17 Commits
Anscombes-quartet-python_files		Anscombes-quartet-python_files
Anscombes_quartet_files/figure-gfm		Anscombes_quartet_files/figure-gfm
code_and_data		code_and_data
doe_design_of_experiment_with_library_prep_files		doe_design_of_experiment_with_library_prep_files
emmeans_estimated_marginal_means_files/figure-gfm		emmeans_estimated_marginal_means_files/figure-gfm
random_forest_hyperparameters_files/figure-gfm		random_forest_hyperparameters_files/figure-gfm
.gitignore		.gitignore
Anscombes-quartet-python.md		Anscombes-quartet-python.md
Anscombes_quartet.md		Anscombes_quartet.md
README.md		README.md
doe_design_of_experiment_with_library_prep.md		doe_design_of_experiment_with_library_prep.md
doe_design_of_experiment_with_library_prep.pdf		doe_design_of_experiment_with_library_prep.pdf
emmeans_estimated_marginal_means.md		emmeans_estimated_marginal_means.md
emmeans_estimated_marginal_means.pdf		emmeans_estimated_marginal_means.pdf
random_forest_hyperparameters.md		random_forest_hyperparameters.md
statistics.Rproj		statistics.Rproj

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

statistics readme

emmeans: Estimated marginal means

DoE: Design of Experiment

Random forest classification

Anscombe’s Quartet

About

Releases

Packages

Languages

pdcherry/statistics

Folders and files

Latest commit

History

Repository files navigation

statistics readme

emmeans: Estimated marginal means

DoE: Design of Experiment

Random forest classification

Anscombe’s Quartet

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages