This is a work-in-progress course website for Introductory Statistics for Undergraduate Students, produced by Fan. The course covers a limited subset of topics from Statistics for Business and Economics (Anderson, Sweeney, Williams, Camm, and Cochran, 12th edition).
The course uses R with packages from the tidyverse: tibble for framing data, tidyr and dplyr for reshaping data and aggregating statistics, ggplot2 for graphing, and readr for file I/O. Materials are written in R in Jupyter notebooks and rendered as HTML files. To obtain the code and raw files, see here for GitHub setup. For the HTML files, click on the links below.
Please contact FanWangEcon for issues or problems.
- create a tibble dataset
- draw 10 random students from 50 and build a survey
- first use: tibble, add_row, factor, ifelse, group_by, mutate, summarise, write_csv
- relative and absolute path
- first use: read.csv
- frequency table
- bar chart and histogram
- R function and lapply to generate graphs/tables for different variables
- first use: function, loop, lapply, !!sym, geom_histogram, geom_bar
- two-way frequency table
- stacked bar chart
- scatter-plot
- first use: spread, geom_point, geom_text, geom_smooth, geom_bar
- a dataset with city-month temperatures
- mean and standard deviation
- use: dplyr + ggplot, gather, filter, facet_wrap, show.unique.values, geom_line, geom_point, scale_x_continuous
- a dataset with state-level wage and education data
- scatter-plot
- coefficient of variation rescales standard deviation
- correlation rescales covariance
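As a rough illustration of the two rescaling ideas above, the sketch below computes both by hand in base R. The `educ` and `wage` vectors are made-up numbers for five hypothetical states, not the course dataset.

```r
# Hypothetical data: average years of education and hourly wage for five states
educ <- c(12.1, 12.8, 13.4, 13.9, 14.5)
wage <- c(18.5, 20.1, 22.7, 23.0, 26.4)

# Coefficient of variation: standard deviation divided by the mean (unit-free)
cv_educ <- sd(educ) / mean(educ)
cv_wage <- sd(wage) / mean(wage)

# Correlation: covariance divided by the product of the two standard deviations
cor_manual <- cov(educ, wage) / (sd(educ) * sd(wage))

print(c(cv_educ = cv_educ, cv_wage = cv_wage))

# The hand-computed value should agree with R's built-in cor()
print(all.equal(cor_manual, cor(educ, wage)))
```

Because both measures divide out the units, they can be compared across variables measured on different scales.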
- definitions of Sample Space, Experimental Outcomes, Events and Probability
- union, intersection and complements
- conditional probability
- examples: tossing a quarter, four candidates for election, an unfair six-sided die, two basketball games
- use: tibble, sample
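The set operations and conditional probability listed above can be sketched with base R's `union`, `intersect`, and `setdiff`. This example assumes a fair six-sided die so that probabilities reduce to counting outcomes; the events `A` and `B` and the helper `prob` are illustrative names, not taken from the course notebooks.

```r
# Sample space of a six-sided die (assumed fair for this sketch)
outcomes <- 1:6

# Two illustrative events
A <- c(2, 4, 6)  # the roll is even
B <- c(4, 5, 6)  # the roll is greater than 3

# With equally likely outcomes, P(event) = (outcomes in event) / (all outcomes)
prob <- function(event) length(event) / length(outcomes)

p_union     <- prob(union(A, B))         # {2, 4, 5, 6}: 4/6
p_intersect <- prob(intersect(A, B))     # {4, 6}: 2/6
p_not_A     <- prob(setdiff(outcomes, A))  # complement of A, {1, 3, 5}: 3/6

# Conditional probability: P(A | B) = P(A and B) / P(B) = (2/6) / (3/6)
p_A_given_B <- p_intersect / prob(B)

print(c(union = p_union, intersect = p_intersect,
        not_A = p_not_A, A_given_B = p_A_given_B))
```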
- throw an unfair die many times, law of large numbers
- use: reduce, full_join, mutate_all, dplyr::mutate; tibble+group_by+summarise+mutate+arrange+select; !!str.var.name!=, sprintf, str_extract; bind_cols, logspace; geom_line, scale_x_continuous(trans='log10'), labs()
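A minimal base R sketch of the law of large numbers for an unfair die follows. The face probabilities in `probs` are an assumption made for illustration (the course notebook may use different ones), and the sketch uses `sapply` and a plain data frame rather than the tidyverse pipeline listed above.

```r
set.seed(123)  # make the simulation reproducible

faces <- 1:6
# Assumed unfair die: face 6 comes up half of the time
probs <- c(0.10, 0.10, 0.10, 0.10, 0.10, 0.50)

# Numbers of throws to compare, spaced evenly on a log10 scale
n_throws <- c(10, 100, 1000, 10000, 100000)

# For each sample size, throw the die n times and record the share of sixes
share_six <- sapply(n_throws, function(n) {
  throws <- sample(faces, size = n, replace = TRUE, prob = probs)
  mean(throws == 6)
})

results <- data.frame(n = n_throws, share_six = share_six)
print(results)
```

As the number of throws grows, the observed share of sixes should settle near the assumed probability of 0.50, while small samples can land far from it.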
- Path after 1, 2 and 3 plays
- Discrete Random Variable
- Expected Value and Variance
- Binomial Properties
- Examples: USA larceny clearance rate, WWII German soldier survival rate
- use: dbinom, pbinom; geom_bar, geom_line, geom_point, geom_text; lapply, sprintf, scale_y_continuous(sec.axis), axis.text.y, round
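The link between the expected value and variance definitions and the binomial functions can be sketched as below. The values `n = 10` and `p = 0.3` are hypothetical and are not the clearance or survival rates from the examples above.

```r
# Hypothetical binomial experiment: 10 independent trials, success probability 0.3
n <- 10
p <- 0.3
x <- 0:n  # every possible number of successes

# dbinom gives P(X = x); pbinom gives P(X <= x)
pmf <- dbinom(x, size = n, prob = p)
cdf <- pbinom(x, size = n, prob = p)

# Expected value and variance from the discrete random variable definitions
expected_value <- sum(x * pmf)
variance <- sum((x - expected_value)^2 * pmf)

# These should agree with the binomial shortcuts n * p and n * p * (1 - p)
print(c(expected_value = expected_value, shortcut = n * p))
print(c(variance = variance, shortcut = n * p * (1 - p)))
```

The cumulative probabilities from `pbinom` are the running sums of the `dbinom` values, which can be checked with `cumsum(pmf)`.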
- Poisson Properties
- Examples: Ladislaus Bortkiewicz's analysis of Prussian army horse-kick deaths
- use: dpois, ppois
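A short sketch of `dpois` and `ppois` is given below. The rate `lambda = 0.7` is a hypothetical value chosen for illustration, not a figure quoted from Bortkiewicz's data.

```r
# Hypothetical average number of events per unit per year
lambda <- 0.7

# dpois gives P(X = x) for each count
x <- 0:5
pmf <- dpois(x, lambda = lambda)

# ppois gives the cumulative probability P(X <= x)
p_at_most_one <- ppois(1, lambda = lambda)

# Probability of two or more events is the complement
p_two_or_more <- 1 - p_at_most_one

print(data.frame(x = x, probability = round(pmf, 4)))
print(c(at_most_one = p_at_most_one, two_or_more = p_two_or_more))
```

For a Poisson random variable the mean and the variance both equal `lambda`, so a single parameter pins down the whole distribution.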