In our previous research, Gender Wage Inequality in STEM, my colleagues and I used multiple linear regression (MLR) to explore the relationship between gender demographics and the median salary of STEM major categories. Our final model applied an inverse transformation to the response variable to improve the fit. Transforming the response and/or explanatory variables is common practice among statisticians and can yield a better-fitting model, but the resulting models are not easily understood by the average person.
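To make the interpretability problem concrete, here is a minimal sketch of what an inverse-transformed MLR model looks like. The symbols are assumptions introduced for illustration, not notation from the original study: $y$ is the median salary of a major, $x_1, \dots, x_p$ are the predictors, and $\beta_j$ are the fitted coefficients.

```latex
\frac{1}{y} = \beta_0 + \beta_1 x_1 + \cdots + \beta_p x_p + \varepsilon
\quad\Longrightarrow\quad
\hat{y} = \frac{1}{\hat\beta_0 + \hat\beta_1 x_1 + \cdots + \hat\beta_p x_p},
\qquad
\frac{\partial \hat{y}}{\partial x_j} = -\hat\beta_j \, \hat{y}^{2}
```

In an untransformed model, a one-unit increase in $x_j$ shifts the predicted salary by a constant $\hat\beta_j$. Under the inverse transformation, the coefficient describes the change in $1/y$, so its effect on the salary itself has the opposite sign and depends on the current predicted salary. This is what makes the model harder to explain to a general audience.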
In this project, I compared the multiple linear regression model with the inverse-transformed response variable,
To address this problem, I used a subset of the College Majors dataset from FiveThirtyEight, found here: https://github.com/fivethirtyeight/data/blob/master/college-majors/women-stem.csv
- Packages: tidyverse, ggpubr, easystats, lindia, ggstatsplot
- Statistical Tests & Analyses: Box-Cox, Step-wise selection, Model Diagnostics
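For context on how the Box-Cox procedure listed above relates to an inverse transformation, the standard Box-Cox family for a positive response $y$ is sketched below. The symbol $\lambda$ is the usual power parameter; the specific value estimated in this analysis is not stated here, so the $\lambda = -1$ case is shown only as an illustration.

```latex
y^{(\lambda)} =
\begin{cases}
\dfrac{y^{\lambda} - 1}{\lambda}, & \lambda \neq 0, \\[6pt]
\log y, & \lambda = 0,
\end{cases}
\qquad\text{so that}\qquad
y^{(-1)} = 1 - \frac{1}{y}.
```

Because $y^{(-1)}$ is an affine function of $1/y$, a Box-Cox estimate near $\lambda = -1$ points to the inverse transformation. Regressing on $1/y$ directly gives the same fitted values, with the signs of the slope coefficients flipped.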