GitHub repository for Team 26's group project in MGT 6203 (Canvas), Fall 2023 semester.
Begin reviewing this analysis by reading the curated reports, which walk through the team's hypothesis, process, and conclusions.
The reports should provide a detailed description of the entire analysis.
All software supporting this analysis can be found under Code/.
The program requires R 4.3.1 or later and several packages. These packages are listed in the project's requirements.txt file; the program will also install them upon execution.
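For readers who prefer to install the packages ahead of time, the following is a minimal sketch of one way to do so. It assumes requirements.txt lists one package name per line, optionally followed by a version specifier; the actual file format is not described here, and the helper names `read_requirements` and `install_missing` are hypothetical and not part of the project's code.

```r
# Hypothetical helper: read package names from a requirements file.
# Assumes one package per line; blank lines and lines starting with "#" are skipped.
read_requirements <- function(path = "requirements.txt") {
  lines <- trimws(readLines(path, warn = FALSE))
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]
  # Drop anything after the package name, such as ">=1.1.0" (assumed format).
  sub("[<>=~! ].*$", "", lines)
}

# Hypothetical helper: install only the packages that are not already present.
# Returns (invisibly) the names of the packages that were missing.
install_missing <- function(pkgs) {
  missing <- setdiff(pkgs, rownames(installed.packages()))
  if (length(missing) > 0) {
    install.packages(missing)
  }
  invisible(missing)
}

# Example usage, run from the project root:
# install_missing(read_requirements("requirements.txt"))
```

Installing only the missing packages avoids reinstalling anything that is already available in the local library.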
The program expects the data to be pre-downloaded into Data/. Before executing the program, please ensure the following CSV files have been downloaded:
A rendered (knitted) HTML file is also provided here in case the program cannot be executed directly.
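The requirement that the data be present in Data/ can be checked before running the program. The sketch below is one possible approach; the function `missing_data_files` is hypothetical, and the file names in the usage comment are placeholders rather than the project's actual data files.

```r
# Hypothetical helper: return the expected files that are absent from data_dir.
missing_data_files <- function(expected, data_dir = "Data") {
  expected[!file.exists(file.path(data_dir, expected))]
}

# Example usage with placeholder file names (substitute the real CSV names):
# absent <- missing_data_files(c("placeholder_a.csv", "placeholder_b.csv"))
# if (length(absent) > 0) {
#   stop("Missing data files: ", paste(absent, collapse = ", "))
# }
```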
Additional software can be found under Other Resources. This section includes exploratory software that was not used in the final report, but was crucial for the initial exploration and review of this analysis.
Visualizations used in this analysis can be found under Visualizations/.
- Exploratory Data Analysis (EDA)
- Handling missing data, outliers, preprocessing
- Documenting preprocessing techniques
- Linear Regression Models, including evaluation
- Logistic Regression Models, including evaluation
- Assess additional models (ideally choose 2 more)
- Hypothesis Testing, significance, variables impact
- Additional model work (if needed) by Jamie/Emmett (Reviewer - William & Jamie)
- Data visualization to illustrate disparities by William/Sanjay (Reviewer - Marissa)
- Begin writing the final report by Marissa
- Receive TA Feedback
- Sanjay - Hypothesis Testing
- Jamie - Logistic Regression Model Validation (Possibly Tuning)
- William - Gender Breakdown by Role and Department
- Emmett - Validate Hypothesis Testing and add that portion to the final paper
- Marissa - Validate Jamie's & William's work and add to final paper
- Overview of data by Jamie
- Describing the cleaning process, key variables, and dataset by Sanjay
- Overview of modeling by Marissa
- Describing the models used and selection process by Jamie/Emmett
- In-depth discussion of model performance by William/Sanjay