Data Wrangling on F1 Dataset

In this project, I explored a range of data wrangling techniques, including cleaning and transformation. I then fed the transformed datasets into simple machine learning models to gauge the effects of the wrangling process.

This project taught me a lot about how data transformations affect the distribution and correlation of features, which in turn affects the performance and interpretability of machine learning models.

Project Overview

Analyse and transform an F1 dataset so that it can be used in a machine learning model. The model aims to predict the top F1 teams to support sponsorship evaluation.

Data Cleaning

Missing values and Outliers

Detection of missing values and comparison of imputation methods. Rows with missing date values were dropped, and the other missing values were imputed with the median.
Detection of outlier values. Methods such as trimming, winsorization and capping were evaluated using Quantile-Quantile (Q-Q) plots and model metrics.
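
The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names (`race_date`, `points`), the toy values and the 1.5 × IQR capping rule are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; column names and values are illustrative only.
df = pd.DataFrame({
    "race_date": ["2021-03-28", None, "2021-05-09", "2021-05-23", "2021-06-06"],
    "points": [25.0, 18.0, np.nan, 15.0, 200.0],
})

# Drop rows whose date is missing, then fill the remaining numeric gaps
# with the column median.
df = df.dropna(subset=["race_date"]).copy()
df["points"] = df["points"].fillna(df["points"].median())

# Capping (assumed rule): clip values to the fences 1.5 * IQR beyond the quartiles.
q1, q3 = df["points"].quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
df["points_capped"] = df["points"].clip(lower=lower, upper=upper)

print(df)
```

Trimming would instead drop the rows outside the fences, while winsorization would clip at chosen percentiles rather than IQR fences.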

Data Transformation

Categorical data encoding using target mean encoding.
Numerical data transformation with the Yeo-Johnson transformer to reshape feature distributions.
Binning and discretization for skewed distributions.
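
A minimal sketch of the three transformations listed above, assuming pandas and scikit-learn. The columns (`team`, `lap_time`, `top_team`) and the choice of three quantile bins are hypothetical and only serve the example.

```python
import pandas as pd
from sklearn.preprocessing import PowerTransformer

# Hypothetical toy data; names and values are illustrative only.
df = pd.DataFrame({
    "team": ["A", "A", "B", "B", "C", "C"],
    "lap_time": [1.0, 2.0, 3.0, 5.0, 20.0, 60.0],
    "top_team": [1, 1, 0, 1, 0, 0],
})

# Target mean encoding: replace each category with the mean of the target
# observed for that category.
team_means = df.groupby("team")["top_team"].mean()
df["team_encoded"] = df["team"].map(team_means)

# Yeo-Johnson power transform to reduce skew (output is standardized by default).
yeo_johnson = PowerTransformer(method="yeo-johnson")
df["lap_time_yj"] = yeo_johnson.fit_transform(df[["lap_time"]]).ravel()

# Equal-frequency binning into three ordinal bins (bin count is an assumption).
df["lap_time_bin"] = pd.qcut(df["lap_time"], q=3, labels=False)

print(df)
```

In a real workflow the encoding means and the transformer would be fitted on the training split only, so that target information does not leak into the test set.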

Feature Engineering

Feature standardization with Robust Scaling
Polynomial expansion
Dimensionality reduction with Principal Component Analysis
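
The three steps above could be chained in a scikit-learn pipeline along these lines. This is a sketch on synthetic data; the step order, polynomial degree and number of components are assumptions, not the settings used in the project.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures, RobustScaler

# Synthetic stand-in for the numeric features (50 rows, 3 columns).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))

pipeline = make_pipeline(
    RobustScaler(),                                    # centre on median, scale by IQR
    PolynomialFeatures(degree=2, include_bias=False),  # add squares and pairwise products
    PCA(n_components=2),                               # keep the top two components
)
X_reduced = pipeline.fit_transform(X)

print(X_reduced.shape)
```

Robust scaling relies on the median and interquartile range, so it is less sensitive to any outliers that remain after cleaning than standard scaling would be.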

Machine Learning Models

Naive Baseline and Logistic Regression
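
As an illustration of comparing the two models, the sketch below uses synthetic data and assumes the naive baseline is a majority-class predictor (scikit-learn's `DummyClassifier`); the project's actual baseline, features and metrics may differ.

```python
from sklearn.datasets import make_classification
from sklearn.dummy import DummyClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data standing in for the wrangled F1 features.
X, y = make_classification(
    n_samples=300, n_features=6, n_informative=4,
    n_clusters_per_class=1, class_sep=1.5, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Naive baseline (assumed): always predict the most frequent training class.
baseline = DummyClassifier(strategy="most_frequent").fit(X_train, y_train)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

baseline_acc = accuracy_score(y_test, baseline.predict(X_test))
model_acc = accuracy_score(y_test, model.predict(X_test))
print(f"baseline accuracy: {baseline_acc:.2f}, logistic regression accuracy: {model_acc:.2f}")
```

Re-running the same comparison after each wrangling step is one way to see how that step changes model performance relative to the baseline.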

Improvements

Use more datasets and try a wider range of models.

LegalWiggle/Y2S1_Data_Wrangling_Project
