This repository contains a Jupyter notebook for exploratory data analysis. The notebook provides functions for visualizing and preprocessing a dataset held in a pandas DataFrame.
- Categorical Data Visualization: A function draws a horizontal bar plot for each categorical column using seaborn. The plots are displayed inline, side by side, for easy comparison.
- Numerical Data Visualization: A function draws a histogram for each numerical column using seaborn, also displayed inline and side by side.
- Data Preprocessing: A function replaces 'unknown' values in categorical columns with the mode of the column.
- Normality Check: A function draws a QQ plot for each numerical column to check whether it is approximately normally distributed. The plots are displayed inline and side by side.
To use the notebook, load your own data into the DataFrame and replace the example column names with the names of your columns. The functions can then be called as shown in the notebook.
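
As an illustration of adapting a function to your own data, here is a minimal sketch of the preprocessing step. The function name `replace_unknown_with_mode`, its `placeholder` parameter, and the toy DataFrame are all invented for this example and may differ from the notebook. The sketch also assumes the mode should be computed over the known values only, so that 'unknown' can never be picked as its own replacement.

```python
import pandas as pd


def replace_unknown_with_mode(df, columns, placeholder="unknown"):
    """Return a copy of df with `placeholder` replaced by each column's mode.

    Assumption: the mode is taken over the values that are not the
    placeholder, so 'unknown' is never selected as the replacement.
    """
    out = df.copy()
    for col in columns:
        known = out.loc[out[col] != placeholder, col]
        modes = known.mode()
        if modes.empty:
            continue  # no known values in this column, so leave it unchanged
        out[col] = out[col].replace(placeholder, modes.iloc[0])
    return out


# Toy data, invented purely for illustration; substitute your own DataFrame
# and column names here.
df = pd.DataFrame(
    {
        "job": ["admin", "unknown", "admin", "technician"],
        "education": ["primary", "secondary", "unknown", "secondary"],
    }
)
cleaned = replace_unknown_with_mode(df, ["job", "education"])
print(cleaned)
```

The function works on a copy, so the original DataFrame is left untouched and the cleaned result can be compared against it.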
The notebook requires the following Python libraries:
- pandas
- numpy
- matplotlib
- seaborn
- statsmodels
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.