Predict which form of content (the target column) a user is most likely to engage with, based on their behaviour on the platform. The target classes are heavily imbalanced.
- Data cleanup (e.g. removing columns that leak the target and columns that hold a single value)
- Data visualization to gain insight into the data
- Multi-class random forest model
- Convert to binary classification to address the class imbalance, and compare its performance with the multi-class model
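
The cleanup and modelling steps above can be sketched roughly as follows. This is a minimal illustration with pandas and scikit-learn, not the project's actual code. The synthetic data, the column names (`sessions`, `minutes_active`, `platform`, `target`), the class labels, and the helper functions are all hypothetical. Treating the majority class as the positive class for the binary task is also an assumption, since the summary does not say how the classes were merged.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split


def drop_single_value_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Drop columns holding only one distinct value (NaN counts as a value)."""
    keep = [c for c in df.columns if df[c].nunique(dropna=False) > 1]
    return df[keep]


def to_binary_target(y: pd.Series, positive_class) -> pd.Series:
    """Collapse a multi-class target into 1 (positive_class) vs 0 (all others)."""
    return (y == positive_class).astype(int)


def fit_and_score(X, y, average, seed=0):
    """Fit a random forest on a stratified split; return F1 on the held-out part."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )
    model = RandomForestClassifier(n_estimators=100, random_state=seed)
    model.fit(X_tr, y_tr)
    return f1_score(y_te, model.predict(X_te), average=average, zero_division=0)


# Synthetic, imbalanced stand-in data (hypothetical; not the project's dataset).
rng = np.random.default_rng(0)
n = 600
df = pd.DataFrame({
    "sessions": rng.poisson(5, size=n),
    "minutes_active": rng.normal(30, 10, size=n),
    "platform": "web",  # single-value column, removed during cleanup
    "target": rng.choice(["video", "article", "podcast"], size=n, p=[0.8, 0.15, 0.05]),
})

features = drop_single_value_columns(df.drop(columns="target"))

# Multi-class model, scored with macro F1 so minority classes count equally.
multi_f1 = fit_and_score(features, df["target"], average="macro")

# Binary variant: assumed here to be "video" vs everything else.
binary_f1 = fit_and_score(
    features, to_binary_target(df["target"], "video"), average="binary"
)
print(f"multi-class macro F1: {multi_f1:.3f}, binary F1: {binary_f1:.3f}")
```

The two scores use different averaging, so they are not directly comparable as numbers. A fair comparison would also look at per-class precision and recall for both models.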