Recently I did an assessment for a company in which, in the 1st file, I had to do an EDA based on a given dataset of one of their apps' recent performance.
On the basis of the analysis made, I provide insight and suggestions regarding what's happening in the app and how to improve its performance; with the purpose of delivering it to the decision-makers. Histograms, regular expressions and a Pareto curve are just some of the examples I thought to be relevant to the analysis.
In the 2nd file I had to build a classification Machine Learning model in order to predict whether a user will make a purchase within the app.
In effect, this was a predictive problem taken with a ML approach. For the sake of simplicity, I used a Logistic Regression model as well as a Random Forest.
- Python
- Jupyter Notebook
- Excel
- Packages:
- pandas
- matplotlib
- scikit
- seaborn