The purpose of this project is to master exploratory data analysis (EDA) in banking with the Pandas library.
- Explore a banking dataset with Pandas.
- Build pivot tables.
- Visualize the dataset with various plot types.
- Materials and methods
- General part: (i) Libraries import, (ii) Dataset exploration, (iii) Pivot tables, (iv) Visualization in Pandas
- Tasks
The data used in this project is a subset of the open-source Bank Marketing Data Set from the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/citation_policy.html.
This dataset is publicly available for research. The details are described in [Moro et al., 2014].
In the course of this work, we carry out a preliminary analysis of positive responses (term deposit subscriptions) to direct calls from a bank. In essence, this is a bank scoring task: a client's (or potential client's) behavior, such as a loan default or a wish to make a deposit, is predicted from their characteristics.
In this project, we will try to give answers to a set of questions that may be relevant when analyzing banking data:
- What is the share of attracted clients in our source data?
- What are the mean values of the numerical features among the attracted clients?
- What is the average call duration for the attracted clients?
- What is the average age among the attracted and unmarried clients?
- What is the average age and call duration for different types of client employment?

In addition, we will carry out a visual analysis in order to plan bank marketing campaigns more effectively.
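The questions above map fairly directly onto boolean filtering, `mean`, and `pivot_table` in Pandas. The following is only a minimal sketch: the tiny DataFrame is made up for illustration and is not taken from the real dataset, and the column names (`age`, `job`, `marital`, `duration`, `y`) are assumptions about how the data is laid out. Reading "unmarried" as `marital == "single"` is also just one possible interpretation.

```python
import pandas as pd

# Made-up sample for illustration only; the column names are assumptions.
df = pd.DataFrame({
    "age": [30, 45, 52, 28, 39, 61],
    "job": ["admin.", "technician", "admin.", "student", "technician", "retired"],
    "marital": ["single", "married", "married", "single", "divorced", "married"],
    "duration": [120, 300, 90, 480, 240, 600],  # call duration (assumed to be seconds)
    "y": ["no", "yes", "no", "yes", "no", "yes"],  # "yes" = client was attracted
})

# Share of attracted clients: the mean of a boolean mask is a proportion.
share_attracted = (df["y"] == "yes").mean()

# Keep only the attracted clients for the following questions.
attracted = df[df["y"] == "yes"]

# Mean values of the numerical features among attracted clients.
numeric_means = attracted.mean(numeric_only=True)

# Average call duration for attracted clients.
mean_duration = attracted["duration"].mean()

# Average age among attracted clients who are single (assumed reading of "unmarried").
mean_age_unmarried = attracted.loc[attracted["marital"] == "single", "age"].mean()

# Average age and call duration per type of employment, as a pivot table.
by_job = df.pivot_table(values=["age", "duration"], index="job", aggfunc="mean")

print(share_attracted)
print(numeric_means)
print(mean_duration)
print(mean_age_unmarried)
print(by_job)
```

On the real data, the same expressions should carry over once `df` is loaded from the dataset file and the column names are adjusted to match it.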
- NUMPY
- PANDAS
- MATPLOTLIB
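
The libraries listed above are usually imported under short aliases. A minimal sketch, assuming all three are installed in the current environment; the aliases `np`, `pd`, and `plt` are common community conventions rather than something this project prescribes.

```python
import numpy as np                # numerical arrays and math helpers
import pandas as pd               # tabular data structures (DataFrame, Series)
import matplotlib.pyplot as plt   # plotting interface used by Pandas plots

# Print the installed versions to make the environment easy to reproduce.
print("numpy", np.__version__)
print("pandas", pd.__version__)
```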