Retail Bank Risk Evaluation

Context

This project uses the Home Credit Default Risk dataset, available on Kaggle, to build three products that aim to add value for retail banks.

The first product is an anomaly detection model that flags anomalous applications.

Results: The model flags about 8% of the applications as anomalous. The flagged applications show a higher rate of repayment issues, which makes the flag a useful input for subsequent models.
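The README does not say which algorithm sits behind the anomaly flag. As a minimal sketch, assuming an Isolation Forest from scikit-learn with its `contamination` parameter set to the roughly 8% share reported above, the flagging step and the follow-up comparison of repayment issues could look like this. The function names and the synthetic data are hypothetical and are not taken from the project code.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


def flag_anomalies(features, contamination=0.08, seed=0):
    """Return 1 for rows the detector considers anomalous, else 0.

    Assumption: Isolation Forest is only a stand-in; the project's
    anomaly_pipe.pkl may use a different detector.
    """
    detector = IsolationForest(contamination=contamination, random_state=seed)
    labels = detector.fit_predict(features)  # -1 = anomaly, 1 = normal
    return (labels == -1).astype(int)


def issue_rate_by_flag(flags, target):
    """Share of applications with repayment issues, split by anomaly flag."""
    flags = np.asarray(flags)
    target = np.asarray(target)
    return {
        "flagged": float(target[flags == 1].mean()),
        "not_flagged": float(target[flags == 0].mean()),
    }


# Synthetic stand-in for preprocessed application features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
flags = flag_anomalies(X)
anomaly_share = flags.mean()  # close to the requested contamination
```

On the real data, `target` would be the repayment-issue label (for example the `TARGET` column of application_train.csv), and a higher "flagged" rate would correspond to the result described above.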

The second product is a segmentation model that assigns each application to one of 5 clusters. This model also uses the flag from the anomaly detection model.

Results: The model segments the applications into 5 clusters. The rate of repayment issues differs significantly between clusters, which again provides value for subsequent models and for later processes such as setting interest rates.
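The clustering algorithm is not named in this README. The sketch below assumes a scaled k-means pipeline from scikit-learn with 5 clusters and shows one way the anomaly flag can be passed in as an extra feature. `segment_applications` and the synthetic inputs are hypothetical.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def segment_applications(features, anomaly_flag, n_clusters=5, seed=0):
    """Assign each application to one of n_clusters segments.

    Assumption: k-means is a stand-in; segmentation_pipe.pkl may hold
    a different clustering model.
    """
    # Append the anomaly flag as one more column before clustering.
    X = np.column_stack([features, anomaly_flag])
    pipe = make_pipeline(
        StandardScaler(),
        KMeans(n_clusters=n_clusters, n_init=10, random_state=seed),
    )
    return pipe.fit_predict(X)


# Synthetic stand-ins for application features and the anomaly flag.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 4))
anomaly_flag = (rng.random(500) < 0.08).astype(int)

labels = segment_applications(features, anomaly_flag)
cluster_sizes = np.bincount(labels, minlength=5)  # applications per cluster
```

Comparing the repayment-issue rate per value of `labels` would then give the kind of per-cluster difference described above.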

The third and final product is a classification model that predicts whether an application is likely to experience repayment issues.

Results: The model provides as much value to the bank as the status quo. Using it should therefore not result in lost revenue, while enabling automation of the entire approval process. This can increase the productivity of the people involved in that process and, in turn, increase revenue.
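How "value to the bank" is measured is not spelled out here. One common way to compare an approval policy with the status quo is to add up a gain for every approved loan that is repaid and a loss for every approved loan that runs into repayment issues. The sketch below does that with made-up numbers: the gain and loss amounts, the toy outcomes, and the assumption that the status quo approves every application in the sample are all hypothetical.

```python
def portfolio_value(approved, defaulted, gain_per_good=1.0, loss_per_bad=3.0):
    """Net value of an approval policy.

    approved:  1 if the policy approves the application, else 0
    defaulted: 1 if the loan ran into repayment issues, else 0
    Rejected applications contribute nothing to the total.
    """
    value = 0.0
    for is_approved, is_default in zip(approved, defaulted):
        if not is_approved:
            continue
        value += -loss_per_bad if is_default else gain_per_good
    return value


# Toy outcomes for ten applications (illustrative only, not project results).
defaulted = [0, 0, 0, 1, 0, 0, 1, 0, 0, 0]
status_quo = [1] * 10                        # assumed: every application approved
model_policy = [1, 1, 1, 0, 1, 1, 1, 0, 1, 1]  # rejects one bad and one good loan

status_quo_value = portfolio_value(status_quo, defaulted)
model_value = portfolio_value(model_policy, defaulted)
```

With real predictions, `model_policy` would come from thresholding the classifier's predicted probability, and the threshold could be tuned so that `model_value` is at least `status_quo_value`.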

Project Structure

.
└── Project Home/
    ├── src/
    │   ├── data/
    │   │   ├── application_test.csv
    │   │   ├── application_train.csv
    │   │   ├── bureau_balance.csv
    │   │   ├── bureau.csv
    │   │   ├── credit_card_balance.csv
    │   │   ├── HomeCredit_columns_description.csv
    │   │   ├── installments_payments.csv
    │   │   ├── POS_CASH_balance.csv
    │   │   └── previous_application.csv
    │   ├── deployment/
    │   │   ├── models/
    │   │   │   ├── anomaly_pipe.pkl
    │   │   │   ├── segmentation_pipe.pkl
    │   │   │   └── xgb_credit_approval_2.pkl
    │   │   ├── src/
    │   │   │   └── lib/
    │   │   │       ├── data_functions.py
    │   │   │       └── ml_functions.py
    │   │   ├── deployment_commands.txt
    │   │   ├── deployment.py
    │   │   ├── Dockerfile
    │   │   └── requirements.txt
    │   ├── img/
    │   │   └── data_diagram.jpg
    │   ├── lib/
    │   │   ├── data_functions.py
    │   │   ├── ml_functions.py
    │   │   └── plot_functions.py
    │   ├── logs/
    │   │   └── my.log
    │   ├── models/
    │   │   ├── anomaly_pipe.pkl
    │   │   ├── rfecv.pkl
    │   │   ├── segmentation_pipe.pkl
    │   │   ├── xgb_credit_approval_0.pkl
    │   │   ├── xgb_credit_approval_1.pkl
    │   │   └── xgb_credit_approval_2.pkl
    │   ├── request_sample.json
    │   └── requirements.txt
    ├── .gitignore
    ├── 341.ipynb
    ├── Project.ipynb
    └── readme.md

Local Execution

If you'd like to run this notebook locally, you can download the dataset using this link and extract it into the src/data folder. After that, you can run the notebook up to the deployment part, which requires an authentication code to access the models on Google Cloud.
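Before running the notebook, it can help to confirm that the extracted files are where the project structure above expects them. The following is a small, optional check, assuming it is run from the project root. The helper `missing_data_files` is hypothetical and not part of the repository.

```python
from pathlib import Path

# File names taken from the src/data listing in the project structure.
EXPECTED_FILES = [
    "application_test.csv",
    "application_train.csv",
    "bureau_balance.csv",
    "bureau.csv",
    "credit_card_balance.csv",
    "HomeCredit_columns_description.csv",
    "installments_payments.csv",
    "POS_CASH_balance.csv",
    "previous_application.csv",
]


def missing_data_files(data_dir="src/data"):
    """Return the expected dataset files that are not present in data_dir."""
    folder = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


missing = missing_data_files()
if missing:
    print("Missing from src/data:", ", ".join(missing))
else:
    print("All dataset files found.")
```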
