- Python
- Single Family Mortgage Loan
- Commercial Real Estate Loan
- Business Loans
- Consumer Loans
- Credit Card Loans
- US Bureau of Labor Statistics:
  - US Unemployment rate, monthly.
- University of Michigan Surveys of Consumers:
  - Index of Consumer Sentiment, quarterly (Table 1)
  - Expected Change in Financial Situation in a Year, quarterly (Table 8)
  - Expected Change in Unemployment During the Next Year, quarterly (Table 30)
- US Census Bureau:
  - US Housing Starts, single unit and multi-unit, quarterly.
  - US Housing Completions, single unit and multi-unit, quarterly.
- Bureau of Economic Analysis, US Department of Commerce:
  - Household personal savings, quarterly.
  - Undistributed corporate profits, quarterly.
- Economic Indicators
- GDP growth rates
- GDP per capita
- GNI growth rates
- GNI per capita
- Inflation rate
- Interest rate
- Gov. debt to GDP
- Current account to GDP
- Fiscal Expenditure
- CPI
- Food Inflation
- Business confidence
- Consumer confidence
- Consumer credit
- Delinquency Rate on Consumer Loans (Q1 1987 to Q2 2023)
- Delinquency Rate on Credit Card Loans (Q1 1991 to Q2 2023)
- Delinquency Rate on Commercial Real Estate Loans excluding Farmland (Q1 1991 to Q2 2023)
- Delinquency Rate on Business Loans (Q1 1987 to Q2 2023)
Together, these series measure the percentage of delinquent loans across consumer and business lending, which are typically among the larger sources of credit risk for banks.
Several macroeconomic indicators are expected to affect delinquency rates:
- Refinitiv Eikon (available in the WRDS database): GDP growth, unemployment rate, consumer confidence levels
- Federal Reserve Bank reports (available in the WRDS database): interest rates on commercial paper, short-term business lending, prime lending rates, swap rates, yields on state and local bonds
- Consumption, Aggregate Wealth, and Expected Stock Returns (CAY data as computed by Martin Lettau and Sydney Ludvigson, available in the WRDS database)
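- Each target series has only ~140 data points.
- The feature set has more data points than the target variables (the unemployment rate, for example, is monthly while the delinquency rates are quarterly), so the features had to be transformed to align with the target variables.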
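One way such an alignment could be done is to average a monthly feature within each calendar quarter. The sketch below is illustrative only: the series name `unemployment_rate` and its values are made up, and averaging is an assumed aggregation choice, not necessarily the transformation used in this project.

```python
import numpy as np
import pandas as pd

# Synthetic monthly series standing in for a monthly feature such as the
# unemployment rate. The values are invented for illustration.
months = pd.date_range("2020-01-01", periods=12, freq="MS")
monthly = pd.Series(np.arange(1.0, 13.0), index=months, name="unemployment_rate")

# Group the monthly observations by calendar quarter and average them,
# which yields one value per quarter to match quarterly targets.
quarterly = monthly.groupby(monthly.index.to_period("Q")).mean()

print(quarterly)
```

Other aggregations (end-of-quarter value, quarterly maximum) follow the same pattern by swapping `.mean()` for `.last()` or `.max()`.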
Identifying the correlation between the target variables
(Time series of various delinquency rates)
From the above correlation matrix, we can infer that:
- Single Family Mortgages are uncorrelated with other categories.
- CRE Loans and Business Loans are highly correlated with each other.
- Consumer Loans and Credit Card Loans are highly correlated with each other.
- The two pairs (CRE/Business and Consumer/Credit Card) are moderately correlated with one another.
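A correlation matrix like the one discussed here can be computed directly with pandas. The following is a minimal sketch on synthetic data: the column names and the random series are placeholders, not the project's delinquency data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 140  # roughly the number of quarterly observations per target series

# Synthetic stand-ins for delinquency series. Two columns share a common
# driver (so they correlate strongly); the third is independent noise.
common = rng.normal(size=n)
targets = pd.DataFrame({
    "consumer": common + 0.1 * rng.normal(size=n),
    "credit_card": common + 0.1 * rng.normal(size=n),
    "mortgage": rng.normal(size=n),
})

# Pairwise Pearson correlation (the pandas default) between all columns.
corr = targets.corr()
print(corr.round(2))
```

Because delinquency rates are trending time series, it may also be worth repeating the calculation on quarter-over-quarter changes (`targets.diff().corr()`), since shared trends can inflate correlations in levels.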
Models Fitted:
- Linear Regression
- Lasso Regression
- Ridge Regression
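As a rough illustration of how these three models can be fitted and compared, here is a sketch using scikit-learn on synthetic data. The feature matrix, the regularization strengths (`alpha=0.1`, `alpha=1.0`), the standardization step, and the 80/20 chronological split are all assumptions made for the example, not the settings used in this project.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_obs, n_features = 140, 10

# Synthetic features and target: only the first two features carry signal.
X = rng.normal(size=(n_obs, n_features))
y = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5 * rng.normal(size=n_obs)

# Chronological split: train on the earlier 80%, test on the later 20%,
# so no future observations leak into training.
split = int(n_obs * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.1),   # assumed penalty strength
    "ridge": Ridge(alpha=1.0),   # assumed penalty strength
}

scores = {}
for name, model in models.items():
    # Standardize features so the penalties treat all coefficients alike.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = r2_score(y_test, pipeline.predict(X_test))

for name, score in scores.items():
    print(f"{name}: out-of-sample R^2 = {score:.3f}")
```

With only ~140 observations and many candidate features, the penalized models (Lasso and Ridge) are a natural choice because they shrink coefficients and reduce overfitting; Lasso can additionally set some coefficients exactly to zero.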
(Time Series of Model Fit)
- The results indicate a poor model fit. This may be due to the small number of data points available (quarterly observations from 1987 to 2023). (See limitations.)
- Overall, Lasso Regression appears to provide a relatively better fit for consumer loan delinquencies than the other estimation methods.
(Coefficient model plot of all the variables)
- The current models fit poorly, most likely because of the limited dataset available. Efforts to locate additional data from before 1987 were unsuccessful. Even so, the results suggest that predicting delinquency rates from the economic outlook is feasible and could be meaningful given a more comprehensive selection of target variables and improved optimization techniques.
- Further work could retrain the models on a broader range of economic datasets in combination with the current ones. Additional gains may come from tuning model hyperparameters and from ensemble methods such as boosting, which combines several weak learners into a stronger one. With more time and resources, these steps could improve the models' predictive performance.
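The hyperparameter tuning and boosting ideas mentioned above could be explored along the following lines. This is a hypothetical sketch on synthetic data: the parameter grid, the choice of `GradientBoostingRegressor`, and the use of `TimeSeriesSplit` for validation are assumptions for illustration and have not been applied to the project's data.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)

# Synthetic features and a mildly nonlinear target (placeholder data).
X = rng.normal(size=(140, 5))
y = X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.3 * rng.normal(size=140)

# Small, assumed grid of boosting hyperparameters to search over.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

# TimeSeriesSplit always validates on observations that come after the
# training fold, which respects the time ordering of quarterly data.
search = GridSearchCV(
    GradientBoostingRegressor(random_state=0),
    param_grid,
    cv=TimeSeriesSplit(n_splits=4),
    scoring="neg_mean_squared_error",
)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best CV MSE:", -search.best_score_)
```

With so few observations, a small grid and shallow trees help limit overfitting to the validation folds.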
- Machine Learning in Banking Risk Management: A Literature Review (2019)
- Consumer Credit-Risk Models via Machine-Learning Algorithms (2010)
- Machine Learning for Corporate Default Risk: Multi-Period Prediction, Frailty Correlation, Loan Portfolios, and Tail Probabilities (2023)
- A Comparison of Prediction Methods for Credit Default on Peer to Peer Lending using Machine Learning (2019)
- Explainable Prediction of Loan Default Based on Machine Learning Models (2023)
- National Student Loans Default Risk Prediction: A Heterogeneous Ensemble Learning Approach and the SHAP Method (2023)