lassoregression/us-delinquency-forecast

Forecasting U.S. Delinquency Rates using Economic Indicators (Py)

Problem Statement

The project aimed to forecast U.S. consumer loan delinquency rates using economic indicators from the FRED and WRDS databases. Regression models, specifically Lasso and Ridge regression, were used to address variable selection and multicollinearity among the predictors.

Language Used

  • Python

Target variables

  • Single Family Mortgage Loan
  • Commercial Real Estate Loan
  • Business Loans
  • Consumer Loans
  • Credit Card Loans

Predictors

Data series considered, grouped by source:

  1. US Bureau of Labor Statistics:
     • US Unemployment rate, monthly.
  2. University of Michigan Surveys of Consumers:
     • Index of Consumer Sentiment, quarterly (Table 1)
     • Expected Change in Financial Situation in a Year, quarterly (Table 8)
     • Expected Change in Unemployment During the Next Year, quarterly (Table 30)
  3. US Census Bureau:
     • US Housing Starts, single-unit and multi-unit, quarterly.
     • US Housing Completions, single-unit and multi-unit, quarterly.
  4. Bureau of Economic Analysis, US Department of Commerce:
     • Household personal savings, quarterly.
     • Undistributed corporate profits, quarterly.
  5. Economic Indicators:
     • GDP growth rates
     • GDP per capita
     • GNI growth rates
     • GNI per capita
     • Inflation rate
     • Interest rate
     • Government debt to GDP
     • Current account to GDP
     • Fiscal expenditure
     • CPI
     • Food inflation
     • Business confidence
     • Consumer confidence
     • Consumer credit

Data & Sources

The analysis focuses on four delinquency rates, available from Federal Reserve Economic Data (FRED):

  • Delinquency Rate on Consumer Loans (Q1 1987 to Q2 2023)
  • Delinquency Rate on Credit Card Loans (Q1 1991 to Q2 2023)
  • Delinquency Rate on Commercial Real Estate Loans excluding Farmland (Q1 1991 to Q2 2023)
  • Delinquency Rate on Business Loans (Q1 1987 to Q2 2023)

Together, these series cover the share of delinquent loans across consumer and business lending, activities that are typically among the larger sources of credit risk for banks.

Several macroeconomic indicators are expected to affect delinquency rates:

  • Refinitiv Eikon (available in the WRDS database): GDP growth, unemployment rate, consumer confidence levels
  • Federal Reserve Bank reports (available in the WRDS database): interest rates on commercial paper, short-term business lending rates, prime lending rates, swap rates, yields on state and local bonds
  • Consumption, Aggregate Wealth, and Expected Stock Returns (CAY data as computed by Martin Lettau and Sydney Ludvigson, available in the WRDS database)

Limitations

  • Each target series has only ~140 data points.

  • The feature series have more data points than the target series (for example, the unemployment rate is monthly while the delinquency rates are quarterly), so they had to be transformed to align with the target variables.
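The alignment step described above can be sketched as follows. This is a minimal illustration, not the project's actual preprocessing: the monthly series is synthetic, the name `unemployment` is a placeholder, and averaging the three months of each quarter is an assumed aggregation rule (end-of-quarter values would be another reasonable choice).

```python
import numpy as np
import pandas as pd

# Synthetic monthly predictor standing in for a series such as the
# unemployment rate (values are made up for illustration).
months = pd.date_range("1987-01-01", "2023-06-01", freq="MS")
rng = np.random.default_rng(0)
monthly = pd.Series(5.0 + rng.normal(0.0, 0.5, len(months)),
                    index=months, name="unemployment")

# Assumed aggregation rule: average the three months in each quarter.
quarterly = monthly.resample("QS").mean()

# Number of quarterly observations in the two target sample windows.
n_from_1987 = len(pd.date_range("1987-01-01", "2023-04-01", freq="QS"))
n_from_1991 = len(pd.date_range("1991-01-01", "2023-04-01", freq="QS"))

print(len(monthly), len(quarterly), n_from_1987, n_from_1991)
```

Counting the quarters this way gives 146 observations for the series starting in Q1 1987 and 130 for those starting in Q1 1991, which is consistent with the "~140 data points" noted above.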

Data Processing and Model Fitting

Identifying the correlation between the target variables

(Figure: time series of the various delinquency rates)

(Figure: correlation matrix of the delinquency rates)

From the above correlation matrix, we can infer that:

  • Single Family Mortgages are uncorrelated with other categories.
  • CRE Loans and Business Loans are highly correlated with each other.
  • Consumer Loans and Credit Card Loans are highly correlated with each other.
  • The two pairs are moderately correlated with one another.
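A correlation matrix of this kind can be computed with `DataFrame.corr()` in pandas. The sketch below uses synthetic data built to have a similar block structure; the column names and the two latent "cycle" factors are assumptions for illustration and are not the FRED series themselves.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 130  # roughly the number of quarterly observations per series

# Two made-up latent factors that drive the synthetic series.
business_cycle = rng.normal(size=n)
household_cycle = rng.normal(size=n)

# Hypothetical stand-ins for the five delinquency series.
rates = pd.DataFrame({
    "business": business_cycle + 0.2 * rng.normal(size=n),
    "cre": business_cycle + 0.2 * rng.normal(size=n),
    "consumer": household_cycle + 0.2 * rng.normal(size=n),
    "credit_card": household_cycle + 0.2 * rng.normal(size=n),
    "single_family": rng.normal(size=n),
})

# Pairwise Pearson correlations (the pandas default).
corr = rates.corr()
print(corr.round(2))
```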

Further processing of data variables and model fitting to derive the relationship

Models Fitted:

  • Linear Regression
  • Lasso Regression
  • Ridge Regression
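As a rough illustration of how the three models might be fitted, here is a minimal scikit-learn sketch. The data are synthetic, the `alpha` values are arbitrary, and the chronological 80/20 split and the standardization step are assumptions, not details taken from the project.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_obs, n_features = 140, 20  # about as many rows as one target series
X = rng.normal(size=(n_obs, n_features))
# Make two columns nearly collinear to mimic correlated indicators.
X[:, 1] = X[:, 0] + 0.01 * rng.normal(size=n_obs)

# Only three predictors truly matter in this made-up example.
true_coef = np.zeros(n_features)
true_coef[[0, 5, 9]] = [1.5, -2.0, 1.0]
y = X @ true_coef + 0.5 * rng.normal(size=n_obs)

# Chronological split: earlier rows train, later rows test (no shuffling).
split = int(0.8 * n_obs)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

models = {
    "linear": make_pipeline(StandardScaler(), LinearRegression()),
    "lasso": make_pipeline(StandardScaler(), Lasso(alpha=0.1)),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)  # out-of-sample R^2
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

Scaling the predictors before Lasso and Ridge matters because both penalties depend on coefficient magnitude, so unscaled inputs would be penalized unevenly.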

Results

(Figures: time series of model fit, panels 1 to 3)

Inference

  • The results indicate a poor model fit. This may be due to the small number of data points (quarterly observations from 1987 to 2023) available to fit the models. (See Limitations.)

  • Overall, Lasso Regression appears to provide a relatively better fit for predicting consumer loan delinquencies compared to other estimation methods.



(Figure: coefficient plot of all the variables for each model)
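To show what a coefficient comparison of this kind typically reveals, the sketch below fits Lasso and Ridge to synthetic data and counts exactly-zero coefficients. The data, the `alpha` values, and the variable names are illustrative assumptions; the numbers do not reproduce the project's plot.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n_obs, n_features = 140, 15
X = rng.normal(size=(n_obs, n_features))

# Made-up ground truth: only the first three predictors are relevant.
true_coef = np.zeros(n_features)
true_coef[:3] = [2.0, -1.5, 1.0]
y = X @ true_coef + 0.5 * rng.normal(size=n_obs)

X_std = StandardScaler().fit_transform(X)
lasso = Lasso(alpha=0.2).fit(X_std, y)
ridge = Ridge(alpha=10.0).fit(X_std, y)

# The L1 penalty can set coefficients exactly to zero (variable selection);
# the L2 penalty only shrinks them toward zero.
n_zero_lasso = int(np.sum(lasso.coef_ == 0.0))
n_zero_ridge = int(np.sum(ridge.coef_ == 0.0))

for j in range(n_features):
    print(f"x{j:02d}  lasso={lasso.coef_[j]: .3f}  ridge={ridge.coef_[j]: .3f}")
print("zero coefficients:", n_zero_lasso, "(lasso),", n_zero_ridge, "(ridge)")
```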

Performance Metrics of Regression Models Across Loan Categories
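One way such metrics could be produced is with an expanding-window evaluation, which respects the time ordering of the data. The sketch below is an assumption about method, not a description of how the project computed its metrics: the data are synthetic, and RMSE and R^2 are chosen here only as common examples.

```python
import numpy as np
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import TimeSeriesSplit

rng = np.random.default_rng(2)
X = rng.normal(size=(140, 10))
y = X[:, 0] - 0.5 * X[:, 3] + 0.3 * rng.normal(size=140)  # synthetic target

models = {
    "linear": LinearRegression(),
    "lasso": Lasso(alpha=0.05),
    "ridge": Ridge(alpha=1.0),
}

# Each fold trains on the past and tests on the block that follows it.
cv = TimeSeriesSplit(n_splits=5)
results = {}
for name, model in models.items():
    rmse, r2 = [], []
    for train_idx, test_idx in cv.split(X):
        model.fit(X[train_idx], y[train_idx])
        pred = model.predict(X[test_idx])
        rmse.append(np.sqrt(mean_squared_error(y[test_idx], pred)))
        r2.append(r2_score(y[test_idx], pred))
    results[name] = {"rmse": float(np.mean(rmse)), "r2": float(np.mean(r2))}

for name, metrics in results.items():
    print(f"{name:>6}: RMSE={metrics['rmse']:.3f}  R^2={metrics['r2']:.3f}")
```

Running the same loop once per loan category would give one row of metrics per model and category.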

Conclusion and Further Exploration

  • The current models exhibit poor fit due to the limited dataset available. Efforts to locate additional data prior to 1987 were unsuccessful. Despite this, the results indicate that predicting delinquency rates based on economic outlook is feasible and meaningful with a more comprehensive selection of target variables and improved optimization techniques.

  • Further exploration could involve retraining the models using a broader range of economic datasets in combination with the current ones. Additional improvements can be achieved by fine-tuning model hyperparameters and employing ensemble methods such as boosting, which combines several weak learners for better performance. These steps could significantly enhance the model's predictive capabilities with additional time and resources.
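The two directions mentioned above, hyperparameter tuning and boosting, could look roughly like this in scikit-learn. Everything here is a sketch under assumptions: the data are synthetic, the `alpha` grid and the boosting settings are arbitrary starting points, and `GradientBoostingRegressor` is just one of several boosting implementations that could be tried.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import GridSearchCV, TimeSeriesSplit
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(3)
X = rng.normal(size=(140, 12))
y = 1.2 * X[:, 0] - 0.8 * X[:, 4] + 0.4 * rng.normal(size=140)  # synthetic

# Tune the Lasso penalty with time-ordered cross-validation.
alphas = np.logspace(-3, 0, 10)
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("lasso", Lasso(max_iter=10_000)),
])
search = GridSearchCV(
    pipe,
    param_grid={"lasso__alpha": alphas},
    cv=TimeSeriesSplit(n_splits=5),
    scoring="neg_mean_squared_error",
)
search.fit(X, y)
best_alpha = float(search.best_params_["lasso__alpha"])
print("best alpha:", best_alpha)

# Boosting: an ensemble of shallow trees fitted sequentially.
split = 112  # first 80% of rows for training, the rest held out
gbr = GradientBoostingRegressor(
    n_estimators=200, learning_rate=0.05, max_depth=2, random_state=0
)
gbr.fit(X[:split], y[:split])
gbr_r2 = float(gbr.score(X[split:], y[split:]))
print("boosting hold-out R^2:", round(gbr_r2, 3))
```

With only about 140 observations, shallow trees and a small learning rate are a cautious default, since a flexible ensemble can overfit a short sample easily.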

Literature Reference
