This work is for learning purposes only. It cannot be used for publication or in commercial products without the mentor’s consent.
COBRA stands for COmBined Regression Alternative. It is a method for combining several initial estimators of the regression function. Instead of building a linear or convex optimized combination over a collection of basic estimators r1, ..., rM, it uses them as a collective indicator of the proximity between the training data and a test observation. This local distance approach is model-free and very fast. More specifically, the resulting nonparametric, nonlinear combined estimator is shown to perform asymptotically at least as well, in the L2 sense, as the best combination of the basic estimators in the collective.
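The proximity idea described above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the project's implementation: the function name `cobra_predict`, the parameters `epsilon` and `alpha`, and the toy machines are hypothetical, the machines are assumed to be already fitted on a separate part of the training data, and the fallback used when no training point is selected is a choice made for this sketch. It shows the regression form only; how the notebooks adapt the output to credit class labels is not covered here.

```python
import numpy as np

def cobra_predict(machines, X_agg, y_agg, X_test, epsilon, alpha=1.0):
    """COBRA-style combination (illustrative sketch, not the project's code).

    machines : list of already-fitted predictors, each mapping an (n, d) array
               to an (n,) array of predictions (hypothetical interface).
    X_agg, y_agg : held-out part of the training data used for aggregation.
    epsilon : how close two predictions must be to count as "in agreement".
    alpha : fraction of machines that must agree (1.0 means all of them).
    """
    # Predictions of every machine, shapes (M, n_agg) and (M, n_test).
    P_agg = np.array([m(X_agg) for m in machines])
    P_test = np.array([m(X_test) for m in machines])
    M = len(machines)

    # agree[m, j, i] is True when machine m puts test point j within
    # epsilon of aggregation point i.
    agree = np.abs(P_test[:, :, None] - P_agg[:, None, :]) <= epsilon

    # Keep aggregation points on which at least alpha * M machines agree.
    selected = agree.sum(axis=0) >= alpha * M          # (n_test, n_agg)
    counts = selected.sum(axis=1)                      # (n_test,)
    sums = selected.astype(float) @ np.asarray(y_agg, dtype=float)

    # Assumption of this sketch: if no point is selected, fall back to the
    # average of the machines' own predictions for that test point.
    fallback = P_test.mean(axis=0)
    return np.where(counts > 0, sums / np.maximum(counts, 1), fallback)

# Toy usage with two hand-written "machines" (hypothetical, for illustration).
machines = [lambda X: X[:, 0], lambda X: 2.0 * X[:, 0]]
X_agg = np.array([[0.0], [1.0], [2.0], [10.0]])
y_agg = np.array([0.0, 1.0, 5.0, 10.0])

unanimous = cobra_predict(machines, X_agg, y_agg, np.array([[1.0]]), epsilon=1.0)
majority = cobra_predict(machines, X_agg, y_agg, np.array([[1.0]]), epsilon=1.0, alpha=0.5)
```

With `alpha=1.0` only the aggregation points that every machine considers close to the test point contribute to the average; lowering `alpha` relaxes this to a fraction of the machines, which selects more points at the cost of a less local estimate.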
This project was created by AB Satyraprakash, Kartikay Goel, Samiksha Sachdeva, Jatin Dhingra, and Himanshu Yadav, and was mentored by Prof. Arabin Kumar Dey.
This project uses two well-known credit risk datasets: the German credit analysis dataset and the Australian credit analysis dataset. It applies COBRA to combine several estimators into a single prediction. Both datasets are taken from the UCI Machine Learning Repository.
Using COBRA, the following accuracies were obtained:
- 70.19% on the German dataset
- 85.53% on the Australian dataset
For the complete implementations on Google Colab, see these notebooks: Australian notebook, German notebook
The project is presented as a website. The backend (API with docs) is hosted here. The Australian dataset cannot be presented as a web app because its variable and value names are not available due to data confidentiality.
The following tech stack was used to implement and present the project.