Comparative analysis of commonly used ML algorithms for detecting fraudulent credit card transactions
The aim of this project is to build a classifier that detects fraudulent credit card transactions, using several machine learning algorithms: logistic regression, decision trees, an artificial neural network and gradient boosting. It then determines which algorithm gives the best results for this use case on real-life credit card transaction data, and so which one should be recommended for a real-life application.
- Logistic Regression
- Decision Tree
- Artificial Neural Network
- Gradient Boosting
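As a rough illustration of how such a comparison can be set up, here is a minimal sketch that fits the four algorithms listed above and scores each one by area under the ROC curve. It assumes scikit-learn, which is not necessarily the toolchain used in this project (see the PDF for the actual workings), and it uses a small synthetic imbalanced dataset from `make_classification` as a stand-in for the real transactions. The hyperparameters and variable names are illustrative choices only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic, imbalanced stand-in for the real transaction data (assumption):
# roughly 5% of the samples belong to the positive ("fraud") class.
X, y = make_classification(
    n_samples=3000, n_features=10, n_informative=5,
    weights=[0.95, 0.05], random_state=0,
)

# Stratify so that train and test keep the same fraud ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# One model per algorithm in the list; settings are illustrative, not tuned.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=5, random_state=0),
    "Artificial Neural Network": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
    ),
    "Gradient Boosting": GradientBoostingClassifier(random_state=0),
}

auc_by_model = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    # Probability of the positive class is the score used for the ROC curve.
    fraud_scores = model.predict_proba(X_test)[:, 1]
    auc_by_model[name] = roc_auc_score(y_test, fraud_scores)

# Print the models from highest to lowest AUC.
for name, auc_value in sorted(auc_by_model.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: AUC = {auc_value:.3f}")
```

The numbers printed here say nothing about the real dataset; the sketch only shows the shape of the comparison (same split, same metric, one score per algorithm).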
Area under the ROC (Receiver Operating Characteristic) curve.
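The ROC curve is a graphical plot that shows the diagnostic ability of a binary classifier, in our case fraud versus not fraud. It is created by plotting the true positive rate (TPR) against the false positive rate (FPR) as the classification threshold varies, so it shows the trade-off between sensitivity (TPR) and specificity (1 - FPR). The closer the curve is to the top-left corner of the graph, the better the model separates the two classes; the closer it is to the 45-degree diagonal, the closer the model is to random guessing. The area under the curve (AUC) summarizes this in a single number: 1.0 is a perfect classifier and 0.5 is no better than chance.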
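To make the construction concrete, the following is a small dependency-free sketch of how the ROC points and the area under them can be computed from labels and scores. The function names `roc_points` and `area_under_curve` and the four-sample toy data are made up for illustration; in practice a library routine would normally be used instead.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) points obtained by sweeping the decision threshold."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    # Sort by descending score so each step lowers the threshold.
    ranked = sorted(zip(scores, y_true), key=lambda pair: -pair[0])

    points = [(0.0, 0.0)]  # threshold above every score: nothing flagged
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples with the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Integrate the curve with the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Toy example: 1 = fraud, 0 = not fraud; scores are predicted fraud probabilities.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]

points = roc_points(y_true, scores)
toy_auc = area_under_curve(points)
print(points)   # [(0.0, 0.0), (0.0, 0.5), (0.5, 0.5), (0.5, 1.0), (1.0, 1.0)]
print(toy_auc)  # 0.75
```

In the toy data one legitimate transaction (score 0.4) outranks one fraudulent transaction (score 0.35), which is why the curve dips away from the top-left corner and the AUC falls below 1.0.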
I have used a real-life credit card transaction dataset from the data-flair website, which is also available in this repository. It contains a log of 284,807 real-life credit card transactions, each labeled as fraudulent or not.
For the complete workings, please refer to the PDF file.
GitHub may have trouble showing the preview in the browser, since the file contains several large models and images. In that case, kindly use the download button.