The-Age-of-AI-Quants

A growing repository for those who are passionate about Quants, Artificial Intelligence, Deep learning, Machine Learning, Risk and Finance.

A growing collection of Python notebooks, papers, articles and various resources at the confluence where Artificial Intelligence and Machine Learning meet Quants, Risk and Finance.

Disclaimer: This is a curated list from different resources, including several GitHub pages, for knowledge dissemination only (non-commercial use). The license, if applicable, belongs to the respective authors/code providers.

List of notebooks showing deep learning + reinforcement learning models

  • Stock-Prediction-Models - Stock prediction Models.
  • AI Trading - AI to predict stock market movements
  • FinRL-Library - Started by Columbia University engineering students and designed as an end-to-end deep reinforcement learning library for automated trading platforms. Implements DQN, DDQN, DDPG, etc. using PyTorch and gym, and uses pyfolio to show backtesting stats. Big contributions on Proximal Policy Optimization (PPO), advantage actor critic (A2C) and Deep Deterministic Policy Gradient (DDPG) agents for trading.
  • Deep Learning IV - Bulbea: Deep Learning based Python Library
  • RLTrader - Predecessor to tensortrade. Uses OpenAI gym and a neat way to render matplotlib plots in real time. Also explains LSTM, data stationarity and Bayesian optimization using Optuna, etc.
  • Deep Learning III- Algorithmic trading with deep learning experiments.
  • Personae - Implementation of deep reinforcement learning and supervised learning, covering deep deterministic policy gradient (DDPG), DDQN, etc. Data is pulled from rqalpha, a Python backtest engine, and there is a nice Docker image to run training/testing.
  • RL Trading- A collection of 25+ Reinforcement Learning Trading Strategies -Google Colab.
  • Deep-Reinforcement-Learning-for-Automated-Stock-Trading-Ensemble-Strategy-ICAIF-2020 - Part of FinRL; provides the code for the paper "Deep Reinforcement Learning for Automated Stock Trading", which focuses on an ensemble strategy.
  • awesome-deep-trading - Curated list of papers/repos on topics like CNN/LSTM/GAN/Reinforcement Learning etc. Categorized as deep learning for now, but there are other topics here. Manually maintained by cbailes.
  • Deep Learning - Technical experimentations to beat the stock market using deep learning.
  • LSTM Recurrent - OHLC Average Prediction of Apple Inc. Using LSTM Recurrent Neural Network.
  • RL III - GitHub - Deep Reinforcement Learning based Trading Agent for Bitcoin.
  • crypto-rl - Retrieves limit order book level data from Coinbase Pro and Bitfinex, records it in the arctic timeseries database, then implements trend following strategies (market orders) and market making (limit orders). Uses reinforcement learning (DQN) with keras-rl to create agents, and uses OpenAI gym to implement a POMDP (partially observable Markov decision process).

Other Models

Data Processing Techniques and Transformations

  • advances in financial machine learning - Exercises for the book. Relevant topics include data cleaning and outlier detection (using MAD).
  • Google-Finance-Stock-Data-Analysis - Data processing platform that streams data from Kafka. The example shows two incoming data streams (stocks and tweets); two Spark streams are created to consume the Kafka data, and the end results are stored in Cassandra. Uses an older tech stack and is not actively maintained.
  • Twitter-Trends - Sentiment analysis based on Twitter data. Relevant topics include data cleaning, tokenization and data aggregation using MongoDB, etc.
  • finserv-application-blueprint - Generates streamable data using the MapR converged data platform, built mostly in Java. Uses Apache Zeppelin for web visualization.
  • cointrader - Java-based platform for trading crypto. Relevant sections include using Esper event queries to transform data and place orders.
  • CryptoNets - A demonstration of the use of neural networks over data encrypted with homomorphic encryption. Homomorphic encryption allows performing operations such as addition and multiplication over data while it is encrypted.
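
The outlier detection "using MAD" mentioned in the list above usually refers to the median absolute deviation. A minimal sketch, assuming the common modified z-score rule (the 0.6745 scaling constant and the 3.5 cutoff are conventional choices, not taken from the linked repository; the function name `mad_outliers` and the sample data are hypothetical):

```python
from statistics import median


def mad_outliers(values, threshold=3.5):
    """Return indices of points whose modified z-score exceeds `threshold`."""
    med = median(values)
    abs_dev = [abs(v - med) for v in values]
    mad = median(abs_dev)  # median absolute deviation
    if mad == 0:
        # No spread around the median, so this rule cannot flag anything.
        return []
    # 0.6745 rescales the MAD so it is comparable to a standard deviation
    # when the data are roughly normal (assumed convention).
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]


# Illustrative daily returns with one obviously bad print at index 5.
returns = [0.001, -0.002, 0.0015, 0.0005, -0.001, 0.25, 0.002]
flagged = mad_outliers(returns)
print(flagged)
```

Because both the centre and the spread are medians, a single extreme value barely moves them, which is why this rule tends to be preferred over mean/standard-deviation filters for noisy price data.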

Portfolio Selection and Optimisation

Factor and Risk Analysis

Techniques

Unsupervised

Textual

Other Assets

Derivatives and Hedging

Fixed Income

Alternative Finance

Extended Research

Courses

Data

Colleges, Centers and Departments

Languages

Python

Numerical Libraries & Data Structures

  • numpy - NumPy is the fundamental package for scientific computing with Python.
  • scipy - SciPy (pronounced “Sigh Pie”) is a Python-based ecosystem of open-source software for mathematics, science, and engineering.
  • pandas - pandas is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
  • quantdsl - Domain specific language for quantitative analytics in finance and trading.
  • statistics - Builtin Python library for all basic statistical calculations.
  • sympy - SymPy is a Python library for symbolic mathematics.
  • pymc3 - Probabilistic Programming in Python: Bayesian Modeling and Probabilistic Machine Learning with Theano.

Financial Instruments and Pricing

  • PyQL - QuantLib's Python port.
  • pyfin - Basic options pricing in Python. [ARCHIVED]
  • vollib - vollib is a python library for calculating option prices, implied volatility and greeks.
  • QuantPy - A framework for quantitative finance in Python.
  • Finance-Python - Python tools for Finance.
  • ffn - A financial function library for Python.
  • pynance - PyNance is open-source software for retrieving, analysing and visualizing data from stock and derivatives markets.
  • tia - Toolkit for integration and analysis.
  • hasura/base-python-dash - Hasura quickstart to deploy Dash framework. Written on top of Flask, Plotly.js, and React.js, Dash is ideal for building data visualization apps with highly custom user interfaces in pure Python.
  • hasura/base-python-bokeh - Hasura quickstart to visualize data with bokeh library.
  • pysabr - SABR model Python implementation.
  • FinancePy - A Python Finance Library that focuses on the pricing and risk-management of Financial Derivatives, including fixed-income, equity, FX and credit derivatives.
  • gs-quant - Python toolkit for quantitative finance
  • willowtree - Robust and flexible Python implementation of the willow tree lattice for derivatives pricing.
  • financial-engineering - Applications of Monte Carlo methods to financial engineering projects, in Python.
  • optlib - A library for financial options pricing written in Python.
  • tf-quant-finance - High-performance TensorFlow library for quantitative finance.

Indicators

Trading & Backtesting

  • TA-Lib - perform technical analysis of financial market data.
  • trade - trade is a Python framework for the development of financial applications.
  • zipline - Pythonic algorithmic trading library.
  • QuantSoftware Toolkit - Python-based open source software framework designed to support portfolio construction and management.
  • quantitative - Quantitative finance, and backtesting library.
  • analyzer - Python framework for real-time financial and backtesting trading strategies.
  • bt - Flexible Backtesting for Python.
  • backtrader - Python Backtesting library for trading strategies.
  • pythalesians - Python library to backtest trading strategies, plot charts, seamlessly download market data, analyse market patterns etc.
  • pybacktest - Vectorized backtesting framework in Python / pandas, designed to make your backtesting easier.
  • pyalgotrade - Python Algorithmic Trading Library.
  • tradingWithPython - A collection of functions and classes for Quantitative trading.
  • Pandas TA - Pandas TA is an easy to use Python 3 Pandas Extension with 115+ Indicators. Easily build Custom Strategies.
  • ta - Technical Analysis Library using Pandas (Python)
  • algobroker - This is an execution engine for algo trading.
  • pysentosa - Python API for sentosa trading system.
  • finmarketpy - Python library for backtesting trading strategies and analyzing financial markets.
  • binary-martingale - Computer program to automatically trade binary options martingale style.
  • fooltrader - Uses big-data technology to provide a uniform way to analyze the whole market.
  • zvt - Uses SQL and pandas to provide a uniform and extensible way to record data, compute factors, select securities, backtest and trade in real time, and shows all of them in clear real-time charts.
  • pylivetrader - zipline-compatible live trading library.
  • pipeline-live - zipline's pipeline capability with IEX for live trading.
  • zipline-extensions - Zipline extensions and adapters for QuantRocket.
  • moonshot - Vectorized backtester and trading engine for QuantRocket based on Pandas.
  • PyPortfolioOpt - Financial portfolio optimisation in python, including classical efficient frontier and advanced methods.
  • Eiten - Eiten is an open source toolkit by Tradytics that implements various statistical and algorithmic investing strategies such as Eigen Portfolios, Minimum Variance Portfolios, Maximum Sharpe Ratio Portfolios, and Genetic Algorithms based Portfolios.
  • riskparity.py - fast and scalable design of risk parity portfolios with TensorFlow 2.0
  • mlfinlab - Implementations regarding "Advances in Financial Machine Learning" by Marcos Lopez de Prado. (Feature Engineering, Financial Data Structures, Meta-Labeling)
  • pyqstrat - A fast, extensible, transparent python library for backtesting quantitative strategies.
  • NowTrade - Python library for backtesting technical/mechanical strategies in the stock and currency markets.
  • pinkfish - A backtester and spreadsheet library for security analysis.
  • aat - Async Algorithmic Trading Engine
  • Backtesting.py - Backtest trading strategies in Python
  • catalyst - An Algorithmic Trading Library for Crypto-Assets in Python
  • quantstats - Portfolio analytics for quants, written in Python
  • qtpylib - QTPyLib, Pythonic Algorithmic Trading http://qtpylib.io
  • Quantdom - Python-based framework for backtesting trading strategies & analyzing financial markets [GUI]
  • freqtrade - Free, open source crypto trading bot
  • algorithmic-trading-with-python - Free pandas and scikit-learn resources for trading simulation, backtesting, and machine learning on financial data.
  • DeepDow - Portfolio optimization with deep learning
  • Qlib - An AI-oriented Quantitative Investment Platform by Microsoft. Full ML pipeline of data processing, model training, back-testing; and covers the entire chain of quantitative investment: alpha seeking, risk modeling, portfolio optimization, and order execution.
  • machine-learning-for-trading - Code and resources for Machine Learning for Algorithmic Trading
  • AlphaPy - Automated Machine Learning [AutoML] with Python, scikit-learn, Keras, XGBoost, LightGBM, and CatBoost
  • jesse - An advanced crypto trading bot written in Python
  • rqalpha - An extendable, replaceable Python algorithmic backtest and trading framework supporting multiple securities.
  • FinRL-Library - A Deep Reinforcement Learning Library for Automated Trading in Quantitative Finance. NeurIPS 2020.
  • bulbea - Deep Learning based Python Library for Stock Market Prediction and Modelling.
  • ib_nope - Automated trading system for NOPE strategy over IBKR TWS.
  • OctoBot - Open source cryptocurrency trading bot for high frequency, arbitrage, TA and social trading with an advanced web interface.
  • bta-lib - Technical Analysis library in pandas for backtesting algotrading and quantitative analysis.
  • Stock-Prediction-Models - Gathers machine learning and deep learning models for Stock forecasting including trading bots and simulations.
  • tda-api - Gather data and trade equities, options, and ETFs via TDAmeritrade.

Risk Analysis

  • pyfolio - Portfolio and risk analytics in Python.
  • empyrical - Common financial risk and performance metrics.
  • fecon235 - Computational tools for financial economics, including a Gaussian Mixture model of leptokurtic risk and adaptive Boltzmann portfolios.
  • finance - Financial Risk Calculations. Optimized for ease of use through class construction and operator overload.
  • qfrm - Quantitative Financial Risk Management: awesome OOP tools for measuring, managing and visualizing risk of financial instruments and portfolios.
  • visualize-wealth - Portfolio construction and quantitative analysis.
  • VisualPortfolio - This tool is used to visualize the performance of a portfolio.
  • universal-portfolios - Collection of algorithms for online portfolio selection.
  • FinQuant - A program for financial portfolio management, analysis and optimisation.
  • Empyrial - Portfolio's risk and performance analytics and returns predictions.

Factor Analysis

  • alphalens - Performance analysis of predictive alpha factors.
  • Spectre - GPU-accelerated Factors analysis library and Backtester

Time Series

  • ARCH - ARCH models in Python.
  • statsmodels - Python module that allows users to explore data, estimate statistical models, and perform statistical tests.
  • dynts - Python package for timeseries analysis and manipulation.
  • PyFlux - Python library for timeseries modelling and inference (frequentist and Bayesian) on models.
  • tsfresh - Automatic extraction of relevant features from time series.
  • hasura/quandl-metabase - Hasura quickstart to visualize Quandl's timeseries datasets with Metabase.
  • Facebook Prophet - Tool for producing high quality forecasts for time series data that has multiple seasonality with linear or non-linear growth.
  • tsmoothie - A python library for time-series smoothing and outlier detection in a vectorized way.

Calendars

Data Sources

  • yfinance - Yahoo! Finance market data downloader (+faster Pandas Datareader)
  • findatapy - Python library to download market data via Bloomberg, Quandl, Yahoo etc.
  • googlefinance - Python module to get real-time stock data from Google Finance API.
  • yahoo-finance - Python module to get stock data from Yahoo! Finance.
  • pandas-datareader - Python module to get data from various sources (Google Finance, Yahoo Finance, FRED, OECD, Fama/French, World Bank, Eurostat...) into Pandas datastructures such as DataFrame, Panel with a caching mechanism.
  • pandas-finance - High level API for access to and analysis of financial data.
  • pyhoofinance - Rapidly queries Yahoo Finance for multiple tickers and returns typed data for analysis.
  • yfinanceapi - Finance API for Python.
  • yql-finance - yql-finance is simple and fast. The API returns stock closing prices for the current period of time and the current stock ticker (e.g. AAPL, GOOGL).
  • ystockquote - Retrieve stock quote data from Yahoo Finance.
  • wallstreet - Real time stock and option data.
  • stock_extractor - General Purpose Stock Extractors from Online Resources.
  • Stockex - Python wrapper for Yahoo! Finance API.
  • finsymbols - Obtains stock symbols and relating information for SP500, AMEX, NYSE, and NASDAQ.
  • FRB - Python Client for FRED® API.
  • inquisitor - Python Interface to Econdb.com API.
  • yfi - Yahoo! YQL library.
  • chinesestockapi - Python API to get Chinese stock price.
  • exchange - Get current exchange rate.
  • ticks - Simple command line tool to get stock ticker data.
  • pybbg - Python interface to Bloomberg COM APIs.
  • ccy - Python module for currencies.
  • tushare - A utility for crawling historical and Real-time Quotes data of China stocks.
  • jsm - Get the japanese stock market data.
  • cn_stock_src - Utility for retrieving basic China stock data from different sources.
  • coinmarketcap - Python API for coinmarketcap.
  • after-hours - Obtain pre market and after hours stock prices for a given symbol.
  • bronto-python - Bronto API Integration for Python.
  • pytdx - Python Interface for retrieving chinese stock realtime quote data from TongDaXin Nodes.
  • pdblp - A simple interface to integrate pandas and the Bloomberg Open API.
  • tiingo - Python interface for daily composite prices/OHLC/Volume + Real-time News Feeds, powered by the Tiingo Data Platform.
  • iexfinance - Python Interface for retrieving real-time and historical prices and equities data from The Investor's Exchange.
  • pyEX - Python interface to IEX with emphasis on pandas, support for streaming data, premium data, points data (economic, rates, commodities), and technical indicators.
  • alpaca-trade-api - Python interface for retrieving real-time and historical prices from Alpaca API as well as trade execution.
  • metatrader5 - API Connector to MetaTrader 5 Terminal
  • akshare - AkShare is an elegant and simple financial data interface library for Python, built for human beings! https://akshare.readthedocs.io
  • yahooquery - Python interface for retrieving data through unofficial Yahoo Finance API.
  • investpy - Financial Data Extraction from Investing.com with Python! https://investpy.readthedocs.io/
  • yliveticker - Live stream of market data from Yahoo Finance websocket.
  • bbgbridge - Easy to use Bloomberg Desktop API wrapper for Python.
  • alpha_vantage - A python wrapper for Alpha Vantage API for financial data.
  • FinanceDataReader - Open Source Financial data reader for U.S, Korean, Japanese, Chinese, Vietnamese Stocks

Excel Integration

  • xlwings - Make Excel fly with Python.
  • openpyxl - Read/Write Excel 2007 xlsx/xlsm files.
  • xlrd - Library for developers to extract data from Microsoft Excel spreadsheet files.
  • xlsxwriter - Write files in the Excel 2007+ XLSX file format.
  • xlwt - Library to create spreadsheet files compatible with MS Excel 97/2000/XP/2003 XLS files, on any platform.
  • DataNitro - DataNitro also offers full-featured Python-Excel integration, including UDFs. Trial downloads are available, but users must purchase a license.
  • xlloop - XLLoop is an open source framework for implementing Excel user-defined functions (UDFs) on a centralised server (a function server).
  • expy - The ExPy add-in allows easy use of Python directly from within an Microsoft Excel spreadsheet, both to execute arbitrary code and to define new Excel functions.
  • pyxll - PyXLL is an Excel add-in that enables you to extend Excel using nothing but Python code.

Visualization

  • D-Tale - Visualizer for pandas dataframes and xarray datasets.
  • mplfinance - matplotlib utilities for the visualization, and visual analysis, of financial data.
  • finplot - Performant and effortless finance plotting for Python.
  • finvizfinance - Finviz analysis python library.

R

Numerical Libraries & Data Structures

  • xts - eXtensible Time Series: Provide for uniform handling of R's different time-based data classes by extending zoo, maximizing native format information preservation and allowing for user level customization and extension, while simplifying cross-class interoperability.
  • data.table - Extension of data.frame: Fast aggregation of large data (e.g. 100GB in RAM), fast ordered joins, fast add/modify/delete of columns by group using no copies at all, list columns and a fast file reader (fread). Offers a natural and flexible syntax, for faster development.
  • sparseEigen - Sparse principal component analysis.
  • TSdbi - Provides a common interface to time series databases.
  • tseries - Time Series Analysis and Computational Finance.
  • zoo - S3 Infrastructure for Regular and Irregular Time Series (Z's Ordered Observations).
  • tis - Functions and S3 classes for time indexes and time indexed series, which are compatible with FAME frequencies.
  • tfplot - Utilities for simple manipulation and quick plotting of time series data.
  • tframe - A kernel of functions for programming time series methods in a way that is relatively independently of the representation of time.

Data Sources

  • IBrokers - Provides native R access to Interactive Brokers Trader Workstation API.
  • Rblpapi - An R Interface to 'Bloomberg' is provided via the 'Blp API'.
  • Quandl - Get Financial Data Directly Into R.
  • Rbitcoin - Unified markets API interface (bitstamp, kraken, btce, bitmarket).
  • GetTDData - Downloads and aggregates data for Brazilian government issued bonds directly from the website of Tesouro Direto.
  • GetHFData - Downloads and aggregates high frequency trading data for Brazilian instruments directly from Bovespa ftp site.
  • Reddit WallstreetBets API - Provides daily top 50 stocks from reddit (subreddit) Wallstreetbets and their sentiments via the API
  • td - Interfaces the 'twelvedata' API for stocks and (digital and standard) currencies

Financial Instruments and Pricing

  • RQuantLib - RQuantLib connects GNU R with QuantLib.
  • quantmod - Quantitative Financial Modelling Framework.
  • Rmetrics - The premier open source software solution for teaching and training quantitative finance.
  • portfolio - Analysing equity portfolios.
  • portfolioSim - Framework for simulating equity portfolio strategies.
  • sparseIndexTracking - Portfolio design to track an index.
  • covFactorModel - Covariance matrix estimation via factor models.
  • riskParityPortfolio - Blazingly fast design of risk parity portfolios.
  • sde - Simulation and Inference for Stochastic Differential Equations.
  • YieldCurve - Modelling and estimation of the yield curve.
  • SmithWilsonYieldCurve - Constructs a yield curve by the Smith-Wilson method from a table of LIBOR and SWAP rates.
  • ycinterextra - Yield curve or zero-coupon prices interpolation and extrapolation.
  • AmericanCallOpt - This package includes pricing function for selected American call options with underlying assets that generate payouts.
  • VarSwapPrice - Pricing a variance swap on an equity index.
  • RND - Risk Neutral Density Extraction Package.
  • LSMonteCarlo - American options pricing with Least Squares Monte Carlo method.
  • OptHedging - Estimation of value and hedging strategy of call and put options.
  • tvm - Time Value of Money Functions.
  • OptionPricing - Option Pricing with Efficient Simulation Algorithms.
  • credule - Credit Default Swap Functions.
  • derivmkts - Functions and R Code to Accompany Derivatives Markets.
  • FinCal - Package for time value of money calculation, time series analysis and computational finance.
  • r-quant - R code for quantitative analysis in finance.
  • options.studies - options trading studies functions for use with options.data package and shiny.

Portfolio Optimization

  • PortfolioAnalytics - Portfolio Analysis, Including Numerical Methods for Optimization of Portfolios

Trading

  • TA-Lib - perform technical analysis of financial market data.
  • backtest - Exploring Portfolio-Based Conjectures About Financial Instruments.
  • pa - Performance Attribution for Equity Portfolios.
  • TTR - Technical Trading Rules.
  • QuantTools - Enhanced Quantitative Trading Modelling.
  • blotter - Transaction infrastructure for defining instruments, transactions, portfolios and accounts for trading systems and simulation. Provides portfolio support for multi-asset class and multi-currency portfolios. Actively maintained and developed.

Backtesting

  • quantstrat - Transaction-oriented infrastructure for constructing trading systems and simulation. Provides support for multi-asset class and multi-currency portfolios for backtesting and other financial research.

Risk Analysis

Factor Analysis

  • FactorAnalytics - The FactorAnalytics package contains fitting and analysis methods for the three main types of factor models used in conjunction with portfolio construction, optimization and risk management, namely fundamental factor models, time series factor models and statistical factor models.
  • Expected Returns - Solutions for enhancing portfolio diversification and replications of seminal papers with R, most of which are discussed in one of the best investment references of the recent decade, Expected Returns: An Investors Guide to Harvesting Market Rewards by Antti Ilmanen.

Time Series

  • tseries - Time Series Analysis and Computational Finance.
  • zoo - S3 Infrastructure for Regular and Irregular Time Series (Z's Ordered Observations).
  • xts - eXtensible Time Series.
  • fGarch - Rmetrics - Autoregressive Conditional Heteroskedastic Modelling.
  • timeSeries - Rmetrics - Financial Time Series Objects.
  • rugarch - Univariate GARCH Models.
  • rmgarch - Multivariate GARCH Models.
  • tidypredict - Run predictions inside the database https://tidypredict.netlify.com/.
  • tidyquant - Bringing financial analysis to the tidyverse.
  • timetk - A toolkit for working with time series in R.
  • tibbletime - Built on top of the tidyverse, tibbletime is an extension that allows for the creation of time aware tibbles through the setting of a time index.
  • matrixprofile - Time series data mining library built on top of the novel Matrix Profile data structure and algorithms.
  • garchmodels - A parsnip backend for GARCH models.

Calendars

  • timeDate - Chronological and Calendar Objects
  • bizdays - Business days calculations and utilities

Matlab

FrameWorks

  • QUANTAXIS - Integrated Quantitative Toolbox with Matlab.

Julia

  • QuantLib.jl - Quantlib implementation in pure Julia.
  • FinancialMarkets.jl - Describe and model financial markets objects using Julia.
  • Ito.jl - A Julia package for quantitative finance.
  • TALib.jl - A Julia wrapper for TA-Lib.
  • Miletus.jl - A financial contract definition, modeling language, and valuation framework.
  • Temporal.jl - Flexible and efficient time series class & methods.
  • Indicators.jl - Financial market technical analysis & indicators on top of Temporal.
  • Strategems.jl - Quantitative systematic trading strategy development and backtesting.
  • TimeSeries.jl - Time series toolkit for Julia.
  • MarketTechnicals.jl - Technical analysis of financial time series on top of TimeSeries.
  • MarketData.jl - Time series market data.
  • TimeFrames.jl - A Julia library that defines TimeFrame (essentially for resampling TimeSeries).

Java

  • Strata - Modern open-source analytics and market risk library designed and written in Java.
  • JQuantLib - JQuantLib is a free, open-source, comprehensive framework for quantitative finance, written in 100% Java.
  • finmath.net - Java library with algorithms and methodologies related to mathematical finance.
  • quantcomponents - Free Java components for Quantitative Finance and Algorithmic Trading.
  • DRIP - Fixed Income, Asset Allocation, Transaction Cost Analysis, XVA Metrics Libraries.

JavaScript

  • finance.js - A JavaScript library for common financial calculations.
  • portfolio-allocation - PortfolioAllocation is a JavaScript library designed to help constructing financial portfolios made of several assets: bonds, commodities, cryptocurrencies, currencies, exchange traded funds (ETFs), mutual funds, stocks...
  • Ghostfolio - Wealth management software to keep track of financial assets like stocks, ETFs or cryptocurrencies and make solid, data-driven investment decisions.

Data Visualization

Haskell

  • quantfin - quant finance in pure haskell.
  • hqfl - Haskell Quantitative Finance Library.
  • Haxcel - Excel Addin for Haskell

Scala

  • QuantScale - Scala Quantitative Finance Library.
  • Scala Quant - Scala library for working with stock data from IFTTT recipes or Google Finance.

Ruby

  • Jiji - Open Source Forex algorithmic trading framework using OANDA REST API.

Elixir/Erlang

  • Tai - Open Source composable, real time, market data and trade execution toolkit.
  • Workbench - From Idea to Execution - Manage your trading operation across a globally distributed cluster
  • Prop - An open and opinionated trading platform using productive & familiar open source libraries and tools for strategy research, execution and operation.

Golang

  • Kelp - Kelp is an open-source Golang algorithmic cryptocurrency trading bot that runs on centralized exchanges and Stellar DEX (command-line usage and desktop GUI).
  • marketstore - DataFrame Server for Financial Timeseries Data.

CPP

  • TradeFrame - C++ 17 based framework/library (with sample applications) for testing options based automated trading ideas using DTN IQ real time data feed and Interactive Brokers (TWS API) for trade execution. Comes with built-in Option Greeks/IV calculation library.

Frameworks

CSharp

  • QuantConnect - Lean Engine is an open-source fully managed C# algorithmic trading engine built for desktop and cloud usage.
  • StockSharp - Algorithmic trading and quantitative trading open source platform to develop trading robots (stock markets, forex, crypto, bitcoins, and options).
  • TDAmeritrade.DotNetCore - Free, open-source .NET Client for the TD Ameritrade Trading Platform. Helps developers integrate TD Ameritrade API into custom trading solutions.

Rust

  • QuantMath - Financial maths library for risk-neutral pricing and risk

Reproducing Works

  • Derman Papers - Notebooks that replicate original quantitative finance papers from Emanuel Derman.
  • volatility-trading - A complete set of volatility estimators based on Euan Sinclair's Volatility Trading.
  • quant - Quantitative Finance and Algorithmic Trading exhaust; mostly ipython notebooks based on Quantopian, Zipline, or Pandas.
  • fecon235 - Open source project for software tools in financial economics. Many Jupyter notebooks to verify theoretical ideas and practical methods interactively.
  • Quantitative-Notebooks - Educational notebooks on quantitative finance, algorithmic trading, financial modelling and investment strategy
  • QuantEcon - Lecture series on economics, finance, econometrics and data science; QuantEcon.py, QuantEcon.jl, notebooks
  • FinanceHub - Resources for Quantitative Finance
  • Python_Option_Pricing - A library to price financial options written in Python. Includes: Black Scholes, Black 76, Implied Volatility, American, European, Asian, Spread Options.
  • python-training - J.P. Morgan's Python training for business analysts and traders.
  • Stock_Analysis_For_Quant - Different Types of Stock Analysis in Excel, Matlab, Power BI, Python, R, and Tableau.
  • algorithmic-trading-with-python - Source code for Algorithmic Trading with Python (2020) by Chris Conlan.
  • MEDIUM_NoteBook - Repository containing notebooks of cerlymarco's posts on Medium.

Credit Risk Modeling

Contents

Introduction

Credit Scoring

  • Benchmarking state-of-the-art classification algorithms for credit scoring: An update of research - There have been several advancements in scorecard development, including novel learning methods, performance measures and techniques to reliably compare different classifiers, which the credit scoring literature does not reflect. This paper compares several novel classification algorithms to the state-of-the-art in credit scoring. In addition, the extent to which the assessment of alternative scorecards differs across established and novel indicators of predictive accuracy is examined.

  • Classification methods applied to credit scoring: Systematic review and overall comparison - The need for controlling and effectively managing credit risk has led financial institutions to excel in improving techniques designed for this purpose, resulting in the development of various quantitative models by financial institutions and consulting companies. Hence, the growing number of academic studies about credit scoring shows a variety of classification methods applied to discriminate good and bad borrowers. This paper aims to present a systematic literature review relating theory and application of binary classification techniques for credit scoring financial analysis. The general results show the use and importance of the main techniques for credit rating, as well as some of the scientific paradigm changes throughout the years.

  • Classifier Technology and the Illusion of Progress - A great many tools have been developed for supervised classification, ranging from early methods such as linear discriminant analysis through to modern developments such as neural networks and support vector machines. A large number of comparative studies have been conducted in attempts to establish the relative superiority of these methods. This paper argues that these comparisons often fail to take into account important aspects of real problems, so that the apparent superiority of more sophisticated methods may be something of an illusion. In particular, simple methods typically yield performance almost as good as more sophisticated methods, to the extent that the difference in performance may be swamped by other sources of uncertainty that generally are not considered in the classical supervised classification paradigm.

  • Financial credit risk assessment: a recent review - Summarizes the traditional statistical models and state-of-the-art intelligent methods for financial distress forecasting, with emphasis on the most recent achievements.

  • Good practice in retail credit scorecard assessment - In retail banking, predictive statistical models called ‘scorecards’ are used to assign customers to classes, and hence to appropriate actions or interventions. Such assignments are made on the basis of whether a customer's predicted score is above or below a given threshold. The predictive power of such scorecards gradually deteriorates over time, so that performance needs to be monitored. Common performance measures used in the retail banking sector include the Gini coefficient, the Kolmogorov–Smirnov statistic, the mean difference, and the information value. However, all of these measures use irrelevant information about the magnitude of scores, and fail to use crucial information relating to numbers misclassified. The result is that such measures can sometimes be seriously misleading, resulting in poor quality decisions being made, and mistaken actions being taken.

  • A literature review on the application of evolutionary computing to credit scoring - The aim of this paper is to summarize the most recent developments in the application of evolutionary algorithms to credit scoring by means of a thorough review of scientific articles published during the period 2000–2012.

  • Machine learning predictivity applied to consumer creditworthiness - Analyzes the adequacy of borrower classification models using a Brazilian bank’s loan database, exploring machine learning techniques and comparing their predictive accuracy with a benchmark based on a Logistic Regression model. Comparisons are based on the usual classification performance metrics.

  • Consumer credit-risk models via machine-learning algorithms - The authors apply machine-learning techniques to construct nonlinear nonparametric forecasting models of consumer credit risk. They are able to construct out-of-sample forecasts that significantly improve the classification rates of credit-card-holder delinquencies and defaults.

  • Example-Dependent Cost-Sensitive Logistic Regression for Credit Scoring - Several real-world classification problems are example-dependent cost-sensitive in nature, where the costs due to misclassification vary between examples. Credit scoring is a typical example of cost-sensitive classification. However, it is usually treated using methods that do not take into account the real financial costs associated with the lending business.

  • Credit scoring using the clustered support vector machine - Introduces the use of the clustered support vector machine (CSVM) for credit scorecard development. This recently designed algorithm addresses some of the limitations associated with traditional nonlinear support vector machine (SVM) based methods for classification. Specifically, it is well known that as historical credit scoring datasets get large, these nonlinear approaches, while highly accurate, become computationally expensive. The CSVM can achieve comparable levels of classification performance while remaining relatively cheap computationally.

  • A comparative study on base classifiers in ensemble methods for credit scoring - In recent years, applying artificial intelligence methods to credit risk assessment has improved on classic methods. Recent work shows that ensembles of classifiers achieve better results for this kind of task.

  • Multiple classifier application to credit risk assessment - (Corrigendum) - This paper explores the predicted behaviour of five classifiers for different types of noise in terms of credit risk prediction accuracy, and how such accuracy could be improved by using classifier ensembles.

  • Recent developments in consumer credit risk assessment - The riskiness of lending to a credit applicant is usually estimated using a logistic regression model. Researchers have considered many other types of classifier, but data quality issues may prevent these laboratory-based results from being achieved in practice. Training a classifier on a sample of accepted applicants, rather than on a sample representative of the applicant population, seems not to result in bias, though it does make setting the cut-off difficult.

  • A survey of credit and behavioural scoring: forecasting financial risk of lending to consumers - Surveys the techniques used — both statistical and operational research based — to help organisations decide whether or not to grant credit to consumers. It also discusses the need to incorporate economic conditions into the scoring systems and the way the systems could change from estimating the probability of a consumer defaulting to estimating the profit a consumer will bring to the lending organisation.

  • The comparisons of data mining techniques for the predictive accuracy of probability of default of credit card clients - This research compares the predictive accuracy of probability of default among six data mining methods. From the perspective of risk management, the result of predictive accuracy of the estimated probability of default will be more valuable than the binary result of classification.

  • Super-App Behavioral Patterns in Credit Risk Models: Financial, Statistical and Regulatory Implications - Presents the impact of alternative data that originates from an app-based marketplace, in contrast to traditional bureau data, upon credit scoring models. These alternative data sources have shown themselves to be immensely powerful in predicting borrower behavior in segments traditionally underserved by banks and financial institutions. At the same time alternative data must be carefully validated to overcome regulatory hurdles across diverse jurisdictions.
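Two of the scorecard performance measures named in the list above, the Gini coefficient and the Kolmogorov–Smirnov statistic, can be computed directly from scores and outcomes. The following is a minimal sketch, not taken from any of the listed papers. It assumes that a higher score means higher predicted default risk and that label 1 marks a defaulter; the function names and the toy data are illustrative only.

```python
def auc(scores, labels):
    """Share of (bad, good) pairs where the bad scores higher; ties count half."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    wins = 0.0
    for b in bads:
        for g in goods:
            if b > g:
                wins += 1.0
            elif b == g:
                wins += 0.5
    return wins / (len(bads) * len(goods))


def gini(scores, labels):
    """Gini coefficient expressed through the AUC: Gini = 2 * AUC - 1."""
    return 2.0 * auc(scores, labels) - 1.0


def ks_statistic(scores, labels):
    """Largest gap between the empirical score CDFs of goods and bads."""
    bads = [s for s, y in zip(scores, labels) if y == 1]
    goods = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for t in sorted(set(scores)):
        cdf_bad = sum(s <= t for s in bads) / len(bads)
        cdf_good = sum(s <= t for s in goods) / len(goods)
        best = max(best, abs(cdf_good - cdf_bad))
    return best


# Toy data (assumed): predicted default risk and observed outcome (1 = default).
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

print(gini(scores, labels))          # 0.875
print(ks_statistic(scores, labels))  # 0.75
```

Both measures depend only on how scores rank goods against bads, not on how many accounts a given cut-off misclassifies, which is the limitation raised in "Good practice in retail credit scorecard assessment".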

Institutional Credit Risk

  • Availability of Credit to Small Businesses - Section 2227 of the Economic Growth and Regulatory Paperwork Reduction Act of 1996 requires that, every five years, the Board of Governors of the Federal Reserve System submit a report to the Congress detailing the extent of small business lending by all creditors. The most recent one is dated September, 2017.

  • Credit Scoring and the Availability, Price, and Risk of Small Business Credit - Finds that small business credit scoring is associated with expanded quantities, higher average prices, and greater average risk levels for small business credits under $100,000, after controlling for bank size and other differences across banks.

  • Credit Risk Assessment Using Statistical and Machine Learning: Basic Methodology and Risk Modeling Applications - An important ingredient to accomplish the goal of a more efficient use of resources through risk modeling is to find accurate predictors of individual risk in the credit portfolios of institutions. In this context the authors make a comparative analysis of different statistical and machine learning modeling methods of classification on a mortgage loan dataset with the motivation to understand their limitations and potential.

  • Random Survival Forests Models for SME Credit Risk Measurement - Extends the existing empirical literature on credit default risk for Small and Medium Enterprises (SMEs), proposing a non-parametric approach based on Random Survival Forests (RSF) and comparing its performance with a standard logit model.

  • Modeling Institutional Credit Risk with Financial News - Current work in downgrade risk modeling depends on multiple variations of quantitative measures provided by third-party rating agencies and risk management consultancy companies. There has been a wide push into using alternative sources of data, such as financial news, earnings call transcripts, or social media content, to possibly gain a competitive edge in the industry. This paper proposes a predictive downgrade model using solely news data represented by neural network embeddings.

  • Bankruptcy prediction for credit risk using neural networks: A survey and new results - The prediction of corporate bankruptcies is an important and widely studied topic since it can have significant impact on bank lending decisions and profitability. This work reviews the topic of bankruptcy prediction, with emphasis on neural-network (NN) models and develops an NN bankruptcy prediction model, proposing novel indicators for the NN system.

Peer-to-Peer Lending

  • Network based credit risk models - Peer-to-Peer lending platforms may lead to cost reduction, and to an improved user experience. These improvements may come at the price of inaccurate credit risk measurements. The authors propose to augment traditional credit scoring methods with “alternative data” that consist of centrality measures derived from similarity networks among borrowers, deduced from their financial ratios.
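The idea of deriving centrality measures from a similarity network among borrowers can be illustrated with a small sketch. This is not the paper's construction: the Euclidean distance, the `max_distance` threshold, the choice of degree centrality, and the borrower data are all assumptions made for illustration.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two vectors of financial ratios."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def degree_centrality(ratios, max_distance):
    """Link borrowers whose ratio vectors lie within max_distance of each other,
    then return each borrower's degree divided by the maximum possible degree."""
    names = list(ratios)
    n = len(names)
    degree = {name: 0 for name in names}
    for i in range(n):
        for j in range(i + 1, n):
            if euclidean(ratios[names[i]], ratios[names[j]]) <= max_distance:
                degree[names[i]] += 1
                degree[names[j]] += 1
    return {name: d / (n - 1) for name, d in degree.items()}


# Hypothetical borrowers described by two financial ratios,
# e.g. (leverage, current ratio).
ratios = {
    "A": (0.10, 1.5),
    "B": (0.12, 1.6),
    "C": (0.11, 1.4),
    "D": (0.60, 0.3),
}

centrality = degree_centrality(ratios, max_distance=0.25)
print(centrality)  # A, B and C are mutually linked; D is isolated
```

The resulting centrality values could then be appended as extra columns to a conventional scoring dataset. In practice the ratios would normally be standardized first so that no single ratio dominates the distance.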

Sample Selection

Feature Selection

  • A multi-objective approach for profit-driven feature selection in credit scoring - In credit scoring, feature selection aims at removing irrelevant data to improve the performance and interpretability of the scorecard. Standard techniques treat feature selection as a single-objective task and rely on statistical criteria such as correlation. Recent studies suggest that using profit-based indicators may improve the quality of scoring models for businesses.

  • Data mining feature selection for credit scoring models - The features used may have an important effect on the performance of credit scoring models. The process of choosing the best set of features for credit scoring models is usually unsystematic and dominated by somewhat arbitrary trial. This paper presents an empirical study of four machine learning feature selection methods.

  • Combination of feature selection approaches with SVM in credit scoring - An effective classificatory model in credit scoring will objectively help managers who rely on intuitive experience. This study proposes four approaches using the SVM (support vector machine) classifier for feature selection that retain sufficient information for classification purposes.
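As a point of reference for the papers above, the "standard" single-objective approach based on a statistical criterion such as correlation can be sketched as a simple filter. This is a generic illustration and not the method of any listed paper; the feature names, the data, and the `min_abs_corr` threshold are assumptions.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_by_correlation(features, target, min_abs_corr):
    """Keep features whose absolute correlation with the target meets the
    threshold, ordered from strongest to weakest."""
    ranked = sorted(
        ((abs(pearson(values, target)), name) for name, values in features.items()),
        reverse=True,
    )
    return [name for corr, name in ranked if corr >= min_abs_corr]


# Toy data (assumed): 1 = default, 0 = repaid.
target = [0, 0, 1, 1, 0, 1]
features = {
    "utilization": [0.2, 0.3, 0.9, 0.8, 0.1, 0.7],
    "income": [50, 60, 30, 20, 55, 35],
    "last_id_digit": [3, 7, 3, 7, 5, 5],  # deliberately uninformative
}

selected = select_by_correlation(features, target, min_abs_corr=0.5)
print(selected)  # ['utilization', 'income']
```

A filter like this scores each feature on its own and ignores both feature interactions and the profit impact of the resulting scorecard, which is the gap the multi-objective, profit-driven approach aims to address.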

Model Explainability

  • Explainable Machine Learning in Credit Risk Management - Proposes an explainable AI model that can be used in credit risk management and, in particular, in measuring the risks that arise when credit is borrowed employing credit scoring platforms.

  • Machine learning explainability in finance: an application to default risk analysis - This Staff Working Paper from the Bank of England proposes a framework for addressing the ‘black box’ problem present in some Machine Learning (ML) applications.

  • Regulatory learning: How to supervise machine learning models? An application to credit scoring - The arrival of Big Data strategies is threatening the latest trends in financial regulation related to the simplification of models and the enhancement of the comparability of approaches chosen by financial institutions. Indeed, the intrinsically dynamic philosophy of Big Data strategies is almost incompatible with the current legal and regulatory framework, as illustrated in this paper. In addition, model selection may evolve dynamically, forcing both practitioners and regulators to develop libraries of models, strategies for switching from one model to another, and supervisory approaches that allow financial institutions to innovate in a risk-mitigated environment.