This repository builds on the original neural random forest code. Many thanks to the original author!
Relevant paper:
Building on it, we add a LightGBM model to the framework and improve the performance considerably.
This code is written for Python 3 and uses TensorFlow 1.3.0.
First, let's make sure you have all packages needed:
pip3 install -r requirements.txt
Note that the newest version of LightGBM, built from the GitHub source code, is required; for now it cannot be installed with pip.
For a quick start, let's download the mpg dataset from the UCI Machine Learning Repository (30KB):
cd datasets/data/mpg_data
sh download.sh
To run different Neural Random Forest models on the mpg dataset, execute this (takes ~2min) from the repository root directory:
python3 main.py mpg <randomforest or lightgbm>
To run the model on a new dataset, you must write a data loader function and add an option for it to data_loader.py.
For inspiration, check out the data loaders in preprocessing/, which are for the other datasets used in the paper.
The data loader functions all return a pair (X, Y), where X is an input matrix of size [# samples, # features], and Y is a vector of regression outputs with size [# samples].
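As a rough illustration of the (X, Y) contract described above, here is a minimal sketch of a data loader. It is not taken from this repository: the function name load_my_dataset, the target_column parameter, and the assumption that the data is a purely numeric, comma-separated file with no header row are all hypothetical. Adapt it to your own file format, and register it in data_loader.py the same way the existing loaders are registered.

```python
import numpy as np


def load_my_dataset(source, target_column=0, delimiter=","):
    """Hypothetical loader: read a numeric CSV and split it into (X, Y).

    source        -- a file path or an open file-like object (assumption)
    target_column -- index of the column holding the regression target (assumption)
    """
    # ndmin=2 keeps the result two-dimensional even for a single-row file.
    data = np.loadtxt(source, delimiter=delimiter, ndmin=2)

    # Y: vector of regression outputs with shape [# samples].
    Y = data[:, target_column]

    # X: every remaining column, with shape [# samples, # features].
    X = np.delete(data, target_column, axis=1)

    return X, Y
```

For example, calling load_my_dataset("my_data.csv") on a file with 100 rows and 8 columns would return an X of shape (100, 7) and a Y of shape (100,). Real datasets often need extra steps such as handling missing values or categorical columns; the loaders in preprocessing/ are the better guide for those cases.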