Predict the country in which a new user will make their first booking.
Link to the Kaggle challenge: https://www.kaggle.com/c/airbnb-recruiting-new-user-bookings
Install Anaconda or Miniconda, then create the environment with conda.
$ conda env create -f environment.yml
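Download the Airbnb data from the Kaggle challenge page and place it in the data directory.
Create the data and results directories first, i.e.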
$ mkdir data results
Run the following script to generate the processed train and test files.
$ python preprocess.py
Use the -d flag to change your data directory (default: 'data').
To install lightgbm and xgboost, see the lightgbm install guide and xgboost install guide.
$ python train.py -h
usage: train.py [-h] [-d DATA_DIR] [-r RESULTS_DIR]
                [-m {all,logistic,tree,forest,ada,xgb}]
                [-s {random,smote,adasyn,smoteenn,none}] [-k K_FOLDS]
                [--device {cpu,gpu}]

Airbnb New User Booking Classification

optional arguments:
  -h, --help            show this help message and exit
  -d DATA_DIR           data directory
  -r RESULTS_DIR        results save directory
  -m {all,logistic,tree,forest,ada,xgb}
                        model
  -s {random,smote,adasyn,smoteenn,none}
                        sampling method
  -k K_FOLDS            number of CV folds
  --device {cpu,gpu}    device
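The help text above can be reproduced with a small argparse setup. The following is a minimal sketch reconstructed from that output, not the actual contents of train.py. The attribute names (`data_dir`, `model`, and so on) and every default except the 'data' directory are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the train.py help text (a reconstruction)."""
    parser = argparse.ArgumentParser(
        prog="train.py",
        description="Airbnb New User Booking Classification",
    )
    # 'data' is the documented default; the remaining defaults are assumed.
    parser.add_argument("-d", dest="data_dir", default="data",
                        help="data directory")
    parser.add_argument("-r", dest="results_dir", default="results",
                        help="results save directory")
    parser.add_argument("-m", dest="model", default="all",
                        choices=["all", "logistic", "tree", "forest", "ada", "xgb"],
                        help="model")
    parser.add_argument("-s", dest="sampling", default="none",
                        choices=["random", "smote", "adasyn", "smoteenn", "none"],
                        help="sampling method")
    parser.add_argument("-k", dest="k_folds", type=int, default=5,
                        help="number of CV folds")
    parser.add_argument("--device", default="cpu", choices=["cpu", "gpu"],
                        help="device")
    return parser


# Equivalent to: python train.py -m xgb -s smote -k 3
args = build_parser().parse_args(["-m", "xgb", "-s", "smote", "-k", "3"])
print(args.model, args.sampling, args.k_folds, args.data_dir)
```

Because `-m`, `-s`, and `--device` use `choices`, argparse rejects any value outside the listed sets before training starts.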
To use lightgbm, run:
$ python lightgbm_train.py
Cross-validation output is saved to the results directory as pickle files.
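To inspect the saved output by hand, the pickle files can be loaded directly. This is a minimal sketch that assumes the files use a `.pkl` extension and sit at the top level of the results directory. The helper name `load_results` is hypothetical, and the structure of each loaded object depends on what the training script stored.

```python
import pickle
from pathlib import Path


def load_results(results_dir="results", pattern="*.pkl"):
    """Load every matching pickle file into a dict keyed by file name (without extension)."""
    results = {}
    for path in sorted(Path(results_dir).glob(pattern)):
        # Only unpickle files you created yourself; pickle can execute code on load.
        with path.open("rb") as handle:
            results[path.stem] = pickle.load(handle)
    return results
```

Calling `load_results("results")` returns one entry per file, which can then be printed or compared across models.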
To print a summary, run the following:
$ python summarize.py