cu2rec: CUDA Meets Recommender Systems

cu2rec is a matrix factorization library that accelerates the training of recommender system models on GPUs using CUDA. It implements parallel stochastic gradient descent (SGD) for training the matrix factorization model.
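For readers new to the technique, here is a minimal sequential sketch of one SGD pass for a biased matrix factorization model. It is not the library's CUDA kernel: it runs on the CPU, one rating at a time. The function name, the learning rate, the regularization value, and the presence of a per-user bias are assumptions made for illustration.

```python
import random

def sgd_epoch(ratings, P, Q, user_bias, item_bias, global_bias, lr=0.01, reg=0.02):
    """Run one sequential SGD pass over (user, item, rating) triples.

    Updates P, Q and the bias lists in place and returns the RMSE
    measured on the predictions made during this pass.
    Assumed model: rating ~ global_bias + user_bias[u] + item_bias[i] + P[u] . Q[i]
    """
    sq_err = 0.0
    for u, i, r in ratings:
        pred = global_bias + user_bias[u] + item_bias[i] + sum(
            p * q for p, q in zip(P[u], Q[i])
        )
        err = r - pred
        sq_err += err * err
        # Gradient steps with L2 regularization (hypothetical hyperparameters).
        user_bias[u] += lr * (err - reg * user_bias[u])
        item_bias[i] += lr * (err - reg * item_bias[i])
        for f in range(len(P[u])):
            p_uf, q_if = P[u][f], Q[i][f]
            P[u][f] += lr * (err * q_if - reg * p_uf)
            Q[i][f] += lr * (err * p_uf - reg * q_if)
    return (sq_err / len(ratings)) ** 0.5

# Toy data with 0-based ids, for illustration only.
rng = random.Random(0)
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 2, 1.0), (2, 1, 2.0), (2, 2, 5.0)]
n_users, n_items, n_factors = 3, 3, 2
global_bias = sum(r for _, _, r in ratings) / len(ratings)
P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
user_bias = [0.0] * n_users
item_bias = [0.0] * n_items

history = [
    sgd_epoch(ratings, P, Q, user_bias, item_bias, global_bias, lr=0.05)
    for _ in range(50)
]
print(f"training RMSE: {history[0]:.3f} -> {history[-1]:.3f}")
```

A parallel version applies many such updates concurrently, which is where a GPU helps; how cu2rec schedules those updates is not covered by this sketch.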

Data

The input data should be a CSV file with a header row and rows of the form userId,itemId,rating. If the user ids and item ids are not sequential, run python preprocessing/map_items.py <ratings_file> to convert them into sequential integers starting at 1.
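To make the id mapping concrete, the following sketch remaps arbitrary user and item ids to sequential integers starting at 1. It is not the code of preprocessing/map_items.py: the function name map_ids and the choice to number ids in order of first appearance are assumptions, and the real script may order them differently.

```python
import csv
import io

def map_ids(rows):
    """Map raw user and item ids to sequential integers starting at 1.

    Ids are numbered in order of first appearance (an assumption made
    for this sketch). Ratings are passed through unchanged.
    """
    user_map, item_map = {}, {}
    mapped = []
    for user, item, rating in rows:
        # setdefault keeps an existing id or assigns the next free one.
        u = user_map.setdefault(user, len(user_map) + 1)
        i = item_map.setdefault(item, len(item_map) + 1)
        mapped.append((u, i, rating))
    return mapped, user_map, item_map

# In-memory stand-in for a ratings file with a header row.
raw = "userId,itemId,rating\n42,900,4.0\n7,900,3.5\n42,15,5.0\n"
reader = csv.reader(io.StringIO(raw))
header = next(reader)  # skip the header row
mapped, user_map, item_map = map_ids(reader)
for row in mapped:
    print(row)
```

Keeping the two dictionaries around makes it possible to translate recommended item ids back to the original ones later.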

Once you have a mapped CSV, you can use python preprocessing/split_to_test_train.py <mapped_file> <test_ratio> to split the data into training and test sets for use with mf.cu.
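As an illustration of what a test ratio means here, this sketch shuffles the ratings and holds out a fraction of them as the test set. It is a plain random split under assumed names (split_ratings, seed); the actual split_to_test_train.py may use a different strategy, for example a per-user split.

```python
import random

def split_ratings(rows, test_ratio, seed=0):
    """Randomly split rating rows into (train, test) lists.

    test_ratio is the fraction of rows placed in the test set.
    The seed parameter is a hypothetical addition for reproducibility.
    """
    rng = random.Random(seed)
    shuffled = list(rows)  # copy so the caller's list is not reordered
    rng.shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Ten toy ratings as (userId, itemId, rating) tuples.
rows = [(u, i, float((u + i) % 5 + 1)) for u in range(1, 6) for i in range(1, 3)]
train, test = split_ratings(rows, test_ratio=0.2)
print(len(train), "training rows,", len(test), "test rows")
```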

Alternatively, you can also use the datasets below:

MovieLens

Download the MovieLens data here and save it in the data folder.
Run python preprocessing/map_items.py <ratings_file> to create a user-item mapped ratings file.
Run python preprocessing/split_to_test_train.py <mapped_file> <test_ratio> to split it into training and test files.

Netflix

Download the Netflix dataset here and place it under data/datasets/netflix.
Run python preprocessing/map_netflix.py to create the mapped training and test files.

Compiling Code

SSH into Prince or cuda2 using NYU credentials
srun -t5:00:00 --mem=30000 --gres=gpu:1 --pty /bin/bash
module load cuda/9.2.88
cd matrix_factorization && make

The makefile compiles for compute capability 5.2. If your GPU does not support it, change the makefile to target your device's compute capability. The code has been tested on compute capabilities as low as 3.5.
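When retargeting the build, it can help to see how a compute capability translates into an nvcc architecture option. The helper below is a small sketch that builds the standard -gencode argument from a version string such as "5.2". The function name is hypothetical, and how this project's makefile actually spells its architecture setting is not shown here, so check the makefile before editing it.

```python
def gencode_flag(compute_capability):
    """Build an nvcc -gencode argument from a 'major.minor' string.

    Example: "5.2" targets virtual architecture compute_52 and
    real architecture sm_52.
    """
    major, minor = compute_capability.split(".")
    arch = f"{int(major)}{int(minor)}"
    return f"-gencode arch=compute_{arch},code=sm_{arch}"

# The default target mentioned above, and the lowest tested one.
print(gencode_flag("5.2"))
print(gencode_flag("3.5"))
```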

Training

make mf
bin/mf -c <config_file> <ratings_file_train> <ratings_file_test>

Running all possible configurations

To run all of the experiments mentioned in the report, cd experiments and run the included bash scripts. cu2rec.sh gives you the total runtimes and error metrics for all configurations, while cu2rec_prof.sh gives you all the nvprof results. Make sure you have all the data described in the Data section.

Getting recommendations for a user

Make sure the user's data is in the same ratings format as MovieLens.
make predict
bin/predict -c <config_file> -i <trained_item_bias_file> -g <trained_global_bias_file> -q <trained_Q_file> <ratings_file>
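The inputs to the command above (a trained item matrix Q, item biases, a global bias, and one user's ratings) suggest fitting parameters for the new user while keeping the item side fixed, then ranking unrated items. The sketch below illustrates that idea only; it is an assumption about the approach, not the code of bin/predict. The function names, hyperparameters, per-user bias, and 0-based item indices are all hypothetical.

```python
import random

def fit_user(user_ratings, Q, item_bias, global_bias,
             lr=0.05, reg=0.02, epochs=200, seed=0):
    """Fit one user's factor vector and bias by SGD with item parameters fixed."""
    rng = random.Random(seed)
    n_factors = len(Q[0])
    p = [rng.uniform(-0.1, 0.1) for _ in range(n_factors)]
    b_u = 0.0
    for _ in range(epochs):
        for i, r in user_ratings:
            pred = global_bias + b_u + item_bias[i] + sum(
                a * b for a, b in zip(p, Q[i])
            )
            err = r - pred
            # Only the user's own parameters are updated.
            b_u += lr * (err - reg * b_u)
            for f in range(n_factors):
                p[f] += lr * (err * Q[i][f] - reg * p[f])
    return p, b_u

def recommend(user_ratings, Q, item_bias, global_bias, top_n=2):
    """Return the indices of the top_n unrated items by predicted rating."""
    p, b_u = fit_user(user_ratings, Q, item_bias, global_bias)
    seen = {i for i, _ in user_ratings}
    scores = [
        (global_bias + b_u + item_bias[i] + sum(a * b for a, b in zip(p, Q[i])), i)
        for i in range(len(Q))
        if i not in seen
    ]
    scores.sort(reverse=True)
    return [i for _, i in scores[:top_n]]

# Toy item factors: items 0 and 1 share one trait, items 2 and 3 another.
Q = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
item_bias = [0.0, 0.0, 0.0, 0.0]
liked_first_kind = [(0, 5.0), (2, 1.0)]  # (item index, rating)
print(recommend(liked_first_kind, Q, item_bias, global_bias=3.0, top_n=1))
```

Because the source data uses ids starting at 1, a real pipeline would need to shift or map ids before indexing into Q like this.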

Running Tests

cd tests
make
To run all tests, use make run_all
To run a single test instead, use bin/test_{}

Authors

Nick Greenquist
Doruk Kilitcioglu
