cpmf: Collection of Parallel Matrix Factorization

Prerequisite

required

picojson is required to parse config.json.

$ git clone https://github.com/kazuho/picojson.git vendor/picojson

optional

To use MassiveThreads as the task-parallel library, install it with the following commands.

$ git clone https://github.com/massivethreads/massivethreads.git vendor/massivethreads
$ cd vendor/massivethreads
$ ./configure --prefix=/usr/local
$ make && make install

If you install to a prefix other than /usr/local, be sure to change MYTH_PATH in the Makefile to match.

Converting MovieLens data

Use scripts/convert_movielens.py to convert MovieLens data to the cpmf format.

To convert the MovieLens 100K dataset:

$ python scripts/convert_movielens.py PATH/ml-100k/u.data > input/ml-100k

To convert the MovieLens 1M dataset:

$ python scripts/convert_movielens.py PATH/ml-1m/ratings.dat --separator :: > input/ml-1m

To convert the MovieLens 10M dataset:

$ python scripts/convert_movielens.py PATH/ml-10M100K/ratings.dat --separator :: > input/ml-10m
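The commands above differ only in the field separator: u.data in the 100K dataset is tab-separated, while ratings.dat in the 1M and 10M datasets uses "::". The following is a minimal sketch of that parsing step, not the contents of scripts/convert_movielens.py. The function names and the "user item rating" output layout are assumptions made for illustration; the format cpmf actually expects is not described here.

```python
def parse_movielens_line(line, separator=None):
    """Split one MovieLens rating line into (user, item, rating).

    separator=None splits on any whitespace, which covers the
    tab-separated u.data of the 100K dataset; pass "::" for the
    ratings.dat files of the 1M and 10M datasets.
    """
    fields = line.strip().split(separator)
    user, item, rating = fields[0], fields[1], fields[2]
    # The fourth field is a timestamp; this sketch drops it.
    return int(user), int(item), float(rating)


def convert(lines, separator=None):
    """Yield one output line per non-empty input line.

    The "user item rating" layout is an assumption for illustration,
    not a description of the format cpmf actually reads.
    """
    for line in lines:
        if line.strip():
            user, item, rating = parse_movielens_line(line, separator)
            yield f"{user} {item} {rating}"
```

Ratings are parsed as floats because the 10M dataset contains half-star ratings.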

Parallel methods

Select the parallel method by setting DPARALLEL in the Makefile.

FPSGD

In FPSGD, the rating matrix is divided into many blocks, and threads work concurrently only on blocks that share no row or column with one another.

To use FPSGD, specify DPARALLEL = -DFPSGD.
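The block-independence rule can be sketched as follows. This is an illustration of the idea only, not cpmf's scheduler: the function names, the 0-based user and item indices, and the even grid split are all assumptions.

```python
def block_of(user, item, n_users, n_items, grid):
    """Map a rating's (user, item) to its (block_row, block_col).

    Assumes 0-based indices and a grid x grid partition of the
    rating matrix into roughly equal ranges.
    """
    return user * grid // n_users, item * grid // n_items


def free_blocks(grid, busy):
    """List the blocks a newly idle thread could safely take.

    A block is free when it shares no block row and no block column
    with any block in `busy` (the blocks other threads are updating),
    so no two threads touch the same user or item factors.
    """
    busy_rows = {r for r, _ in busy}
    busy_cols = {c for _, c in busy}
    return [(r, c)
            for r in range(grid)
            for c in range(grid)
            if r not in busy_rows and c not in busy_cols]
```

For example, with a 3 x 3 grid and one thread working on block (0, 0), only the four blocks outside row 0 and column 0 remain available.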

Reference

Y. Zhuang, W.-S. Chin, Y.-C. Juan, and C.-J. Lin, "A fast parallel SGD for matrix factorization in shared memory systems", RecSys '13.

dcMF (by Intel Cilk or MassiveThreads)

dcMF is our proposed method for parallelizing matrix factorization: it recursively divides the rating matrix into 4 smaller blocks and dynamically assigns the resulting tasks to threads.

To use dcMF, specify DPARALLEL = -DTP_BASED.

To choose the task-parallel library, set DTP = -DTP_CILK for Intel Cilk or DTP = -DTP_MYTH for MassiveThreads.
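The recursive division can be sketched as below. This is a sequential illustration under stated assumptions, not cpmf's implementation: the function name, the depth limit, and the ordering (the two diagonal quadrants first, then the two off-diagonal ones) are assumptions. The comments mark where a task-parallel runtime such as Cilk or MassiveThreads would spawn tasks and wait for them.

```python
def dc_visit(r0, r1, c0, c1, depth, process):
    """Recursively split the block [r0, r1) x [c0, c1) into 4 quadrants.

    `process` is called on each leaf block as (r0, r1, c0, c1).
    """
    if depth == 0 or r1 - r0 < 2 or c1 - c0 < 2:
        process((r0, r1, c0, c1))
        return
    rm = (r0 + r1) // 2
    cm = (c0 + c1) // 2
    # Phase 1: top-left and bottom-right share no rows or columns,
    # so a task-parallel runtime could spawn them as two tasks.
    dc_visit(r0, rm, c0, cm, depth - 1, process)
    dc_visit(rm, r1, cm, c1, depth - 1, process)
    # A runtime would wait here for both tasks before continuing.
    # Phase 2: top-right and bottom-left are independent of each other.
    dc_visit(r0, rm, cm, c1, depth - 1, process)
    dc_visit(rm, r1, c0, cm, depth - 1, process)


# Usage: collect the leaf blocks of a 4 x 4 matrix split once.
leaves = []
dc_visit(0, 4, 0, 4, 1, leaves.append)
```

Because every spawn creates a task rather than binding work to a fixed thread, the runtime is free to assign the tasks to whichever threads are idle.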

Reference

Y. Nishioka and K. Taura, "Scalable task-parallel SGD on matrix factorization in multicore architectures", IEEE International Parallel and Distributed Processing Symposium Workshop (IPDPSW), 2015.

How to use

Just make and run!

$ make
$ ./mf train config.json
