Alternating logistic regression is a collaborative filtering method for predicting occurrence probability given binary observations (e.g. click-through rate).
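As a rough illustration of the idea, the sketch below factorizes a binary user-item matrix so that the predicted probability is the sigmoid of a dot product between a user vector and an item vector. It alternates between refitting user vectors with item vectors held fixed and the reverse, where each refit is a small logistic regression. This is a minimal single-machine sketch under assumptions, not the code in SparkALR.scala: the names AlrSketch, localFit, train and predict, the plain gradient-descent solver, and the step size, step count and regularization values are all hypothetical choices made for the example.

```scala
object AlrSketch {
  type Vec = Array[Double]

  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  def dot(a: Vec, b: Vec): Double = {
    var s = 0.0
    var k = 0
    while (k < a.length) { s += a(k) * b(k); k += 1 }
    s
  }

  // Hypothetical helper: fit one factor vector by gradient descent on the
  // L2-regularized logistic loss. Each example pairs a fixed feature vector
  // (the factor from the other side) with a 0/1 label.
  def localFit(init: Vec, examples: Seq[(Vec, Double)],
               steps: Int, lr: Double, reg: Double): Vec = {
    val w = init.clone()
    for (_ <- 0 until steps) {
      val grad = w.map(_ * reg)                  // gradient of the L2 penalty
      for ((x, y) <- examples) {
        val err = sigmoid(dot(w, x)) - y         // prediction minus label
        for (k <- w.indices) grad(k) += err * x(k)
      }
      for (k <- w.indices) w(k) -= lr * grad(k)
    }
    w
  }

  // obs holds (user, item, label) triples with label 0.0 or 1.0.
  def train(obs: Seq[(Int, Int, Double)], numUsers: Int, numItems: Int,
            rank: Int, rounds: Int, seed: Long = 42L): (Array[Vec], Array[Vec]) = {
    val rng = new scala.util.Random(seed)
    def init(n: Int): Array[Vec] =
      Array.fill(n)(Array.fill(rank)(rng.nextGaussian() * 0.1))
    val users = init(numUsers)
    val items = init(numItems)
    val byUser = obs.groupBy(_._1)
    val byItem = obs.groupBy(_._2)
    for (_ <- 0 until rounds) {
      // Item factors fixed: one logistic regression per user.
      for ((u, rows) <- byUser)
        users(u) = localFit(users(u), rows.map { case (_, i, y) => (items(i), y) }, 20, 0.5, 0.01)
      // User factors fixed: one logistic regression per item.
      for ((i, rows) <- byItem)
        items(i) = localFit(items(i), rows.map { case (u, _, y) => (users(u), y) }, 20, 0.5, 0.01)
    }
    (users, items)
  }

  def predict(users: Array[Vec], items: Array[Vec], u: Int, i: Int): Double =
    sigmoid(dot(users(u), items(i)))
}
```

Calling AlrSketch.train on a list of (user, item, label) triples returns the two factor arrays, and AlrSketch.predict turns a user-item pair into a probability between 0 and 1. Because each user (or item) refit depends only on the fixed factors from the other side, the per-user and per-item fits are independent of one another within a half-round.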
To compile:
sbt/sbt assembly
To run with 4GB of RAM:
./bin/spark-submit --executor-memory 4G --driver-memory 4G --class \
./examples/target/scala-2.10/spark-examples-1.6.2-SNAPSHOT-hadoop2.2.0.jar
The implementation is in SparkALR.scala, except for the localTrain method of LogisticRegression(), which is in LogisiticRegression.scala.
Sample data is included in data/mllib/