Hollin Wilkins edited this page Dec 29, 2016 · 55 revisions

MLeap For Spark

MLeap lets you deploy Spark ML (and some MLlib) transformers and pipelines to production without a SparkContext.

MLeap For Scikit-Learn

MLeap extends scikit-learn so that you can serialize and deploy scikit-learn transformers, pipelines, and feature unions without any dependency on scikit-learn or its stack (NumPy, SciPy, C++ libraries). It also serializes those transformers and pipelines in a Spark-compatible form, so you can load and deploy your scikit-learn pipelines on Spark infrastructure with a few lines of code.
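The idea of running a fitted pipeline without its training library can be sketched in a few lines of plain Python. This is only an illustration of the concept, not MLeap's bundle format or API: the `export_pipeline` and `load_pipeline` functions, the JSON layout, and the hard-coded parameters are all hypothetical, and in practice the parameters would come from fitted scikit-learn objects.

```python
import json

# Hypothetical export step: write each fitted step as plain data.
# The parameter values are made up; a real export would read them
# from fitted transformers (for example a scaler's mean and std).
def export_pipeline(steps):
    """Serialize a list of (op_name, params) pairs to a JSON string."""
    return json.dumps(
        {"steps": [{"op": op, "params": params} for op, params in steps]}
    )

# Runtime side: executing the exported pipeline needs only the
# standard library, not the library the pipeline was fitted with.
OPS = {
    "standard_scaler": lambda row, p: [
        (v - m) / s for v, m, s in zip(row, p["mean"], p["std"])
    ],
    "binarizer": lambda row, p: [
        1.0 if v > p["threshold"] else 0.0 for v in row
    ],
}

def load_pipeline(text):
    """Rebuild a row-level transform function from the JSON description."""
    steps = json.loads(text)["steps"]

    def transform(row):
        for step in steps:
            row = OPS[step["op"]](row, step["params"])
        return row

    return transform

bundle_text = export_pipeline([
    ("standard_scaler", {"mean": [10.0, 0.5], "std": [2.0, 0.25]}),
    ("binarizer", {"threshold": 0.0}),
])
transform = load_pipeline(bundle_text)
print(transform([13.0, 0.25]))  # scaled to [1.5, -1.0], then binarized
```

Because the exported description is plain data, any runtime that implements the same operations could execute it, which is the property that lets a pipeline fitted in one framework be served from another.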

Tutorials

Demos

Supported Transformers

Features

Transformer Spark Scikit-Learn TensorFlow
Binarizer x x
BucketedRandomProjectionLSH
Bucketizer x
ChiSqSelector x
CountVectorizer x
DCT x
ElementwiseProduct x x
HashingTermFrequency x x
IDF x
Imputer x x
Interaction x x
MaxAbsScaler x
MinHashLSH
MinMaxScaler x x
Ngram x
Normalizer x
OneHotEncoder x x
PCA x x
QuantileDiscretizer x
PolynomialExpansion x x
ReverseStringIndexer x x
StandardScaler x x
StopWordsRemover x
StringIndexer x x
Tokenizer x x
VectorAssembler x x
VectorIndexer
VectorSlicer
WordToVector x

Classification

Transformer Spark Scikit-Learn TensorFlow
DecisionTreeClassifier x x
GradientBoostedTreeClassifier x
LogisticRegression x x
LogisticRegressionCv x x
NaiveBayesClassifier x
OneVsRest x
RandomForestClassifier x x
SupportVectorMachines x x
MultiLayerPerceptron x

Regression

Transformer Spark Scikit-Learn TensorFlow
AFTSurvivalRegression x
DecisionTreeRegression x x
GeneralizedLinearRegression x
GradientBoostedTreeRegression x
IsotonicRegression x
LinearRegression x x
RandomForestRegression x x

Clustering

Transformer Spark Scikit-Learn TensorFlow
BisectingKMeans x
GaussianMixtureModel x
KMeans x
LDA

Extensions

Transformer Spark Scikit-Learn TensorFlow Description
MathUnary x x Simple set of unary mathematical operations
MathBinary x x Simple set of binary mathematical operations

Recommendation

Transformer Spark Scikit-Learn TensorFlow
ALS

Linear Algebra

  • CholeskyDecomposition