From a932f51cdabaabebe50b2432f27bb96d4a54a751 Mon Sep 17 00:00:00 2001 From: Vincenzo Selvaggio Date: Sun, 17 May 2015 10:17:42 +0100 Subject: [PATCH 1/7] Create mllib-pmml-model-export.md --- docs/mllib-pmml-model-export.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 docs/mllib-pmml-model-export.md diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md new file mode 100644 index 000000000000..77aa86e645eb --- /dev/null +++ b/docs/mllib-pmml-model-export.md @@ -0,0 +1,25 @@ +--- +layout: global +title: PMML model export - MLlib +displayTitle: MLlib - PMML model export +--- + +MLlib supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)) format. +The table below outlines the MLlib models that can be exported to PMML and their equivalent PMML format. + + + + + + + + + + + + + + + + +
+MLlib model | PMML model
+KMeansModel | ClusteringModel
+LogisticRegressionModel | RegressionModel
+SVMModel | RegressionModel
From 2e298b59f9ef648610bac1ab711d9ec11567df5c Mon Sep 17 00:00:00 2001 From: Vincenzo Selvaggio Date: Sun, 17 May 2015 10:29:15 +0100 Subject: [PATCH 2/7] Update mllib-pmml-model-export.md --- docs/mllib-pmml-model-export.md | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md index 77aa86e645eb..cb9b73af5440 100644 --- a/docs/mllib-pmml-model-export.md +++ b/docs/mllib-pmml-model-export.md @@ -5,6 +5,7 @@ displayTitle: MLlib - PMML model export --- MLlib supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)) format. + The table below outlines the MLlib models that can be exported to PMML and their equivalent PMML format. @@ -14,12 +15,21 @@ The table below outlines the MLlib models that can be exported to PMML and their + + + + + + + + + - + - +
 KMeansModel | ClusteringModel
+LinearRegressionModel | RegressionModel (functionName="regression")
+RidgeRegressionModel | RegressionModel (functionName="regression")
+LassoModel | RegressionModel (functionName="regression")
-LogisticRegressionModel | RegressionModel
+SVMModel | RegressionModel (functionName="classification" normalizationMethod="none")
-SVMModel | RegressionModel
+LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")
From 680dc3369f268461f7f822aefad9e04da98095ec Mon Sep 17 00:00:00 2001
From: Vincenzo Selvaggio
Date: Sun, 17 May 2015 10:58:51 +0100
Subject: [PATCH 3/7] Update mllib-pmml-model-export.md

---
 docs/mllib-pmml-model-export.md | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md
index cb9b73af5440..ff0a0b0cfe6a 100644
--- a/docs/mllib-pmml-model-export.md
+++ b/docs/mllib-pmml-model-export.md
@@ -4,6 +4,11 @@ title: PMML model export - MLlib
 displayTitle: MLlib - PMML model export
 ---

+* Table of contents
+{:toc}
+
+## MLlib supported models
+
 MLlib supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)) format.

 The table below outlines the MLlib models that can be exported to PMML and their equivalent PMML format.
@@ -33,3 +38,8 @@ The table below outlines the MLlib models that can be exported to PMML and their
+
+## Example: exporting KMeansModel
+Same applies to other models...
+
+## Example: exporting a model to String, file ...
From 273137552ce7879456f9c181008eb37f9b78d163 Mon Sep 17 00:00:00 2001
From: Vincenzo Selvaggio
Date: Sun, 17 May 2015 11:13:14 +0100
Subject: [PATCH 4/7] Update mllib-pmml-model-export.md

---
 docs/mllib-pmml-model-export.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md
index ff0a0b0cfe6a..dd96dc1f5cdf 100644
--- a/docs/mllib-pmml-model-export.md
+++ b/docs/mllib-pmml-model-export.md
@@ -34,7 +34,7 @@ The table below outlines the MLlib models that can be exported to PMML and their
 SVMModel | RegressionModel (functionName="classification" normalizationMethod="none")
-LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")
+Binary LogisticRegressionModel | RegressionModel (functionName="classification" normalizationMethod="logit")

From d670662c9c292a265f271e19eef49d4918ac9dda Mon Sep 17 00:00:00 2001
From: Vincenzo Selvaggio
Date: Sun, 17 May 2015 11:15:01 +0100
Subject: [PATCH 5/7] Update mllib-pmml-model-export.md

---
 docs/mllib-pmml-model-export.md | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md
index dd96dc1f5cdf..b048395b8a90 100644
--- a/docs/mllib-pmml-model-export.md
+++ b/docs/mllib-pmml-model-export.md
@@ -39,7 +39,5 @@ The table below outlines the MLlib models that can be exported to PMML and their
-## Example: exporting KMeansModel
-Same applies to other models...
+## Examples
-## Example: exporting a model to String, file ...
From 1beda98c81b3954c24bf9611232f57b6707ddc46 Mon Sep 17 00:00:00 2001
From: Vincenzo Selvaggio
Date: Sun, 17 May 2015 04:53:31 -0700
Subject: [PATCH 6/7] [SPARK-7272] Initial user guide for pmml export

---
 docs/mllib-guide.md             |  1 +
 docs/mllib-pmml-model-export.md | 47 +++++++++++++++++++++++++++++++--
 2 files changed, 46 insertions(+), 2 deletions(-)

diff --git a/docs/mllib-guide.md b/docs/mllib-guide.md
index f8e879496c13..de7d66fb2ded 100644
--- a/docs/mllib-guide.md
+++ b/docs/mllib-guide.md
@@ -39,6 +39,7 @@ filtering, dimensionality reduction, as well as underlying optimization primitiv
 * [Optimization (developer)](mllib-optimization.html)
   * stochastic gradient descent
   * limited-memory BFGS (L-BFGS)
+* [PMML model export](mllib-pmml-model-export.html)

 MLlib is under active development.
 The APIs marked `Experimental`/`DeveloperApi` may change in future releases,
diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md
index b048395b8a90..83b7b8100def 100644
--- a/docs/mllib-pmml-model-export.md
+++ b/docs/mllib-pmml-model-export.md
@@ -9,9 +9,9 @@ displayTitle: MLlib - PMML model export

 ## MLlib supported models

-MLlib supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)) format.
+MLlib supports model export to Predictive Model Markup Language ([PMML](http://en.wikipedia.org/wiki/Predictive_Model_Markup_Language)).

-The table below outlines the MLlib models that can be exported to PMML and their equivalent PMML format.
+The table below outlines the MLlib models that can be exported to PMML and their equivalent PMML model.
@@ -40,4 +40,47 @@ The table below outlines the MLlib models that can be exported to PMML and their
## Examples +
+
+To export a supported `model` (see table above) to PMML, simply call `model.toPMML`.
+
+Here is a complete example of building a KMeansModel and printing it out in PMML format:
+{% highlight scala %}
+import org.apache.spark.mllib.clustering.KMeans
+import org.apache.spark.mllib.linalg.Vectors
+
+// Load and parse the data
+val data = sc.textFile("data/mllib/kmeans_data.txt")
+val parsedData = data.map(s => Vectors.dense(s.split(' ').map(_.toDouble))).cache()
+
+// Cluster the data into two classes using KMeans
+val numClusters = 2
+val numIterations = 20
+val clusters = KMeans.train(parsedData, numClusters, numIterations)
+
+// Export to PMML
+println("PMML export = " + clusters.toPMML)
+{% endhighlight %}
+
+As well as exporting the PMML model to a String (`model.toPMML` as in the example above), you can export the PMML model to other formats:
+
+{% highlight scala %}
+// Export the model to a String in PMML format
+clusters.toPMML
+
+// Export the model to a local file in PMML format
+clusters.toPMML("/tmp/kmeans.xml")
+
+// Export the model to a directory on a distributed file system in PMML format
+clusters.toPMML(sc, "/tmp/kmeans")
+
+// Export the model to the OutputStream in PMML format
+clusters.toPMML(System.out)
+{% endhighlight %}
+
+For unsupported models, either you will not find a `.toPMML` method or an `IllegalArgumentException` will be thrown.
+
+
+ +
From c866fb8eaa287ea70d619ffb183439f6b4f0a5d4 Mon Sep 17 00:00:00 2001
From: Vincenzo Selvaggio
Date: Mon, 18 May 2015 07:30:16 +0100
Subject: [PATCH 7/7] Update mllib-pmml-model-export.md

---
 docs/mllib-pmml-model-export.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/docs/mllib-pmml-model-export.md b/docs/mllib-pmml-model-export.md
index 83b7b8100def..42ea2ca81f80 100644
--- a/docs/mllib-pmml-model-export.md
+++ b/docs/mllib-pmml-model-export.md
@@ -60,7 +60,7 @@ val numIterations = 20
 val clusters = KMeans.train(parsedData, numClusters, numIterations)

 // Export to PMML
-println("PMML export = " + clusters.toPMML)
+println("PMML Model:\n" + clusters.toPMML)
 {% endhighlight %}

 As well as exporting the PMML model to a String (`model.toPMML` as in the example above), you can export the PMML model to other formats:
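
The guide text added in patch 6/7 says that exporting an unsupported model may throw an `IllegalArgumentException`. The following is a minimal sketch, not part of the patch series or of MLlib, of how a caller might guard against that case. The object `PmmlExportGuard` and the method `safeExport` are hypothetical names introduced only for illustration; the sketch assumes nothing about MLlib beyond a `toPMML` call that returns a `String` or throws.

```scala
import scala.util.{Failure, Success, Try}

object PmmlExportGuard {
  // Hypothetical helper, not part of MLlib: evaluates an export expression
  // and turns the IllegalArgumentException mentioned in the guide into a Left.
  def safeExport(doExport: => String): Either[String, String] =
    Try(doExport) match {
      case Success(pmml) => Right(pmml)
      case Failure(e: IllegalArgumentException) =>
        Left(s"PMML export not supported: ${e.getMessage}")
      case Failure(other) => throw other // unrelated failures still propagate
    }
}

// Assumed usage in spark-shell, with `clusters` from the KMeans example:
//   PmmlExportGuard.safeExport(clusters.toPMML()) match {
//     case Right(pmml)  => println(pmml)
//     case Left(reason) => println(reason)
//   }
```

Taking the export as a by-name parameter keeps the helper independent of any particular model class, so the same guard could wrap any of the `toPMML` variants that return a `String`.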