[SPARK-13089][ML] [Doc] spark.ml Naive Bayes user guide and examples #11015
Test build #50531 has finished for PR 11015 at commit
    package org.apache.spark.examples.ml;

    // $example on$
    import org.apache.spark.SparkConf;
I would not include SparkConf or JavaSparkContext
test this please
Test build #55389 has finished for PR 11015 at commit
Test build #55471 has finished for PR 11015 at commit
Thanks for the review. Updated according to the comments.
    probabilistic classifiers based on applying Bayes' theorem with strong (naive) independence
    assumptions between the features. More information about the spark.ml implementation can be
    found further in the section on [Naive Bayes in MLlib](mllib-naive-bayes.html#naive-bayes-sparkmllib).
I think it's better to clarify that ml.NaiveBayes supports both multinomial NB and Bernoulli NB. We should also provide links to the corresponding documents. You can refer to the NaiveBayes API doc.
Thanks for taking a look. The wiki link already provides a good overall introduction to Naive Bayes, and the mllib documentation states that naive Bayes supports both the multinomial and Bernoulli models. I'll add some clarification.
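For readers following this thread, the difference between the two model types can be sketched in plain Java, without Spark. This is only an illustration of the standard textbook likelihoods, not the spark.ml implementation; the class name `NaiveBayesLikelihoodSketch`, the method names, and the parameter values are all hypothetical.

```java
// Illustrative sketch only: per-class feature log-likelihoods for the two
// naive Bayes model types discussed above. Not Spark code; all names here
// are made up for this example.
class NaiveBayesLikelihoodSketch {

    // Multinomial model: features are non-negative counts (e.g. term counts).
    // theta[i] is the per-class probability of feature i. The multinomial
    // coefficient is omitted because it is the same for every class.
    static double multinomialLogLikelihood(double[] counts, double[] theta) {
        double logLikelihood = 0.0;
        for (int i = 0; i < counts.length; i++) {
            logLikelihood += counts[i] * Math.log(theta[i]);
        }
        return logLikelihood;
    }

    // Bernoulli model: features are 0/1 indicators (e.g. term presence).
    // p[i] is the per-class probability that feature i is present, so an
    // absent feature contributes log(1 - p[i]).
    static double bernoulliLogLikelihood(double[] indicators, double[] p) {
        double logLikelihood = 0.0;
        for (int i = 0; i < indicators.length; i++) {
            logLikelihood += indicators[i] == 1.0
                ? Math.log(p[i])
                : Math.log(1.0 - p[i]);
        }
        return logLikelihood;
    }

    public static void main(String[] args) {
        // Assumed toy parameters for a single class.
        double[] theta = {0.5, 0.5};
        double[] p = {0.5, 0.25};
        System.out.println(multinomialLogLikelihood(new double[] {2.0, 0.0}, theta));
        System.out.println(bernoulliLogLikelihood(new double[] {1.0, 0.0}, p));
    }
}
```

The practical consequence is that the multinomial model ignores features with a zero count, while the Bernoulli model explicitly penalizes or rewards the absence of a feature.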
Test build #55596 has finished for PR 11015 at commit
      .setLabelCol("label")
      .setPredictionCol("prediction")
      .setMetricName("precision")
    val accuracy = evaluator.evaluate(predictions)
I would call it "precision" instead of "accuracy" since it might confuse people (even though they are the same here).
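To see why the two numbers coincide, here is a minimal plain-Java sketch, assuming the "precision" metric is micro-averaged over classes in a single-label multiclass setting. It does not use the Spark evaluator; the class name `MicroPrecisionDemo`, its methods, and the toy labels are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch only: in single-label multiclass classification,
// micro-averaged precision and accuracy are the same number.
class MicroPrecisionDemo {

    // Fraction of predictions that match their labels.
    static double accuracy(int[] labels, int[] predictions) {
        int correct = 0;
        for (int i = 0; i < labels.length; i++) {
            if (labels[i] == predictions[i]) {
                correct++;
            }
        }
        return (double) correct / labels.length;
    }

    // Micro-averaged precision: sum of per-class true positives divided by
    // the sum of per-class true positives and false positives.
    static double microPrecision(int[] labels, int[] predictions) {
        Map<Integer, Integer> truePositives = new HashMap<>();
        Map<Integer, Integer> falsePositives = new HashMap<>();
        for (int i = 0; i < labels.length; i++) {
            // Each prediction is a TP or an FP for the *predicted* class.
            Map<Integer, Integer> bucket =
                labels[i] == predictions[i] ? truePositives : falsePositives;
            bucket.merge(predictions[i], 1, Integer::sum);
        }
        int tpSum = 0;
        int fpSum = 0;
        for (int count : truePositives.values()) {
            tpSum += count;
        }
        for (int count : falsePositives.values()) {
            fpSum += count;
        }
        return (double) tpSum / (tpSum + fpSum);
    }

    public static void main(String[] args) {
        int[] labels = {0, 1, 2, 1, 0, 2};
        int[] predictions = {0, 2, 2, 1, 0, 1};
        // Both lines print the same value.
        System.out.println(accuracy(labels, predictions));
        System.out.println(microPrecision(labels, predictions));
    }
}
```

Every prediction counts as exactly one true positive or one false positive for its predicted class, so the micro-averaged denominator is the total number of predictions and the numerator is the number of correct ones.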
Just that 1 comment
Test build #55694 has finished for PR 11015 at commit
LGTM
Thanks @jkbradley
jira: https://issues.apache.org/jira/browse/SPARK-13089
Add section in ml-classification.md for NaiveBayes DataFrame-based API, plus example code (using include_example to clip code from examples/ folder files).