[SPARK-15149][EXAMPLE][DOC] update kmeans example #12925
Test build #57857 has finished for PR 12925 at commit
Test build #57859 has finished for PR 12925 at commit
@dongjoon-hyun This one as well. Do you mind if I ask your thoughts on the component in the title? Making good examples for PRs will help all other contributors.
@HyukjinKwon ok. I will change them to |
@zhengruifeng I prefer the style of I will comment on #11844 too about harmonizing the Scala examples.
@MLnick Ok. I will update these examples to read the data file
@MLnick updated. Thanks for your comments. |
Oh, I need to update the |
Test build #57885 has finished for PR 12925 at commit
Test build #57888 has finished for PR 12925 at commit
Ah, I had a PR ready for this but didn't see you had created a Jira for it. I can review. |
Can you see my comments about this on #11844 and let me know?
I see it. I will keep the KMeans examples in line with the BiKMeans ones.
Test build #58084 has finished for PR 12925 at commit
Test build #58145 has finished for PR 12925 at commit
| import numpy as np | ||
| # $example on$ | ||
| from pyspark.ml.clustering import KMeans, KMeansModel |
Don't need to import KMeansModel here.
LGTM other than one minor comment and pending #11844 |
| Run with: | ||
| bin/spark-submit examples/src/main/python/ml/kmeans_example.py <input> <k> | ||
| This example requires NumPy (http://www.numpy.org/). |
nit: I believe this example still requires NumPy even though it isn't explicitly imported (see `toArray`, which is called inside `clusterCenters` and which says it returns a NumPy array).
Thanks. I will revert this removal.
Test build #58307 has finished for PR 12925 at commit
@MLnick Thanks. Updated |
Test build #58309 has finished for PR 12925 at commit
LGTM. I'll merge this once #11844 is merged. |
Merged to master and branch-2.0. Thanks! |
## What changes were proposed in this pull request?

Python example for ml.kmeans already exists, but not included in user guide.

1. small changes like: `example_on` `example_off`
2. add it to user guide
3. update examples to directly read datafile

## How was this patch tested?

manual tests
`./bin/spark-submit examples/src/main/python/ml/kmeans_example.py`

Author: Zheng RuiFeng <ruifengz@foxmail.com>

Closes #12925 from zhengruifeng/km_pe.

(cherry picked from commit 8beae59)
Signed-off-by: Nick Pentreath <nickp@za.ibm.com>
What changes were proposed in this pull request?
A Python example for ml.kmeans already exists, but it is not included in the user guide. This PR:
1. makes small changes such as adding the `example_on` / `example_off` markers
2. adds the example to the user guide
3. updates the examples to read the data file directly
How was this patch tested?
Manual tests:
`./bin/spark-submit examples/src/main/python/ml/kmeans_example.py`
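For readers unfamiliar with the `example_on` / `example_off` markers mentioned in the description: in the example sources they appear as `$example on$` and `$example off$` comments, and only the code between them is meant to be shown in the user guide. The following is a minimal sketch of that idea in plain Python. It is not Spark's actual documentation tooling; the `extract_example` helper and the `sample` string are hypothetical and exist only for illustration.

```python
def extract_example(source: str) -> str:
    """Return only the lines between '$example on$' and '$example off$' markers.

    Hypothetical helper for illustration; not part of Spark.
    """
    kept = []
    inside = False
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.endswith("$example on$"):
            inside = True   # start collecting after this marker line
            continue
        if stripped.endswith("$example off$"):
            inside = False  # stop collecting at this marker line
            continue
        if inside:
            kept.append(line)
    return "\n".join(kept)


# Illustrative input modeled on the import block discussed in the review.
sample = """\
import sys
# $example on$
from pyspark.ml.clustering import KMeans
# $example off$
from pyspark.sql import SparkSession
"""

snippet = extract_example(sample)
print(snippet)  # prints: from pyspark.ml.clustering import KMeans
```

Under this reading, imports placed outside the markers (such as the session setup) stay in the runnable file but are left out of the rendered guide, which is why the review comments focus on what sits inside the marked region.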