-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-8519][SPARK-11560] [ML] [MLlib] Optimize KMeans implementation #10806
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Test build #49596 has finished for PR 10806 at commit
|
|
Test build #49598 has finished for PR 10806 at commit
|
|
cc: @avulanov |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Collection methods might be expensive in that case, because fill will create array of arrays, copy data there, and then flatten will create another array and copy data there. Copy the array directly with System.arraycopy using additional loop instead.
|
@yanboliang @mengxr I made one pass. |
|
@avulanov Thanks for your comments. I will update this PR soon. |
|
Hi everyone, @yanboliang, thanks for optimizing Kmeans. I have one question. We can also work on it in a separate jira/PR. I just want to know your opinion on that. Thanks, |
|
@yanboliang Do you have time to update this PR? Could you also post the latest performance results after the update? Thanks! @NarineK Yes, I think it is useful to add those statistics (tracked in a separate JIRA). So please create a JIRA for it and include links to sklearn and R methods. |
|
@mengxr Sorry for late response, I will update it and post latest performance results soon. |
68d830c to
5b76bd9
Compare
5b76bd9 to
e166e86
Compare
|
Test build #53924 has finished for PR 10806 at commit
|
|
Test build #53926 has finished for PR 10806 at commit
|
|
Test build #53922 has finished for PR 10806 at commit
|
|
@mengxr @avulanov After update, The original version: |
|
Thanks @yanboliang, I'll create the jira soon! |
|
I will make another pass soon:) |
|
I see that the size of the blocks can be tuned and is fairly small by default (128). Out of curiosity, how did you pick this number, instead of the full size of the partition for example? |
|
Good point. According to matrices multiplication benchmark, we can get peak performance on modern CPUs with square matrices somewhere between 4Kx4K and 8Kx8K. So, it worth using a bigger batch rather than 128. |
|
Test build #60396 has finished for PR 10806 at commit
|
|
Close this one and move update version to #14937, let's discuss there. Thanks! |
Note: This is the new implementation to replace #10306 . cc @mengxr