[SPARK-20930][ML] Destroy broadcasted centers after computing cost in KMeans #18152
Test build #77570 has finished for PR 18152 at commit
Yes, I agree with that. It looks like there may be a similar instance in In every other similar instance I see, yes, the broadcast is destroyed after it's used to collect a result.
3736992 to
ddc9ffb
Test build #77618 has finished for PR 18152 at commit
@srowen Thanks for pointing that out. Btw, I reviewed all calls of broadcast in ml again and found this instance also happens in
Looks good; how about in the getTopicDistributionMethod method too? I actually think that broadcast is pointless.
The getTopicDistributionMethod method is only used in ml.clustering.LDA#transform, so I think the broadcast may be useful if the model size is large.
@zhengruifeng but the broadcast isn't actually used. Its .value is called, locally, not from a distributed method.
@srowen You are right. I removed the unnecessary broadcasting.
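For readers following the thread, the distinction raised above can be sketched as follows. This is a minimal illustration, not the LDA code under review: the object `LocalValueSketch`, both helper functions, and the sample weights are hypothetical names chosen for the example. A broadcast only pays off when `.value` is read inside a task running on executors; reading `.value` on the driver just hands back the original object.

```scala
import org.apache.spark.sql.SparkSession

object LocalValueSketch {
  // Pointless broadcast: .value is read on the driver, so the closure
  // shipped to executors captures a plain Double, not the broadcast.
  def scaleWithLocalValue(spark: SparkSession, xs: Seq[Double], weights: Array[Double]): Array[Double] = {
    val sc = spark.sparkContext
    val bc = sc.broadcast(weights)
    val total = bc.value.sum // driver-side read; same as weights.sum
    val out = sc.parallelize(xs).map(x => x * total).collect()
    bc.destroy()
    out
  }

  // Meaningful broadcast: .value is read inside the distributed map,
  // so each executor fetches the weights once and reuses them.
  def scaleWithBroadcast(spark: SparkSession, xs: Seq[Double], weights: Array[Double]): Array[Double] = {
    val sc = spark.sparkContext
    val bc = sc.broadcast(weights)
    val out = sc.parallelize(xs).map(x => x * bc.value.sum).collect()
    bc.destroy() // release the broadcast once the result is collected
    out
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("local-value-sketch").getOrCreate()
    val weights = Array(0.5, 1.5, 2.0)
    println(scaleWithLocalValue(spark, Seq(1.0, 2.0), weights).mkString(", "))
    println(scaleWithBroadcast(spark, Seq(1.0, 2.0), weights).mkString(", "))
    spark.stop()
  }
}
```

Both functions return the same numbers; in the first, the broadcast could be dropped with no change in behavior, which is the situation described in the comment above.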
ddc9ffb to
3fd52a8
Test build #77728 has finished for PR 18152 at commit
Merged to master
What changes were proposed in this pull request?
Destroy broadcasted centers after computing cost
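The pattern can be sketched roughly as below. This is an illustration under assumptions, not the patched KMeans code: `BroadcastCostSketch`, `squaredDistance`, `computeCost`, and the sample data are hypothetical, and the cost here is simply the sum of squared distances from each point to its nearest center. The point is the ordering: broadcast the centers, run the action that produces the cost, then destroy the broadcast.

```scala
import org.apache.spark.sql.SparkSession

object BroadcastCostSketch {
  // Hypothetical helper: squared Euclidean distance between two points.
  def squaredDistance(a: Array[Double], b: Array[Double]): Double =
    a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

  // Hypothetical cost function: sum of squared distances to the nearest center.
  def computeCost(spark: SparkSession, points: Seq[Array[Double]], centers: Array[Array[Double]]): Double = {
    val sc = spark.sparkContext
    val bcCenters = sc.broadcast(centers)
    // sum() is an action, so the broadcast has been fully used once it returns.
    val cost = sc.parallelize(points).map { p =>
      bcCenters.value.map(c => squaredDistance(p, c)).min
    }.sum()
    // Release the broadcast data on the driver and executors.
    // The broadcast must not be used again after this call.
    bcCenters.destroy()
    cost
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[2]").appName("broadcast-cost-sketch").getOrCreate()
    val points = Seq(Array(0.0, 0.0), Array(1.0, 1.0), Array(9.0, 9.0))
    val centers = Array(Array(0.0, 0.0), Array(10.0, 10.0))
    println(computeCost(spark, points, centers))
    spark.stop()
  }
}
```

Destroying is safe here only because the cost has already been materialized; if the RDD were returned lazily instead, the broadcast would still be needed when the RDD is eventually evaluated.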
How was this patch tested?
existing tests