Skip to content

Conversation

@zhengruifeng
Copy link
Contributor

What changes were proposed in this pull request?

Destroy broadcasted centers after computing cost

How was this patch tested?

existing tests

@SparkQA
Copy link

SparkQA commented May 31, 2017

Test build #77570 has finished for PR 18152 at commit 3736992.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented May 31, 2017

Yes, I agree with that. It looks like there may be a similar instance in GradientDescent.scala, where bcWeights is used in a treeAggregate but then never destroyed.

In every other similar instance I see, yes, the broadcast is destroyed after it's used to collect a result.

@zhengruifeng zhengruifeng force-pushed the destroy_kmeans_model branch from 3736992 to ddc9ffb Compare June 1, 2017 02:09
@SparkQA
Copy link

SparkQA commented Jun 1, 2017

Test build #77618 has finished for PR 18152 at commit ddc9ffb.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@zhengruifeng
Copy link
Contributor Author

@srowen Thanks for pointing that out. Btw, I reviewed all calls of broadcast in ml again and found this instance also happens in LDAModel.scala.
I think now all broadcasted objects in ml will be destroyed after computation, except transform/predict/predictSoft/etc.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good; how about in the getTopicDistributionMethod method too? I actually think that broadcast is pointless.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

getTopicDistributionMethod method is only used in ml.clustering.LDA#transform, so I think this maybe useful if the model size is large.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zhengruifeng but the broadcast isn't actually used. Its .value is called, locally, not from a distributed method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@srowen You are right. I removed the unnecessary broadcasting.

@zhengruifeng zhengruifeng force-pushed the destroy_kmeans_model branch from ddc9ffb to 3fd52a8 Compare June 5, 2017 02:17
@SparkQA
Copy link

SparkQA commented Jun 5, 2017

Test build #77728 has finished for PR 18152 at commit 3fd52a8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@srowen
Copy link
Member

srowen commented Jun 5, 2017

Merged to master

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants