Skip to content

Conversation

@sryza
Copy link
Contributor

@sryza sryza commented Sep 22, 2014

No description provided.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have started for PR 2490 at commit cb9ffad.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Sep 22, 2014

QA tests have finished for PR 2490 at commit cb9ffad.

  • This patch passes unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sryza Actually this is not true....

scala> val acc = sc.accumulator(0)

scala> val data = sc.parallelize(List(1, 2, 3))
data: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at :12

scala> val a = data.map(x => acc += 1)

scala> a.count

scala> acc.value
res5: Int = 3

scala> a.count
scala> acc.value
res7: Int = 6

I will resubmit #228 tonight or tomorrow,

@sryza
Copy link
Contributor Author

sryza commented Nov 15, 2014

Thanks for the review @Ishiihara . Updated the PR to clarify these points.

@SparkQA
Copy link

SparkQA commented Nov 15, 2014

Test build #23400 has started for PR 2490 at commit 2a81019.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 15, 2014

Test build #23400 has finished for PR 2490 at commit 2a81019.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/23400/
Test PASSed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going to suggest that this might not be true, but it looks like this is partially-true now that #2524 has been merged. That PR added a paragraph to the programming guide which clarifies that we guard against duplicate updates only for updates performed inside of actions and not for ones performed in transformations: 66cc243?diff=unified#diff-3

In light of this, do we still need this paragraph?

@JoshRosen
Copy link
Contributor

Sorry for letting this sit for so long; I'm working my way through the backlog now, though.

I think that the second addition, RE: accumulator updates, may no longer be necessary / may be subsumed by changes in other PRs, but the first paragraph RE: broadcast variables still looks good.

@srowen
Copy link
Member

srowen commented Feb 11, 2015

Sandy do you want to remove or update the 2nd paragraph? can be merged then, it looks like.

@sryza sryza force-pushed the sandy-spark-3642 branch from 2a81019 to aae3340 Compare March 10, 2015 18:30
@SparkQA
Copy link

SparkQA commented Mar 10, 2015

Test build #28433 has started for PR 2490 at commit aae3340.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Mar 10, 2015

Test build #28433 has finished for PR 2490 at commit aae3340.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/28433/
Test PASSed.

@srowen
Copy link
Member

srowen commented Mar 10, 2015

LGTM since the PR now represents the just the first paragraph that Josh alludes to above, and he approved that much. I'll leave it open a little while for more comments just in case.

@asfgit asfgit closed this in 2d87a41 Mar 11, 2015
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants