Skip to content

Conversation

@JoshRosen
Copy link
Contributor

This commit moves the SparkContext accumulator and accumulable methods and the AccumulableParam implicits into mixin traits that are defined inside of the Accumulators.scala file. I think that this improves readability by grouping together all accumulator-related code in the same file; it also helps to reduce the bloat of SparkContext.scala.

From a user-facing perspective, nothing has changed.

@SparkQA
Copy link

SparkQA commented Nov 3, 2014

Test build #22808 has started for PR 3071 at commit e08a588.

  • This patch merges cleanly.

@SparkQA
Copy link

SparkQA commented Nov 3, 2014

Test build #22808 has finished for PR 3071 at commit e08a588.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class SparkContext(config: SparkConf) extends Logging

@AmplabJenkins
Copy link

Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/22808/
Test FAILed.

JoshRosen added a commit to JoshRosen/spark that referenced this pull request Nov 11, 2014
This makes binary compatibility easier to reason about and might avoid
some pitfalls that I’ve run into while attempting to refactor other
parts of SparkContext to use mixin traits (see apache#3071, for example).

Requiring users to access status API methods through `sc.statusAPI.*`
also avoids SparkContext bloat and buys us extra freedom for adding
parallel higher / lower-level APIs.
@JoshRosen
Copy link
Contributor Author

I'm going to close this for now; it looks like this breaks closure cleaning due to the addition of another level of $outer references. I'll be able to post a better explanation / picture once I clean up my serialization debugging visualizer.

@JoshRosen JoshRosen closed this Nov 11, 2014
asfgit pushed a commit that referenced this pull request Nov 15, 2014
This PR refactors / extends the status API introduced in #2696.

- Change StatusAPI from a mixin trait to a class.  Before, the new status API methods were directly accessible through SparkContext, whereas now they're accessed through a `sc.statusAPI` field.  As long as we were going to add these methods directly to SparkContext, the mixin trait seemed like a good idea, but this might be simpler to reason about and may avoid pitfalls that I've run into while attempting to refactor other parts of SparkContext to use mixins (see #3071, for example).
- Change the name from SparkStatusAPI to SparkStatusTracker.
- Make `getJobIdsForGroup(null)` return ids for jobs that aren't associated with any job group.
- Add `getActiveStageIds()` and `getActiveJobIds()` methods that return the ids of whatever's currently active in this SparkContext.  This should simplify davies's progress bar code.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3197 from JoshRosen/progress-api-improvements and squashes the following commits:

30b0afa [Josh Rosen] Rename SparkStatusAPI to SparkStatusTracker.
d1b08d8 [Josh Rosen] Add missing newlines
2cc7353 [Josh Rosen] Add missing file.
d5eab1f [Josh Rosen] Add getActive[Stage|Job]Ids() methods.
a227984 [Josh Rosen] getJobIdsForGroup(null) should return jobs for default group
c47e294 [Josh Rosen] Remove StatusAPI mixin trait.

(cherry picked from commit 40eb8b6)
Signed-off-by: Reynold Xin <rxin@databricks.com>
asfgit pushed a commit that referenced this pull request Nov 15, 2014
This PR refactors / extends the status API introduced in #2696.

- Change StatusAPI from a mixin trait to a class.  Before, the new status API methods were directly accessible through SparkContext, whereas now they're accessed through a `sc.statusAPI` field.  As long as we were going to add these methods directly to SparkContext, the mixin trait seemed like a good idea, but this might be simpler to reason about and may avoid pitfalls that I've run into while attempting to refactor other parts of SparkContext to use mixins (see #3071, for example).
- Change the name from SparkStatusAPI to SparkStatusTracker.
- Make `getJobIdsForGroup(null)` return ids for jobs that aren't associated with any job group.
- Add `getActiveStageIds()` and `getActiveJobIds()` methods that return the ids of whatever's currently active in this SparkContext.  This should simplify davies's progress bar code.

Author: Josh Rosen <joshrosen@databricks.com>

Closes #3197 from JoshRosen/progress-api-improvements and squashes the following commits:

30b0afa [Josh Rosen] Rename SparkStatusAPI to SparkStatusTracker.
d1b08d8 [Josh Rosen] Add missing newlines
2cc7353 [Josh Rosen] Add missing file.
d5eab1f [Josh Rosen] Add getActive[Stage|Job]Ids() methods.
a227984 [Josh Rosen] getJobIdsForGroup(null) should return jobs for default group
c47e294 [Josh Rosen] Remove StatusAPI mixin trait.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants