-
Notifications
You must be signed in to change notification settings - Fork 29k
Move SparkContext accumulator methods to Accumulators.scala #3071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move SparkContext accumulator methods to Accumulators.scala #3071
Conversation
|
Test build #22808 has started for PR 3071 at commit
|
|
Test build #22808 has finished for PR 3071 at commit
|
|
Test FAILed. |
This makes binary compatibility easier to reason about and might avoid some pitfalls that I’ve run into while attempting to refactor other parts of SparkContext to use mixin traits (see apache#3071, for example). Requiring users to access status API methods through `sc.statusAPI.*` also avoids SparkContext bloat and buys us extra freedom for adding parallel higher / lower-level APIs.
|
I'm going to close this for now; it looks like this breaks closure cleaning due to the addition of another level of |
This PR refactors / extends the status API introduced in #2696. - Change StatusAPI from a mixin trait to a class. Before, the new status API methods were directly accessible through SparkContext, whereas now they're accessed through a `sc.statusAPI` field. As long as we were going to add these methods directly to SparkContext, the mixin trait seemed like a good idea, but this might be simpler to reason about and may avoid pitfalls that I've run into while attempting to refactor other parts of SparkContext to use mixins (see #3071, for example). - Change the name from SparkStatusAPI to SparkStatusTracker. - Make `getJobIdsForGroup(null)` return ids for jobs that aren't associated with any job group. - Add `getActiveStageIds()` and `getActiveJobIds()` methods that return the ids of whatever's currently active in this SparkContext. This should simplify davies's progress bar code. Author: Josh Rosen <joshrosen@databricks.com> Closes #3197 from JoshRosen/progress-api-improvements and squashes the following commits: 30b0afa [Josh Rosen] Rename SparkStatusAPI to SparkStatusTracker. d1b08d8 [Josh Rosen] Add missing newlines 2cc7353 [Josh Rosen] Add missing file. d5eab1f [Josh Rosen] Add getActive[Stage|Job]Ids() methods. a227984 [Josh Rosen] getJobIdsForGroup(null) should return jobs for default group c47e294 [Josh Rosen] Remove StatusAPI mixin trait. (cherry picked from commit 40eb8b6) Signed-off-by: Reynold Xin <rxin@databricks.com>
This PR refactors / extends the status API introduced in #2696. - Change StatusAPI from a mixin trait to a class. Before, the new status API methods were directly accessible through SparkContext, whereas now they're accessed through a `sc.statusAPI` field. As long as we were going to add these methods directly to SparkContext, the mixin trait seemed like a good idea, but this might be simpler to reason about and may avoid pitfalls that I've run into while attempting to refactor other parts of SparkContext to use mixins (see #3071, for example). - Change the name from SparkStatusAPI to SparkStatusTracker. - Make `getJobIdsForGroup(null)` return ids for jobs that aren't associated with any job group. - Add `getActiveStageIds()` and `getActiveJobIds()` methods that return the ids of whatever's currently active in this SparkContext. This should simplify davies's progress bar code. Author: Josh Rosen <joshrosen@databricks.com> Closes #3197 from JoshRosen/progress-api-improvements and squashes the following commits: 30b0afa [Josh Rosen] Rename SparkStatusAPI to SparkStatusTracker. d1b08d8 [Josh Rosen] Add missing newlines 2cc7353 [Josh Rosen] Add missing file. d5eab1f [Josh Rosen] Add getActive[Stage|Job]Ids() methods. a227984 [Josh Rosen] getJobIdsForGroup(null) should return jobs for default group c47e294 [Josh Rosen] Remove StatusAPI mixin trait.
This commit moves the SparkContext
accumulatorandaccumulablemethods and theAccumulableParamimplicits into mixin traits that are defined inside of theAccumulators.scalafile. I think that this improves readability by grouping together all accumulator-related code in the same file; it also helps to reduce the bloat ofSparkContext.scala.From a user-facing perspective, nothing has changed.