[SPARK-1412][SQL] Disable partial aggregation automatically when reduction factor is low - WIP #1152

rxin · 2014-06-20T08:08:04Z

This avoids building up an expensive hash map if partial aggregation does not result in data size reduction.

Just a prototype. Kinda ugly, doesn't properly connect with the config system yet, and has no tests.
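To make the idea concrete, here is a minimal sketch of the technique, not the code in this patch. It sums values by key within one partition, and after a sample of input rows it compares the number of distinct keys to the number of rows seen. If that ratio is too high (little reduction), it stops building the hash map and passes the remaining rows through for the final aggregation to handle. The names `PartialAggSketch`, `partialSum`, `sampleSize`, and `maxKeyRatio` are assumptions made for illustration.

```scala
import scala.collection.mutable

// Illustrative only: a hypothetical stand-in for a partial aggregation operator.
object PartialAggSketch {

  /** Partially sums values by key for one partition.
    *
    * After `sampleSize` rows, checks distinctKeys / rowsSeen. If the ratio is
    * above `maxKeyRatio`, partial aggregation is not reducing the data much,
    * so the remaining rows are passed through without being hashed.
    */
  def partialSum(
      rows: Iterator[(String, Long)],
      sampleSize: Int = 1000,
      maxKeyRatio: Double = 0.5): Iterator[(String, Long)] = {
    val buffer = mutable.HashMap.empty[String, Long]
    var seen = 0
    var bypass = false

    while (!bypass && rows.hasNext) {
      val (key, value) = rows.next()
      buffer(key) = buffer.getOrElse(key, 0L) + value
      seen += 1
      // Decide once, at the end of the sample, whether to keep aggregating.
      if (seen == sampleSize && buffer.size.toDouble / seen > maxKeyRatio) {
        bypass = true
      }
    }

    // Emit the partial sums built so far, then any remaining rows unaggregated.
    // The final aggregation after the shuffle merges both kinds of output.
    buffer.iterator ++ rows
  }
}
```

The output stays correct either way because the final aggregation treats an unaggregated row as a partial sum over a single row. Only the amount of data shuffled and the cost of the hash map change.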

rxin · 2014-06-20T08:09:02Z

@concretevitamin I find it hard to actually use config options in a physical operator. Any suggestions?

AmplabJenkins · 2014-06-20T08:09:57Z

Merged build triggered.

AmplabJenkins · 2014-06-20T08:10:06Z

Merged build started.

rxin · 2014-06-20T08:11:33Z

@pwendell / @mateiz should we actually build this into Spark directly (i.e. in Aggregator)?

AmplabJenkins · 2014-06-20T09:25:55Z

Merged build finished. All automated tests passed.

AmplabJenkins · 2014-06-20T09:25:55Z

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/15952/

concretevitamin · 2014-06-20T18:10:44Z

@rxin If we are simply trying to read the default values for the params, and not user-set ones (i.e. in the absence of a SQLContext in execute()), I think we could move the default param values to a companion object of SQLConf. The accessors of this class would then either return the user-set values or fall back to the default values from the static object.
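A rough sketch of that suggestion, under stated assumptions: `SketchConf` is a hypothetical stand-in for SQLConf, and the key name and default value are invented for illustration. Defaults live in the companion object, so code without a context can read them statically, while instance accessors prefer a user-set value.

```scala
import scala.collection.mutable

// Illustrative only: SketchConf is not SQLConf, and the key below is made up.
object ConfExample {

  object SketchConf {
    val MaxKeyRatioKey = "example.partialAgg.maxKeyRatio"
    // Default readable without any conf instance or context.
    val DefaultMaxKeyRatio = 0.5
  }

  class SketchConf {
    private val settings = mutable.HashMap.empty[String, String]

    def set(key: String, value: String): Unit = {
      settings(key) = value
    }

    // Accessor: the user-set value if present, otherwise the static default.
    def partialAggMaxKeyRatio: Double =
      settings.get(SketchConf.MaxKeyRatioKey)
        .map(_.toDouble)
        .getOrElse(SketchConf.DefaultMaxKeyRatio)
  }
}
```

A physical operator with no conf instance at hand could then read `ConfExample.SketchConf.DefaultMaxKeyRatio` directly, at the cost of ignoring any user override.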

mateiz · 2014-06-21T23:04:27Z

It would be great to add this into Aggregator as well. Would that replace the implementation here? I.e. does Spark SQL go through Aggregator?

rxin · 2014-06-21T23:47:22Z

Spark SQL doesn't currently use the aggregator, but we would want to do that.

aarondav · 2014-06-22T01:33:05Z

sql/core/src/main/scala/org/apache/spark/sql/execution/Aggregate.scala

Man, those are some high standards!

rxin · 2014-06-24T07:44:31Z

@mateiz I submitted a patch to core's Aggregator in #1191.

After implementing it in Aggregator, I realized it might be hard for Spark SQL to reuse Aggregator unless we change Aggregator to allocate fewer temporary objects (or write the aggregation code path in Spark SQL to output key-value tuples).


Prototype for disabling partial aggregation when we don't see reduction.

6360e11


rxin closed this Aug 29, 2014

wangyum added a commit that referenced this pull request May 26, 2023

[CARMEL-6367] Insert bloom filter if it is skew bucket join (#1152)

6b533c1

* CARMEL-6367: Insert bloom filter if it is skew bucket join
* Fix
* fix
* fix
* fix
