[SPARK-17114][SQL] Fix aggregates grouped by literals with empty input #15101
 * Removes literals from group expressions in [[Aggregate]], as they have no effect to the result
 * but only makes the grouping key bigger.
 */
object RemoveLiteralFromGroupExpressions extends Rule[LogicalPlan] {
is there an existing unit test suite for this? might be good to add a test case there too.
LGTM
Test build #65398 has finished for PR 15101 at commit
Test build #65406 has finished for PR 15101 at commit
// Do not rewrite the aggregate if we drop all grouping expressions, because this can
// change the return semantics when the input of the Aggregate is empty. See SPARK-17114
// for more information.
a
How about `a.copy(groupingExpressions = Seq(grouping.head))`? I think we can still remove some of the literal grouping expressions if we keep one of them.
Then it might be even better to replace it with something that is trivial to hash.
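The two suggestions above (keep one literal, or swap in a key that is trivial to hash) can be sketched with a toy model. This is only an illustration: `Expr`, `Literal`, `Attribute`, and `Aggregate` below are simplified stand-ins, not Spark's Catalyst classes, and `Literal(0)` as the placeholder key is an assumption made for the example.

```scala
// Toy stand-ins for Catalyst classes; these are NOT Spark's real types.
sealed trait Expr
final case class Literal(value: Any) extends Expr
final case class Attribute(name: String) extends Expr
final case class Aggregate(groupingExpressions: Seq[Expr])

object RemoveLiteralsSketch {
  // Drop literal grouping keys, but never turn a grouped aggregate
  // into an ungrouped one.
  def apply(a: Aggregate): Aggregate = {
    if (a.groupingExpressions.isEmpty) {
      a // already a global aggregate: nothing to rewrite
    } else {
      val nonLiterals = a.groupingExpressions.filterNot(_.isInstanceOf[Literal])
      if (nonLiterals.nonEmpty) {
        a.copy(groupingExpressions = nonLiterals)
      } else {
        // Every key was a literal: keep a single cheap placeholder key
        // (an assumed choice) so the aggregate stays grouped.
        a.copy(groupingExpressions = Seq(Literal(0)))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    println(apply(Aggregate(Seq(Literal("a"), Attribute("x"), Literal(1)))))
    println(apply(Aggregate(Seq(Literal("a"), Literal("b")))))
  }
}
```

With a single constant key the aggregate still goes through the grouped code path, so an empty input yields no rows, while the key itself stays as small as possible.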
Test build #65434 has finished for PR 15101 at commit
LGTM, pending jenkins.
retest this please
Test build #65435 has finished for PR 15101 at commit
(My bad on MiMa issue -- should be fixed in master, retesting ...)
Test build #3270 has finished for PR 15101 at commit
Merging to master/2.0. Thanks for the reviews.
## What changes were proposed in this pull request?

This PR fixes an issue with aggregates that have an empty input and use literals as their grouping keys. These aggregates are currently interpreted as aggregates **without** grouping keys, which triggers the ungrouped code path (which always returns a single row).

This PR fixes the `RemoveLiteralFromGroupExpressions` optimizer rule, which changes the semantics of the Aggregate by eliminating all literal grouping keys.

## How was this patch tested?

Added tests to `SQLQueryTestSuite`.

Author: Herman van Hovell <hvanhovell@databricks.com>

Closes #15101 from hvanhovell/SPARK-17114-3.

(cherry picked from commit d403562)
Signed-off-by: Herman van Hovell <hvanhovell@databricks.com>
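To make the semantic difference concrete, here is a minimal sketch in plain Scala collections rather than Spark. `countGlobal` and `countGrouped` are hypothetical helpers written for this illustration; they only mimic the row-count behaviour of ungrouped and grouped aggregates described above.

```scala
object EmptyInputAggregates {
  // Ungrouped (global) aggregate: always produces exactly one output row.
  def countGlobal[A](rows: Seq[A]): Seq[Long] = Seq(rows.size.toLong)

  // Grouped aggregate: one output row per distinct key present in the input.
  def countGrouped[A, K](rows: Seq[A])(key: A => K): Seq[(K, Long)] =
    rows.groupBy(key).toSeq.map { case (k, group) => (k, group.size.toLong) }

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[Int]
    println(countGlobal(empty).size)                  // 1: a single row holding count 0
    println(countGrouped(empty)(_ => "literal").size) // 0: no groups, so no rows
  }
}
```

Grouping by a constant over an empty input produces no groups and therefore no rows, so dropping every literal key turns a zero-row result into a one-row result.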