[SPARK-17114][SQL] Fix aggregates grouped by literals with empty input. #15076
cc @cloud-fan
Test build #65309 has finished for PR 15076 at commit
Test build #65312 has finished for PR 15076 at commit
Test build #65313 has finished for PR 15076 at commit
!expressions.exists(!_.resolved) &&
childrenResolved &&
!hasWindowExpressions &&
(isGrouped || groupingExpressions.isEmpty)
why have this condition?
It does not matter whether a grouped Aggregate (one with isGrouped set) has any grouping expressions, since literal grouping expressions are eliminated during optimization. It is, however, problematic when a non-grouped Aggregate has grouping expressions; this means that we have not derived the isGrouped flag correctly.
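The last clause of the condition can be read as a consistency check between the flag and the grouping expressions. A minimal sketch, assuming grouping expressions are modelled as plain strings (the object and method names are hypothetical, not Spark's):

```scala
object GroupingConsistency {
  // Mirrors the clause (isGrouped || groupingExpressions.isEmpty):
  // a non-grouped aggregate must not carry any grouping expressions.
  def isConsistent(isGrouped: Boolean, groupingExpressions: Seq[String]): Boolean =
    isGrouped || groupingExpressions.isEmpty

  def main(args: Array[String]): Unit = {
    // Enumerate all four combinations of flag and key presence.
    val cases = for {
      grouped <- Seq(true, false)
      keys <- Seq(Seq.empty[String], Seq("a"))
    } yield (grouped, keys, isConsistent(grouped, keys))
    cases.foreach(println)
  }
}
```

Only the combination of a non-grouped aggregate with at least one grouping expression is rejected; a grouped aggregate whose keys were all optimized away still passes.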
@hvanhovell What's the reason that we cannot just use the literal as the grouping key?
@davies Literal grouping keys get eliminated during optimization. We could modify the optimization rule, but I do feel this is more robust and makes it clearer what we intend to do. It probably also saves a few cycles during query execution.
@hvanhovell Since the optimization rule changes the result (or semantics), I'd like to fix it rather than introduce more complexity in the physical layer. The few saved cycles may not be worth it, because people usually do not use this in practice.
Closing this in favor of #15101.
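The alternative raised in this thread, adjusting the optimization rule instead of the physical layer, could look roughly like the sketch below. It is a toy model under stated assumptions: `Expr`, `Literal`, `Column`, and both functions are hypothetical stand-ins, and this is not claimed to be the code of #15101.

```scala
object LiteralGroupingKeys {
  // Toy expression model; real Catalyst expressions are far richer.
  sealed trait Expr
  final case class Literal(value: Int) extends Expr
  final case class Column(name: String) extends Expr

  // Naive elimination: drops every literal key, so GROUP BY 1 ends up
  // with no grouping keys and looks like an ungrouped aggregate.
  def dropLiterals(grouping: Seq[Expr]): Seq[Expr] =
    grouping.filter {
      case _: Literal => false
      case _ => true
    }

  // Hypothetical variant: if dropping literals would leave nothing,
  // keep one literal so the aggregate stays on the grouped path.
  def dropLiteralsKeepGrouped(grouping: Seq[Expr]): Seq[Expr] = {
    val remaining = dropLiterals(grouping)
    if (remaining.isEmpty) grouping.take(1) else remaining
  }

  def main(args: Array[String]): Unit = {
    println(dropLiterals(Seq(Literal(1))))
    println(dropLiteralsKeepGrouped(Seq(Literal(1))))
    println(dropLiteralsKeepGrouped(Seq(Literal(1), Column("a"))))
  }
}
```

In this sketch an aggregate that was written without grouping keys still has none afterwards, so only queries that explicitly grouped by literals are affected.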
What changes were proposed in this pull request?
This PR fixes an issue with aggregates that have an empty input and use literals as their grouping keys. These aggregates are currently interpreted as aggregates without grouping keys, which triggers the ungrouped code path (which always returns a single row).
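The difference between the two code paths on an empty input can be illustrated with plain Scala collections. This is only a sketch of the semantics, assuming rows are modelled as `Int` values; `countUngrouped` and `countGrouped` are hypothetical names and do not correspond to Spark operators.

```scala
object EmptyInputAggregation {
  // A row is modelled as a plain Int; a table is a Seq of rows.
  type Row = Int

  // Ungrouped path, like SELECT COUNT(*) FROM t:
  // always yields exactly one row, even for an empty input.
  def countUngrouped(rows: Seq[Row]): Seq[Long] = Seq(rows.size.toLong)

  // Grouped path, like SELECT COUNT(*) FROM t GROUP BY key:
  // one row per group, so an empty input yields no rows at all.
  def countGrouped[K](rows: Seq[Row])(key: Row => K): Seq[Long] =
    rows.groupBy(key).values.map(_.size.toLong).toSeq

  def main(args: Array[String]): Unit = {
    val empty = Seq.empty[Row]
    println(countUngrouped(empty))       // one row containing a count of 0
    println(countGrouped(empty)(_ => 1)) // no rows, even with a literal key
  }
}
```

Grouping by the constant key `1` on a non-empty input still produces a single group, so the two paths only disagree when the input is empty.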
This PR fixes this by adding an isGrouped flag to Aggregate. This flag is inferred when the Aggregate is created, but is kept when we change the Aggregate's grouping expressions. I have tried to implement this by wrapping the groupingExpressions in an Option, but this is much more invasive and leads to code that is much harder to use.
How was this patch tested?
Added tests to SQLQueryTestSuite.
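For illustration, the isGrouped idea from the description above can be sketched with a toy operator. This assumes expressions are plain strings; the `create` factory, `withGrouping`, and `rowsOnEmptyInput` are hypothetical helpers, not part of Spark's Aggregate.

```scala
object IsGroupedSketch {
  // Toy stand-in for the logical Aggregate operator.
  final case class Aggregate(
      groupingExpressions: Seq[String],
      aggregateExpressions: Seq[String],
      isGrouped: Boolean) {

    // Changing the grouping expressions leaves isGrouped untouched,
    // because copy only overrides the named field.
    def withGrouping(newGrouping: Seq[String]): Aggregate =
      copy(groupingExpressions = newGrouping)

    // Rows produced for an empty input: none when grouped, one otherwise.
    def rowsOnEmptyInput: Int = if (isGrouped) 0 else 1
  }

  // Hypothetical factory: infer the flag once, when the operator is created.
  def create(grouping: Seq[String], aggregates: Seq[String]): Aggregate =
    Aggregate(grouping, aggregates, isGrouped = grouping.nonEmpty)

  def main(args: Array[String]): Unit = {
    val original = create(Seq("1"), Seq("count(*)"))
    // Simulate the optimizer eliminating the literal grouping key.
    val optimized = original.withGrouping(Seq.empty)
    println(optimized.isGrouped)
    println(optimized.rowsOnEmptyInput)
  }
}
```

Because the flag survives the removal of the literal key, the toy operator still reports zero rows for an empty input, which is the behavior the PR aims for.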