-
Notifications
You must be signed in to change notification settings - Fork 29k
[SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields #33498
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SPARK-36275][SQL] ResolveAggregateFunctions should works with nested fields #33498
Conversation
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #141569 has finished for PR 33498 at commit
|
|
cc @cloud-fan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shall we update TempResolvedColumn to only take Attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here actually we shouldn't transform attributes, because for nested fields this u.nameParts will not correspond to the attribute. Updated to specifically match Alias.
cloud-fan
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch!
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #141658 has finished for PR 33498 at commit
|
1889fe4 to
fb3931c
Compare
|
Kubernetes integration test starting |
|
Kubernetes integration test status success |
|
Test build #141728 has finished for PR 33498 at commit
|
|
thanks, merging to master/3.2! |
… fields ### What changes were proposed in this pull request? This PR fixes an issue in `ResolveAggregateFunctions` where non-aggregated nested fields in ORDER BY and HAVING are not resolved correctly. This is because nested fields are resolved as aliases that fail to be semantically equal to any grouping/aggregate expressions. ### Why are the changes needed? To fix an analyzer issue. ### Does this PR introduce _any_ user-facing change? No ### How was this patch tested? Unit tests. Closes #33498 from allisonwang-db/spark-36275-resolve-agg-func. Authored-by: allisonwang-db <allison.wang@databricks.com> Signed-off-by: Wenchen Fan <wenchen@databricks.com> (cherry picked from commit 23a6ffa) Signed-off-by: Wenchen Fan <wenchen@databricks.com>
| // should undo it later and fail with "Column c2 not found". | ||
| agg.child.resolve(u.nameParts, resolver).map(TempResolvedColumn(_, u.nameParts)) | ||
| .getOrElse(u) | ||
| agg.child.resolve(u.nameParts, resolver).map({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's use a single brace next time for map in this case: https://github.com/databricks/scala-style-guide#anonymous-methods
What changes were proposed in this pull request?
This PR fixes an issue in
ResolveAggregateFunctionswhere non-aggregated nested fields in ORDER BY and HAVING are not resolved correctly. This is because nested fields are resolved as aliases that fail to be semantically equal to any grouping/aggregate expressions.Why are the changes needed?
To fix an analyzer issue.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Unit tests.