[SPARK-24972][SQL] PivotFirst could not handle pivot columns of complex types #21926
@maryannxue, thanks! I am a bot who has found some folks who might be able to help with the review: @gatorsmile, @yhuai and @marmbrus
    // alternate plan that instead uses two steps of aggregation.
    val namedAggExps: Seq[NamedExpression] = aggregates.map(a => Alias(a, a.sql)())
    val bigGroup = groupByExprs ++ pivotColumn.references
    val namedPivotCol = pivotColumn match {
This reverts the original workaround that was meant to avoid the PivotFirst issue. Now that PivotFirst works correctly for complex types, the workaround is no longer needed.
Test build #93815 has finished for PR 21926 at commit
retest this please
Test build #93818 has finished for PR 21926 at commit
    case Pivot(groupByExprsOpt, pivotColumn, pivotValues, aggregates, child) =>
      if (!RowOrdering.isOrderable(pivotColumn.dataType)) {
        throw new AnalysisException(
          s"Invalid pivot column '${pivotColumn}'. Pivot columns must be comparable.")
To the other reviewers: this is consistent with the requirements of group-by columns.
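For readers unfamiliar with the orderability requirement, here is a minimal sketch of the idea, not Spark's actual implementation. The type model (`IntType`, `ArrayType`, `StructType`, `MapType`) and the helper `checkPivotColumn` are hypothetical stand-ins defined only for this illustration; the one behaviour assumed from Spark is that map types are not orderable while structs and arrays of orderable types are.

```scala
object PivotCheckSketch {
  // Hypothetical, simplified type model (not Spark's DataType hierarchy).
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType
  final case class ArrayType(element: DataType) extends DataType
  final case class StructType(fields: Seq[DataType]) extends DataType
  final case class MapType(key: DataType, value: DataType) extends DataType

  // A type is orderable if values of that type can be compared with each other.
  def isOrderable(dt: DataType): Boolean = dt match {
    case IntType | StringType => true
    case ArrayType(element)   => isOrderable(element)
    case StructType(fields)   => fields.forall(isOrderable)
    case MapType(_, _)        => false // assumption: maps have no defined ordering
  }

  // Hypothetical validation helper: rejects pivot columns that cannot be compared.
  def checkPivotColumn(name: String, dt: DataType): Either[String, Unit] =
    if (isOrderable(dt)) Right(())
    else Left(s"Invalid pivot column '$name'. Pivot columns must be comparable.")
}
```

Under these assumptions, a struct of two integers passes the check, while any type that contains a map is rejected, which mirrors the rule already applied to group-by columns.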
LGTM. Thanks! Merged to master.
What changes were proposed in this pull request?
When the pivot column is of a complex type, the eval() result is an UnsafeRow, while the keys of the HashMap used for column value matching are GenericInternalRow instances. The two representations never compare equal, so no key ever matches and the result is always empty.
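The mismatch can be reproduced outside Spark with a small sketch. `GenericRow` and `BinaryRow` below are hypothetical stand-ins for the two row representations (they are not Spark classes); each one only considers itself equal to instances of its own class, which is the assumption that makes the hash lookup fail. A TreeMap keyed by a field-wise ordering ignores the concrete class and finds the entry.

```scala
import scala.collection.immutable.{HashMap, TreeMap}

object PivotLookupSketch {
  // Hypothetical stand-ins for two physical encodings of the same logical struct value.
  trait Row { def values: Seq[Int] }

  final class GenericRow(val values: Seq[Int]) extends Row {
    override def equals(other: Any): Boolean = other match {
      case g: GenericRow => g.values == values
      case _             => false
    }
    override def hashCode: Int = values.hashCode
  }

  final class BinaryRow(val values: Seq[Int]) extends Row {
    override def equals(other: Any): Boolean = other match {
      case b: BinaryRow => b.values == values
      case _            => false
    }
    override def hashCode: Int = values.hashCode * 31 + 7
  }

  // Field-by-field comparison that does not depend on the concrete row class.
  val byFields: Ordering[Row] = new Ordering[Row] {
    def compare(a: Row, b: Row): Int =
      a.values.zip(b.values)
        .map { case (x, y) => Integer.compare(x, y) }
        .find(_ != 0)
        .getOrElse(Integer.compare(a.values.length, b.values.length))
  }

  // Pivot values are stored in one representation...
  val keys: Seq[Row] = Seq(new GenericRow(Seq(1, 2)), new GenericRow(Seq(3, 4)))
  val hashIndex: Map[Row, Int] = HashMap(keys.zipWithIndex: _*)
  val treeIndex: Map[Row, Int] = TreeMap(keys.zipWithIndex: _*)(byFields)

  // ...but the evaluated pivot column arrives in the other one.
  val probe: Row = new BinaryRow(Seq(3, 4))

  def main(args: Array[String]): Unit = {
    println(hashIndex.get(probe)) // None: equals/hashCode differ across classes
    println(treeIndex.get(probe)) // Some(1): the ordering compares field values only
  }
}
```

The ordering-based lookup costs O(log n) per probe instead of the expected O(1) of a hash lookup, which is why it makes sense to reserve it for complex-typed pivot columns.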
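So for a pivot column of a complex type, we should:
1. Throw an AnalysisException if the type is not comparable (orderable); such a column cannot be a pivot column.
2. Otherwise, on the PivotFirst code path, PivotFirst should use a TreeMap instead of a HashMap for such columns.
This PR has also reverted the workaround in Analyzer that had been introduced to avoid this PivotFirst issue.
How was this patch tested?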
Added unit tests.