[SPARK-27340][SS][TESTS][FOLLOW-UP] Rephrase API comments and simplify tests #28390
Test build #121987 has finished for PR 28390 at commit
gatorsmile left a comment
As @HeartSaVioR pointed out, this PR is actually changing the behaviors in metadata propagation for these public Column APIs. Let us add it to the migration guide?
docs/sql-migration-guide.md
Outdated
- In Spark 3.1, SQL UI data adopts the `formatted` mode for the query plan explain results. To restore the behavior before Spark 3.0, you can set `spark.sql.ui.explainMode` to `extended`.
- In Spark 3.1, the column metadata will always be propagated in the API `name` and `as`. In Spark version 3.0 and earlier, the metadata of `NamedExpression` is set as the `explicitMetadata` for the new column. To restore the behavior before Spark 3.0, you can use the API `as(alias: String, metadata: Metadata)` with explicit metadata.
I think the patch has been merged to 3.0?
"in the API `name` and `as`" -> "in the API `Column#name` and `Column#as`"
> In Spark version 3.0 and earlier, the metadata of `NamedExpression` is set as the `explicitMetadata` for the new column.
I think the old behavior is using the metadata of NamedExpression at the time the API was called. The metadata won't change even if the underlying NamedExpression changes metadata.
Thanks, let me emphasize the usage of explicitMetadata.
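For reference, a minimal sketch of the `as(alias: String, metadata: Metadata)` route mentioned in the migration note above. It assumes a local `SparkSession`; the app name and the metadata key and value are made up for illustration and do not come from this PR.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col
import org.apache.spark.sql.types.{Metadata, MetadataBuilder}

// Local session for illustration only (assumption: Spark is on the classpath).
val spark = SparkSession.builder()
  .master("local[1]")
  .appName("explicit-metadata-sketch") // hypothetical name
  .getOrCreate()

// Hypothetical metadata; the key and value are invented for this example.
val pinned: Metadata = new MetadataBuilder()
  .putString("comment", "pinned by caller")
  .build()

// Column#as(alias, metadata) attaches exactly the metadata passed in,
// so the result does not depend on what the underlying expression carries.
val df = spark.range(1).select(col("id").as("renamed", pinned))

// Read the metadata back from the resulting schema.
val observed: Metadata = df.schema("renamed").metadata
println(observed.json)

spark.stop()
```

Passing the metadata explicitly keeps it fixed at call time, which is the behavior the note describes as the way to opt out of propagation.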
Test build #122067 has finished for PR 28390 at commit
Test build #122098 has finished for PR 28390 at commit
thanks, merging to master/3.0!
…y tests

### What changes were proposed in this pull request?

- Rephrase the API doc for `Column.as`
- Simplify the UTs

### Why are the changes needed?

Address comments in #28326

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

New UT added.

Closes #28390 from xuanyuanking/SPARK-27340-follow.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 7195a18)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
Thanks for the review.
Thank you all.
What changes were proposed in this pull request?
- Rephrase the API doc for `Column.as`
- Simplify the UTs
Why are the changes needed?
Address comments in #28326
Does this PR introduce any user-facing change?
No
How was this patch tested?
New UT added.