
Conversation

@xuanyuanking xuanyuanking commented Apr 28, 2020

What changes were proposed in this pull request?

  • Rephrase the API doc for Column.as
  • Simplify the UTs

Why are the changes needed?

Address comments in #28326

Does this PR introduce any user-facing change?

No

How was this patch tested?

New UT added.

SparkQA commented Apr 28, 2020

Test build #121987 has finished for PR 28390 at commit 6e98444.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@gatorsmile gatorsmile left a comment


As @HeartSaVioR pointed out, this PR actually changes the behavior of metadata propagation for these public Column APIs. Let us add it to the migration guide?


- In Spark 3.1, SQL UI data adopts the `formatted` mode for the query plan explain results. To restore the behavior before Spark 3.0, you can set `spark.sql.ui.explainMode` to `extended`.

- In Spark 3.1, the column metadata will always be propagated in the API `name` and `as`. In Spark version 3.0 and earlier, the metadata of `NamedExpression` is set as the `explicitMetadata` for the new column. To restore the behavior before Spark 3.0, you can use the API `as(alias: String, metadata: Metadata)` with explicit metadata.
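The workaround in the proposed migration-guide entry above can be sketched as follows. This is an illustrative example, not code from the PR: it uses the public `Column.as(alias: String, metadata: Metadata)` overload to pin explicit metadata on a renamed column, so the result does not depend on whatever metadata the underlying `NamedExpression` carries.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.types.MetadataBuilder

object ExplicitMetadataSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[1]")
      .appName("explicit-metadata-sketch")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b")).toDF("id", "label")

    // Build metadata explicitly; passing it to `as` sets it as the
    // explicitMetadata of the new column, independent of propagation.
    val md = new MetadataBuilder().putString("comment", "stable metadata").build()
    val renamed = df.select($"label".as("tag", md))

    // The pinned metadata is visible in the schema of the result.
    println(renamed.schema("tag").metadata.json)

    spark.stop()
  }
}
```

With the explicit-metadata overload, the behavior is the same before and after this change, which is why the migration guide recommends it for restoring the old semantics.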
Contributor


I think the patch has been merged to 3.0?

Contributor


`in the API name and as` -> `in the API Column#name and Column#as`

Contributor


"In Spark version 3.0 and earlier, the metadata of `NamedExpression` is set as the `explicitMetadata` for the new column."

I think the old behavior uses the metadata of the `NamedExpression` at the time the API is called; the metadata won't change even if the underlying `NamedExpression`'s metadata changes later.

Member Author


Thanks, let me emphasize the usage of `explicitMetadata`.

SparkQA commented Apr 29, 2020

Test build #122067 has finished for PR 28390 at commit 1101d05.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Apr 30, 2020

Test build #122098 has finished for PR 28390 at commit d9baf7a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

thanks, merging to master/3.0!

@cloud-fan cloud-fan closed this in 7195a18 Apr 30, 2020
cloud-fan pushed a commit that referenced this pull request Apr 30, 2020
…y tests

### What changes were proposed in this pull request?

- Rephrase the API doc for `Column.as`
- Simplify the UTs

### Why are the changes needed?
Address comments in #28326

### Does this PR introduce any user-facing change?
No

### How was this patch tested?
New UT added.

Closes #28390 from xuanyuanking/SPARK-27340-follow.

Authored-by: Yuanjian Li <xyliyuanjian@gmail.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
(cherry picked from commit 7195a18)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@xuanyuanking
Member Author

Thanks for the review.

@xuanyuanking xuanyuanking deleted the SPARK-27340-follow branch April 30, 2020 07:04
@dongjoon-hyun
Member

Thank you all.
