Skip to content

Conversation

@aokolnychyi
Copy link
Contributor

@aokolnychyi aokolnychyi commented Dec 24, 2020

What changes were proposed in this pull request?

This PR adds a way to inject data source rewrite rules to branch-3.1 via backporting two JIRA issues.

  • [SPARK-33621][SQL] Add a way to inject data source rewrite rules
  • [SPARK-33784][SQL] Rename dataSourceRewriteRules batch

Why are the changes needed?

Right now SparkSessionExtensions allow us to inject optimization rules but they are added to operator optimization batch. There are cases when users need to run rules after the operator optimization batch (e.g. cases when a rule relies on the fact that expressions have been optimized). Currently, this is not possible.

Does this PR introduce any user-facing change?

Yes.

How was this patch tested?

This PR comes with a new test.

This PR tries to rename `dataSourceRewriteRules` into something more generic.

These changes are needed to address the post-review discussion [here](apache#30558 (comment)).

Yes but the changes haven't been released yet.

Existing tests.

Closes apache#30808 from aokolnychyi/spark-33784.

Authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@github-actions github-actions bot added the SQL label Dec 24, 2020
@aokolnychyi
Copy link
Contributor Author

aokolnychyi commented Dec 24, 2020

@cloud-fan @dongjoon-hyun, here is the backport of PR #30808.

@SparkQA
Copy link

SparkQA commented Dec 24, 2020

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37944/

@SparkQA
Copy link

SparkQA commented Dec 24, 2020

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/37944/

@SparkQA
Copy link

SparkQA commented Dec 24, 2020

Test build #133353 has finished for PR 30917 at commit fcf283a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dongjoon-hyun dongjoon-hyun changed the title [SPARK-33784][SQL] Rename dataSourceRewriteRules batch [SPARK-33784][SQL][3.1] Rename dataSourceRewriteRules batch Dec 24, 2020
@dongjoon-hyun dongjoon-hyun changed the title [SPARK-33784][SQL][3.1] Rename dataSourceRewriteRules batch [SPARK-33621][SPARK-33784][SQL][3.1] Add a way to inject data source rewrite rules Dec 24, 2020
Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, @aokolnychyi and all reviewers.

Please note that this includes two JIRAs, but it's technically one PR and its follow-up. Technically, SPARK-33784 is a blocker for Apache Spark 3.1.0.

  • [SPARK-33621][SQL] Add a way to inject data source rewrite rules
  • [SPARK-33784][SQL] Rename dataSourceRewriteRules batch

I revised the PR title and PR description according to the PR content.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM.
Merged to branch-3.1 for Apache Spark 3.1.0.

Thank you, @aokolnychyi . Merry Christmas and Happy New Year!

dongjoon-hyun pushed a commit that referenced this pull request Dec 24, 2020
…rewrite rules

### What changes were proposed in this pull request?

This PR adds a way to inject data source rewrite rules to branch-3.1 via backporting two JIRA issues.

- [SPARK-33621][SQL] Add a way to inject data source rewrite rules
- [SPARK-33784][SQL] Rename dataSourceRewriteRules batch

### Why are the changes needed?

Right now `SparkSessionExtensions` allow us to inject optimization rules but they are added to operator optimization batch. There are cases when users need to run rules after the operator optimization batch (e.g. cases when a rule relies on the fact that expressions have been optimized). Currently, this is not possible.

### Does this PR introduce _any_ user-facing change?

Yes.

### How was this patch tested?

This PR comes with a new test.

Closes #30917 from aokolnychyi/backport-spark-33784.

Authored-by: Anton Okolnychyi <aokolnychyi@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think this is fine. LGTM

@aokolnychyi
Copy link
Contributor Author

Thanks @HyukjinKwon @dongjoon-hyun! I wasn't sure what is the normal procedure for such cherry-picks in Spark so thanks for adapting the PR description, @dongjoon-hyun !

Happy holidays!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants