@yeshengm
Contributor

@yeshengm yeshengm commented Jun 25, 2019

What changes were proposed in this pull request?

This PR makes the predicate pushdown logic in the Catalyst optimizer more efficient by unifying two existing rules, PushDownPredicate and PushPredicateThroughJoin. Previously, pushing down a predicate for queries such as Filter(Join(Join(Join))) required n steps. This patch essentially reduces this to a single pass.

To make this actually work, we need to unify a few rules such as CombineFilters, PushDownPredicate and PushPredicateThroughJoin. Otherwise, cases such as Filter(Join(Filter(Join))) still require several passes to fully push down predicates. This unification is done by composing several partial functions, which keeps the code change minimal and lets us reuse existing UTs.
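To illustrate the idea of composing rules as partial functions, here is a minimal, self-contained sketch on a toy plan type. It is not the code in this PR and does not use Spark's classes: `Plan`, `Pred`, `transformDown`, `toFixedPoint` and both rules are hypothetical stand-ins, and every predicate is assumed to reference exactly one relation. The iteration counts are properties of this toy only, not measurements of Catalyst.

```scala
object OnePassPushdownSketch {
  // Toy logical plan (an assumption for illustration, not Spark's LogicalPlan).
  sealed trait Plan
  final case class Relation(name: String) extends Plan
  final case class Join(left: Plan, right: Plan) extends Plan
  // Simplification: each predicate references exactly one relation.
  final case class Pred(relation: String, expr: String)
  final case class Filter(preds: List[Pred], child: Plan) extends Plan

  type Rule = PartialFunction[Plan, Plan]

  // Names of all relations reachable below a plan node.
  def relations(p: Plan): Set[String] = p match {
    case Relation(n)      => Set(n)
    case Join(l, r)       => relations(l) ++ relations(r)
    case Filter(_, child) => relations(child)
  }

  // Merge two adjacent filters into one.
  val combineFilters: Rule = {
    case Filter(outer, Filter(inner, child)) => Filter(outer ++ inner, child)
  }

  private def wrap(preds: List[Pred], child: Plan): Plan =
    if (preds.isEmpty) child else Filter(preds, child)

  // Push each predicate to the join side that contains its relation.
  // The guard ensures the rule only fires when it can make progress.
  val pushThroughJoin: Rule = {
    case Filter(preds, Join(l, r))
        if preds.exists(p => relations(l).contains(p.relation) ||
                             relations(r).contains(p.relation)) =>
      val (toLeft, rest)  = preds.partition(p => relations(l).contains(p.relation))
      val (toRight, stay) = rest.partition(p => relations(r).contains(p.relation))
      wrap(stay, Join(wrap(toLeft, l), wrap(toRight, r)))
  }

  // Top-down traversal: apply the rule once at a node, then visit the
  // children of whatever the rule produced.
  def transformDown(plan: Plan)(rule: Rule): Plan =
    rule.applyOrElse(plan, (p: Plan) => p) match {
      case Join(l, r)        => Join(transformDown(l)(rule), transformDown(r)(rule))
      case Filter(ps, child) => Filter(ps, transformDown(child)(rule))
      case leaf              => leaf
    }

  // Repeat `step` until the plan stops changing; also return how many
  // iterations actually changed the plan.
  def toFixedPoint(start: Plan)(step: Plan => Plan): (Plan, Int) = {
    var plan = start
    var changed = 0
    var next = step(plan)
    while (next != plan) {
      changed += 1
      plan = next
      next = step(plan)
    }
    (plan, changed)
  }

  val a = Relation("a"); val b = Relation("b"); val c = Relation("c")

  // Shape: Filter(Join(Filter(Join))).
  val query: Plan =
    Filter(List(Pred("a", "a.x > 1")),
      Join(Filter(List(Pred("a", "a.y < 5")), Join(a, b)), c))

  // Separate rules: each one gets its own full traversal per iteration.
  val (separatePlan, separateIters) = toFixedPoint(query) { p =>
    transformDown(transformDown(p)(combineFilters))(pushThroughJoin)
  }

  // Unified rule: one traversal tries both rules at every node.
  val unified: Rule = combineFilters orElse pushThroughJoin
  val (unifiedPlan, unifiedIters) =
    toFixedPoint(query)(p => transformDown(p)(unified))

  def main(args: Array[String]): Unit = {
    println(s"separate: $separateIters iterations, unified: $unifiedIters iterations")
    println(unifiedPlan)
  }
}
```

In this toy, both strategies reach the same final plan, but the composed rule needs fewer iterations because a filter produced by one rule can be consumed by the other within the same traversal.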

Results show that this optimization can improve Catalyst optimization time by 16.5%. For queries with more joins, the improvement is even larger. E.g., for TPC-DS q64, the performance boost is 49.2%.

How was this patch tested?

Existing UTs + a new UT for the new rule.

@SparkQA

SparkQA commented Jun 25, 2019

Test build #106853 has finished for PR 24956 at commit 9b6fe2e.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 25, 2019

Test build #106904 has finished for PR 24956 at commit d8b63e2.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@yeshengm yeshengm changed the title [WIP][SPARK-27815][SQL] Predicate pushdown in one pass for cascading joins [SPARK-27815][SQL] Predicate pushdown in one pass for cascading joins Jun 26, 2019
@SparkQA

SparkQA commented Jun 26, 2019

Test build #106946 has finished for PR 24956 at commit b1eadb4.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jun 27, 2019

Test build #106948 has finished for PR 24956 at commit f46c5c6.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

CollapseRepartition,
CollapseProject,
CollapseWindow,
CombineFilters,
Member

Let us keep this.

@gatorsmile
Member

LGTM except one comment

@SparkQA

SparkQA commented Jul 3, 2019

Test build #107136 has finished for PR 24956 at commit dda6df3.

  • This patch fails SparkR unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

retest this please

@SparkQA

SparkQA commented Jul 3, 2019

Test build #107151 has finished for PR 24956 at commit dda6df3.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

retest this please

@yeshengm
Contributor Author

yeshengm commented Jul 3, 2019

retest this please

@SparkQA

SparkQA commented Jul 3, 2019

Test build #107170 has finished for PR 24956 at commit dda6df3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@gatorsmile
Member

Thanks! Merged to master.

@gatorsmile gatorsmile closed this in 74f1176 Jul 3, 2019
@gatorsmile gatorsmile changed the title [SPARK-27815][SQL] Predicate pushdown in one pass for cascading joins [SPARK-28155][SQL] Predicate pushdown in one pass for cascading joins Jul 3, 2019
@dongjoon-hyun
Member

dongjoon-hyun commented Jul 20, 2019

What about updating the JIRA instead, since the commit (74f1176) is permanent? @gatorsmile and @cloud-fan.
We can switch both JIRAs.

@HyukjinKwon HyukjinKwon changed the title [SPARK-28155][SQL] Predicate pushdown in one pass for cascading joins [SPARK-27815][SQL] Predicate pushdown in one pass for cascading joins Jul 20, 2019
@HyukjinKwon
Member

K. Let me do that.

@dongjoon-hyun
Member

Thanks!
