@AngersZhuuuu AngersZhuuuu commented May 8, 2021

What changes were proposed in this pull request?

This is a follow-up to #31691. Push down limit through Window when the partitionSpec of all window functions is not empty and the same order is used.
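As a rough illustration of the plan shape this rewrite produces, here is a minimal sketch on a toy plan model. The case classes `Relation`, `Sort`, `Limit`, and `Window` and the object `LimitThroughWindowSketch` are hypothetical stand-ins written for this example, not the Catalyst classes, and the guard is simplified to "partitionSpec is not empty".

```scala
// Toy logical-plan model. These are hypothetical stand-ins, not Catalyst classes.
sealed trait Plan
case class Relation(name: String) extends Plan
case class Sort(order: Seq[String], global: Boolean, child: Plan) extends Plan
case class Limit(n: Int, child: Plan) extends Plan
case class Window(
    functions: Seq[String],
    partitionSpec: Seq[String],
    orderSpec: Seq[String],
    child: Plan) extends Plan

object LimitThroughWindowSketch {
  // Rewrites Limit(Window(child)) into Limit(Window(Limit(Sort(child)))).
  // The inner Sort orders by the partition keys first, then by the window order keys.
  def apply(plan: Plan): Plan = plan match {
    case Limit(n, w @ Window(_, partitionSpec, orderSpec, child)) if partitionSpec.nonEmpty =>
      Limit(n, w.copy(child = Limit(n, Sort(partitionSpec ++ orderSpec, true, child))))
    case other => other
  }
}
```

Applying `LimitThroughWindowSketch` once to `Limit(5, Window(Seq("row_number"), Seq("a"), Seq("b"), Relation("t")))` leaves the outer limit in place and adds a limit over a global sort on `a, b` below the window. A real optimizer rule would also need a guard so that it is not applied again to its own output.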

The original author is @leoluan2009. Since he has not replied for a long time, I am doing this follow-up after being invited to.

Why are the changes needed?

Improve query performance.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Added unit tests.

@github-actions github-actions bot added the SQL label May 8, 2021
@AngersZhuuuu AngersZhuuuu changed the title from "Spark 34775" to "[SPARK-34775][SQL] Push down limit through window when partitionSpec is not empty" May 8, 2021

SparkQA commented May 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42804/

SparkQA commented May 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42804/

SparkQA commented May 8, 2021

Test build #138283 has finished for PR 32475 at commit c4f8220.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented May 10, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42837/

SparkQA commented May 10, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/42837/

SparkQA commented May 10, 2021

Test build #138315 has finished for PR 32475 at commit bf9d041.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AngersZhuuuu
Contributor Author

ping @wangyum

SparkQA commented May 17, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43137/

SparkQA commented May 17, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/43137/

SparkQA commented May 17, 2021

Test build #138616 has finished for PR 32475 at commit 78dd97a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

  // Sort is needed here because we need global sort.
- window.copy(child = Limit(limitExpr, Sort(orderSpec, true, child)))
+ window.copy(child = Limit(limitExpr,
+   Sort(partitionSpec.map(SortOrder(_, Ascending)) ++ orderSpec, true, child)))
Member

@cloud-fan Do we need to change the sort order from NULLS FIRST to NULLS LAST? Impala has changed it.

Contributor

Is NULLS LAST faster?
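For readers following this thread, here is a small sketch of what the two null orderings mean for an ascending sort followed by a limit. It uses plain Scala collections with `Option[Int]` standing in for a nullable column. The object name `NullOrderingSketch` and the sample values are made up for illustration, and the sketch says nothing about which ordering is faster.

```scala
object NullOrderingSketch {
  // None stands in for SQL NULL in a nullable integer column.
  val keys: Seq[Option[Int]] = Seq(Some(3), None, Some(1), None, Some(2))

  // Ascending, NULLS FIRST: Scala's built-in Option ordering already puts None first.
  val nullsFirst: Ordering[Option[Int]] = Ordering.Option[Int]

  // Ascending, NULLS LAST: order by "is null" first (false < true), then by the value.
  val nullsLast: Ordering[Option[Int]] =
    Ordering.by[Option[Int], (Boolean, Int)](k => (k.isEmpty, k.getOrElse(0)))

  // Sort followed by a limit, as in the plan shape discussed in this thread.
  def sortAndLimit(ord: Ordering[Option[Int]], n: Int): Seq[Option[Int]] =
    keys.sorted(ord).take(n)
}
```

With a limit of 3, the NULLS FIRST ordering keeps both null keys plus the smallest non-null key, while NULLS LAST keeps only the three non-null keys, so the choice changes which rows survive the limit.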

  // Adding an extra Limit below WINDOW when the partitionSpec of all window functions is empty.
  case LocalLimit(limitExpr @ IntegerLiteral(limit),
-     window @ Window(windowExpressions, Nil, orderSpec, child))
+     window @ Window(windowExpressions, partitionSpec, orderSpec, child))
Contributor

Can you provide a bit more context? Why can we push down a limit through a window when there are partition specs?
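One way to reason about this question is a small simulation on plain Scala collections. This is a sketch under stated assumptions, not the rule's actual logic: `Row`, `WindowLimitCheck`, and the sample data are hypothetical, rows are assumed to be distinct, and only `row_number()` and a whole-partition count are modelled. After a global sort by (partition key, order key), a limit keeps whole partitions plus at most one truncated partition, and truncation only drops the tail of that partition. A function that looks only at preceding rows, such as `row_number()`, is therefore unchanged for the kept rows, while a function that reads the whole partition is not.

```scala
// Hypothetical row type for illustration: partition key `p` and order key `o`.
case class Row(p: Int, o: Int)

object WindowLimitCheck {
  // row_number() OVER (PARTITION BY p ORDER BY o), keyed by row (rows assumed distinct).
  def rowNumber(rows: Seq[Row]): Map[Row, Int] =
    rows.groupBy(_.p).values.flatMap { part =>
      part.sortBy(_.o).zipWithIndex.map { case (r, i) => r -> (i + 1) }
    }.toMap

  // count(*) OVER (PARTITION BY p): depends on every row in the partition.
  def partitionCount(rows: Seq[Row]): Map[Row, Int] =
    rows.groupBy(_.p).values.flatMap(part => part.map(r => r -> part.size)).toMap

  // Global sort by (p, o) followed by a limit: the shape pushed below the Window.
  def sortAndLimit(rows: Seq[Row], n: Int): Seq[Row] =
    rows.sortBy(r => (r.p, r.o)).take(n)

  // True if evaluating `f` after the pushed-down limit matches evaluating it on all rows,
  // for every row that survives the limit.
  def agrees(f: Seq[Row] => Map[Row, Int], rows: Seq[Row], n: Int): Boolean = {
    val kept = sortAndLimit(rows, n)
    val full = f(rows)
    val pushed = f(kept)
    kept.forall(r => pushed(r) == full(r))
  }

  val sample: Seq[Row] = Seq(Row(2, 5), Row(1, 9), Row(1, 3), Row(2, 1), Row(3, 7), Row(1, 4))
}
```

On `sample` with a limit of 4, the kept rows are all of partition 1 plus the first row of partition 2. `agrees(rowNumber, sample, 4)` holds, but `agrees(partitionCount, sample, 4)` does not, because the truncated partition 2 now reports a count of 1 instead of 2. This suggests the pushdown is only safe for window functions and frames that depend on preceding rows alone.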

SparkQA commented Aug 16, 2021

Test build #142483 has finished for PR 32475 at commit 78dd97a.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 16, 2021

Test build #142489 has finished for PR 32475 at commit 78dd97a.

  • This patch fails Spark unit tests.
  • This patch does not merge cleanly.
  • This patch adds no public classes.

SparkQA commented Aug 16, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46994/

SparkQA commented Aug 16, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/46994/

@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Nov 25, 2021
@github-actions github-actions bot closed this Nov 26, 2021