@github-actions
Contributor

Cherry-picked from #58964

…nsumer (#58964)

Related PR: #21412

Problem Summary:

This pull request improves the handling of distribution properties
(specifically "must shuffle") for `PhysicalProject` and `PhysicalFilter`
nodes in the query planner, and adds comprehensive unit tests to ensure
correctness. The main logic ensures that when certain child nodes
require shuffling, the planner correctly adjusts the distribution
requirements, especially in the presence of `Project`, `Filter`, and
`Limit` nodes.

Key changes include:

**Distribution Property Handling Enhancements:**

* Added logic in `ChildrenPropertiesRegulator` to check if a child node
under a `PhysicalProject` or `PhysicalFilter` requires a "must shuffle"
distribution, and to adjust the children’s properties accordingly. This
is done via the new `mustShuffleUnderProjectOrFilter` method.
* Included `PhysicalLimit` in the set of nodes that can trigger a
shuffle requirement, by updating imports and logic.
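
The check described in the bullets above can be sketched on a toy plan tree. This is not the Doris implementation: `Node`, `Kind`, and the boolean `mustShuffle` flag are hypothetical stand-ins, and only the method name `mustShuffleUnderProjectOrFilter` comes from this description. The sketch assumes the check skips through stacked single-child `Project`/`Filter` nodes and inspects the first node beneath them, which may be a `Limit` or a CTE consumer.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: none of these types are the Doris planner classes.
class MustShuffleSketch {
    // Hypothetical node kinds standing in for physical plan operators.
    enum Kind { PROJECT, FILTER, LIMIT, CTE_CONSUMER, SCAN }

    static final class Node {
        final Kind kind;
        // Assumption: a plain flag stands in for a "must shuffle" output distribution.
        final boolean mustShuffle;
        final List<Node> children;

        Node(Kind kind, boolean mustShuffle, Node... children) {
            this.kind = kind;
            this.mustShuffle = mustShuffle;
            this.children = Arrays.asList(children);
        }
    }

    // Skip through any stack of single-child Project/Filter nodes and report
    // whether the first node beneath them is marked "must shuffle".
    static boolean mustShuffleUnderProjectOrFilter(Node node) {
        Node current = node;
        while ((current.kind == Kind.PROJECT || current.kind == Kind.FILTER)
                && current.children.size() == 1) {
            current = current.children.get(0);
        }
        return current.mustShuffle;
    }

    public static void main(String[] args) {
        // Two Projects stacked on a CTE consumer that is marked "must shuffle".
        Node consumer = new Node(Kind.CTE_CONSUMER, true);
        Node stacked = new Node(Kind.PROJECT, false,
                new Node(Kind.PROJECT, false, consumer));
        // A Filter over a plain scan that carries no such mark.
        Node overScan = new Node(Kind.FILTER, false, new Node(Kind.SCAN, false));

        System.out.println(mustShuffleUnderProjectOrFilter(stacked));   // true
        System.out.println(mustShuffleUnderProjectOrFilter(overScan));  // false
    }
}
```

The loop, rather than a single-level check, is the point of the sketch: a check that looks only one level down would miss the mark when more than one `Project` sits above the node that carries it.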

**Testing Improvements:**

* Added a new test class `ChildrenPropertiesRegulatorTest.java` with
detailed unit tests for the handling of "must shuffle" properties under
`Project`, `Filter`, and `Limit` nodes. These tests use mocks to
simulate various plan trees and assert correct distribution
specification propagation.

**Regression Test Coverage:**

* Added a new regression test in `cte.groovy` to verify correct behavior
when multiple `Project` nodes are present on a CTE consumer, ensuring
the planner handles such cases as expected.

Together, these changes let the planner apply distribution requirements
correctly in plan trees that stack `Project`, `Filter`, and `Limit`
nodes, and the new unit and regression tests cover those cases.
@github-actions github-actions bot requested a review from yiguolei as a code owner December 15, 2025 10:26
@hello-stephen
Contributor

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?

@dataroaring dataroaring reopened this Dec 15, 2025
@hello-stephen
Contributor

run buildall

@github-actions github-actions bot added the approved label Dec 17, 2025
@github-actions
Contributor Author

PR approved by at least one committer and no changes requested.

@github-actions
Contributor Author

PR approved by anyone and no changes requested.

@yiguolei yiguolei merged commit 99afe88 into branch-4.0 Dec 17, 2025
24 of 26 checks passed
@github-actions github-actions bot deleted the auto-pick-58964-branch-4.0 branch December 17, 2025 08:13

Labels

approved, reviewed

5 participants