@zy-kkk
Member

@zy-kkk zy-kkk commented Sep 24, 2024

In the previous FileScanNode, some of the logic that uses conjuncts for predicate conversion ran in the init phase. For the Nereids planner, however, the filter is pushed down to the scan in the Translator, which means the ScanNode only has the complete set of conjuncts in the finalize phase. Therefore, in this PR, I have removed all conjuncts variables in External for the Nereids planner. They no longer need to store conjuncts themselves or add them to the ScanNode. Instead, every place in the ScanNode that uses conjuncts is moved to the finalize phase.
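
A minimal sketch of why the phase matters, under simplifying assumptions: `ToyScanNode`, `convertPredicates`, and the sample predicate are hypothetical stand-ins invented for illustration and are not the Doris FileScanNode API. The sketch only shows that a conversion run before the filter is pushed down sees no conjuncts, while the same conversion run afterwards sees all of them.

```java
import java.util.ArrayList;
import java.util.List;

public class DeferredConjunctsSketch {

    /** Hypothetical stand-in for a scan node; this is not the Doris FileScanNode API. */
    static class ToyScanNode {
        private final List<String> conjuncts = new ArrayList<>();
        private final List<String> convertedPredicates = new ArrayList<>();

        void addConjunct(String conjunct) {
            conjuncts.add(conjunct);
        }

        // Converts whatever conjuncts are attached at the moment of the call.
        void convertPredicates() {
            convertedPredicates.clear();
            convertedPredicates.addAll(conjuncts);
        }

        List<String> convertedPredicates() {
            return convertedPredicates;
        }
    }

    // Conversion in the init phase: it runs before the filter is pushed down.
    static List<String> convertInInit() {
        ToyScanNode scan = new ToyScanNode();
        scan.convertPredicates();              // init phase
        scan.addConjunct("dt = '2024-09-24'"); // filter pushed down later
        return scan.convertedPredicates();
    }

    // Conversion in the finalize phase: every conjunct is already attached.
    static List<String> convertInFinalize() {
        ToyScanNode scan = new ToyScanNode();
        scan.addConjunct("dt = '2024-09-24'"); // filter pushed down first
        scan.convertPredicates();              // finalize phase
        return scan.convertedPredicates();
    }

    public static void main(String[] args) {
        System.out.println("init:     " + convertInInit());
        System.out.println("finalize: " + convertInFinalize());
    }
}
```

In this toy model the init-phase variant converts an empty list, which mirrors the incomplete conjuncts described above.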

This refactor also fixes a performance issue introduced by #40176.
After that change started generating a SelectNode for consecutive projects or filters, FileScan still added conjuncts too early, in the init phase. When the upper layer continued translating, it found consecutive filters and unexpectedly generated a SelectNode on top of the ScanNode, so the Project could no longer prune the ScanNode's columns. Because the Project node trims the columns of a SelectNode and a ScanNode differently, the ScanNode ended up scanning unnecessary columns.

My modification no longer adds conjuncts in the ScanNode step, so the plan keeps the ScanNode-to-Project structure and columns are trimmed correctly.
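
The column-trimming effect can be illustrated with a toy plan, as a sketch only. The classes `ToyScan` and `ToySelect`, the `applyProject` helper, and the rule that a project prunes scan columns only when it sits directly on a scan are all simplifying assumptions made for this example; the real translator logic is more involved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnPruningSketch {

    /** Hypothetical plan node marker; not a Doris class. */
    interface ToyPlanNode {}

    static class ToyScan implements ToyPlanNode {
        final List<String> scannedColumns;

        ToyScan(List<String> columns) {
            this.scannedColumns = new ArrayList<>(columns);
        }
    }

    static class ToySelect implements ToyPlanNode {
        final ToyPlanNode child;

        ToySelect(ToyPlanNode child) {
            this.child = child;
        }
    }

    // Assumed rule: the project prunes scan columns only when its child is a scan.
    static void applyProject(ToyPlanNode child, List<String> projected) {
        if (child instanceof ToyScan) {
            ((ToyScan) child).scannedColumns.retainAll(projected);
        }
        // With a ToySelect child, the scan underneath is left untouched.
    }

    // Builds Project -> (Select ->) Scan and returns the columns the scan still reads.
    static List<String> scanColumns(boolean selectBetween) {
        ToyScan scan = new ToyScan(Arrays.asList("id", "name", "payload"));
        ToyPlanNode child = selectBetween ? new ToySelect(scan) : scan;
        applyProject(child, Arrays.asList("id"));
        return scan.scannedColumns;
    }

    public static void main(String[] args) {
        System.out.println("Project -> Scan:           " + scanColumns(false));
        System.out.println("Project -> Select -> Scan: " + scanColumns(true));
    }
}
```

With the extra select node in between, the toy scan keeps all three columns; with the project directly on the scan, only the projected column remains.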

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zy-kkk
Member Author

zy-kkk commented Sep 24, 2024

run buildall

@Override
public void init(Analyzer analyzer) throws UserException {
    super.init(analyzer);
    buildQuery();
Contributor

Do not modify the legacy planner.

@zy-kkk zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 897bda4 to db088ff Compare September 25, 2024 14:58
@zy-kkk
Member Author

zy-kkk commented Sep 25, 2024

run buildall

@zy-kkk zy-kkk force-pushed the del_useless_conjuncts_for_external branch from db088ff to 7e36a14 Compare September 26, 2024 11:25
@zy-kkk
Member Author

zy-kkk commented Sep 26, 2024

run buildall

@zy-kkk zy-kkk force-pushed the del_useless_conjuncts_for_external branch 2 times, most recently from 6d91294 to 7bec2a1 Compare September 27, 2024 08:00
@morningman morningman force-pushed the del_useless_conjuncts_for_external branch from 7bec2a1 to 34701b2 Compare October 10, 2024 14:25
@morningman
Contributor

run buildall

morningman previously approved these changes Oct 10, 2024
Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 10, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

@github-actions
Contributor

PR approved by anyone and no changes requested.

@zy-kkk zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 34701b2 to 1f6d21b Compare October 11, 2024 13:12
@zy-kkk
Member Author

zy-kkk commented Oct 11, 2024

run buildall

@github-actions github-actions bot removed the approved Indicates a PR has been approved by one committer. label Oct 12, 2024
@zy-kkk
Member Author

zy-kkk commented Oct 12, 2024

run buildall

@zy-kkk
Member Author

zy-kkk commented Oct 14, 2024

run buildall

@zy-kkk
Member Author

zy-kkk commented Oct 14, 2024

run buildall

@zy-kkk zy-kkk force-pushed the del_useless_conjuncts_for_external branch from 81ebfa3 to 0828a41 Compare October 15, 2024 06:39
@zy-kkk
Member Author

zy-kkk commented Oct 15, 2024

run buildall

@zy-kkk
Member Author

zy-kkk commented Oct 16, 2024

run buildall

Contributor

@morningman morningman left a comment

LGTM

@github-actions github-actions bot added the approved Indicates a PR has been approved by one committer. label Oct 16, 2024
@github-actions
Contributor

PR approved by at least one committer and no changes requested.

Contributor

@morrySnow morrySnow left a comment

do we have external table partition prune test case in regression test?

@zy-kkk
Member Author

zy-kkk commented Oct 16, 2024

> do we have external table partition prune test case in regression test?

Yes, such as: test_hive_partition, test_hive_default_partition

@morrySnow morrySnow merged commit 22aabb5 into apache:master Oct 17, 2024
@zy-kkk zy-kkk deleted the del_useless_conjuncts_for_external branch October 22, 2024 09:49
zy-kkk added a commit to zy-kkk/doris that referenced this pull request Oct 22, 2024
…apache#41218)

morningman pushed a commit that referenced this pull request Oct 22, 2024
… Scan (#42261)

pick (#41218)

zy-kkk added a commit to zy-kkk/doris that referenced this pull request Oct 31, 2024
…apache#41218)

morningman pushed a commit that referenced this pull request Oct 31, 2024
… Scan (#43018)

bp (#41218)

morningman pushed a commit that referenced this pull request Dec 6, 2024
)

### What problem does this PR solve?
Problem Summary:
In the previous PR #41218, some partition pruning logic was changed,
which caused the hudi partition pruning to fail. This PR is to fix this
problem.

### Release note

[fix](hudi) fix hudi partition prune issue
github-actions bot pushed a commit that referenced this pull request Dec 6, 2024
)

github-actions bot pushed a commit that referenced this pull request Dec 6, 2024
)

Labels: approved, dev/2.1.7-merged, dev/3.0.3-merged, p0_b, reviewed
