@zy-kkk
Member

@zy-kkk zy-kkk commented Oct 31, 2024

Backport of #41218.

In the previous FileScanNode, some of the predicate conversion that relies on conjuncts was done in the init phase. With the Nereids planner, however, the filter is pushed down to the scan in the Translator, so the ScanNode only has the complete set of conjuncts in the finalized phase. This PR therefore removes all conjuncts variables in External for the Nereids planner: they no longer need to store conjuncts themselves or add them to the ScanNode. Instead, every place in the ScanNode that uses conjuncts is moved to the finalized phase.
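The timing problem described above can be illustrated with a toy model. This is only a sketch, not Doris code: `ToyScanNode`, `init_phase_convert`, `finalize_phase_convert`, and `plan` are hypothetical names, and the predicates are plain strings. It assumes, as the description says, that the translator attaches filters to the scan after init has already run.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScanNode:
    """Hypothetical stand-in for a file scan node; not Doris's real class."""
    conjuncts: list = field(default_factory=list)
    pushed_predicates: list = field(default_factory=list)

    def init_phase_convert(self):
        # Simplified old behaviour: convert predicates during init,
        # before the translator has attached any filters.
        self.pushed_predicates = list(self.conjuncts)

    def add_conjuncts(self, predicates):
        # In this model the translator pushes filters down after init.
        self.conjuncts.extend(predicates)

    def finalize_phase_convert(self):
        # Simplified new behaviour: convert once the conjunct list is complete.
        self.pushed_predicates = list(self.conjuncts)


def plan(convert_in_init: bool) -> list:
    """Run the toy planning sequence and return the converted predicates."""
    node = ToyScanNode()
    if convert_in_init:
        node.init_phase_convert()
    node.add_conjuncts(["a > 1", "b = 'x'"])
    if not convert_in_init:
        node.finalize_phase_convert()
    return node.pushed_predicates
```

In this model, `plan(True)` converts an empty conjunct list, while `plan(False)` sees both predicates, which is the reason for moving conjunct-dependent work to the later phase.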

This refactor also fixes a performance issue introduced by #40176. That change generates a SelectNode for consecutive projects or filters, but FileScan still added conjuncts too early, in the init phase. As a result, when the upper layer continued translating, it detected consecutive filters and unexpectedly generated a SelectNode on top of the ScanNode, so the project could not prune the ScanNode's columns. Because the Project node trims columns of SelectNode and ScanNode differently, the ScanNode ended up scanning unnecessary columns.

My modification removes the addition of conjuncts in the ScanNode step, so that we keep the structure from ScanNode to Project and achieve correct column trimming.
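A rough way to picture the column-trimming regression is the toy model below. It is a heavily simplified sketch, not the actual translator logic: `ToyScan`, `ToySelect`, `translate_filter`, `translate_project`, and `scanned_columns_for` are hypothetical names, and the rule "a filter over a scan that already holds conjuncts becomes a separate select node" is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ToyScan:
    """Hypothetical scan node; scanned_columns is None until a project prunes it."""
    all_columns: list
    conjuncts: list = field(default_factory=list)
    scanned_columns: Optional[list] = None


@dataclass
class ToySelect:
    """Hypothetical select node sitting between a project and a scan."""
    child: ToyScan
    conjuncts: list


def translate_filter(scan: ToyScan, predicate: str):
    # Assumed rule: push the filter into the scan only if it has no conjuncts yet;
    # otherwise treat it as a consecutive filter and wrap the scan in a select node.
    if not scan.conjuncts:
        scan.conjuncts.append(predicate)
        return scan
    return ToySelect(child=scan, conjuncts=[predicate])


def translate_project(child, wanted: set) -> ToyScan:
    if isinstance(child, ToyScan):
        # Project directly over a scan: prune the scan to the wanted columns.
        child.scanned_columns = [c for c in child.all_columns if c in wanted]
        return child
    # Project over a select node: in this model the scan below is left unpruned.
    child.child.scanned_columns = list(child.child.all_columns)
    return child.child


def scanned_columns_for(early_conjuncts: bool) -> list:
    scan = ToyScan(all_columns=["a", "b", "c"])
    if early_conjuncts:
        scan.conjuncts.append("a > 1")  # conjunct added during init (old behaviour)
    node = translate_filter(scan, "a > 1")
    return translate_project(node, {"a"}).scanned_columns
```

With early conjuncts the toy scan reads all of `a`, `b`, `c`; without them the filter lands on the scan itself and the project prunes it down to `a`.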

@doris-robot

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR

Since 2024-03-18, the Document has been moved to doris-website.
See Doris Document.

@zy-kkk zy-kkk changed the title [opt](Catalog) Remove unnecessary conjuncts handling on External Scan [3.0][opt](Catalog) Remove unnecessary conjuncts handling on External Scan Oct 31, 2024
@zy-kkk
Member Author

zy-kkk commented Oct 31, 2024

run buildall

@morningman morningman merged commit 78f31d7 into apache:branch-3.0 Oct 31, 2024