Filter push down rule causes the wrong plan #2038

Closed
jackwener opened this issue Mar 19, 2022 · 0 comments · Fixed by #2039
Labels
bug Something isn't working

Describe the bug

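While adding the new optimizer rule combine_adjacent_filter, I found that the filter push down rule causes a wrong plan (see #2026).

The filter push down rule splits the combined filter expression back into separate filters.

This is a bug in the filter push down optimizer rule: in the output below, the first filter_push_down pass keeps a single Filter node, but the second pass splits it into two stacked Filter nodes.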

explain verbose select c1, c2 from test where c3 = true and c2 = 0.000001;
+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| plan_type                                             | plan                                                                                                                                |
+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+
| initial_logical_plan                                  | Projection: #test.c1, #test.c2                                                                                                      |
|                                                       |   Filter: #test.c3 = Boolean(true) AND #test.c2 = Float64(0.000001)                                                                 |
|                                                       |     TableScan: test projection=None                                                                                                 |
| logical_plan after simplify_expressions               | Projection: #test.c1, #test.c2                                                                                                      |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                      |
|                                                       |     TableScan: test projection=None                                                                                                 |
| logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after projection_push_down               | Projection: #test.c1, #test.c2                                                                                                      |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                      |
|                                                       |     TableScan: test projection=Some([0, 1, 2])                                                                                      |
| logical_plan after filter_push_down                   | Projection: #test.c1, #test.c2                                                                                                      |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001)                                                                                 |
|                                                       |     TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                    |
| logical_plan after limit_push_down                    | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after SingleDistinctAggregationToGroupBy | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after ToApproxPerc                       | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after simplify_expressions               | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after projection_push_down               | SAME TEXT AS ABOVE                                                                                                                  |
| logical_plan after filter_push_down                   | Projection: #test.c1, #test.c2                                                                                                      |
|                                                       |   Filter: #test.c3                                                                                                                  |
|                                                       |     Filter: #test.c2 = Float64(0.000001)                                                                                            |
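|                                                       |       TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                  |
+-------------------------------------------------------+-------------------------------------------------------------------------------------------------------------------------------------+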
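
To make the plan output above easier to follow, here is a minimal sketch of how a conjunctive predicate such as `#test.c3 AND #test.c2 = Float64(0.000001)` can be flattened into the individual conjuncts that appear in `filters=[...]`. The `Expr` enum, `split_conjunction`, and `example_predicate` are hypothetical toy definitions written for this illustration; they are not DataFusion's types or its actual rule implementation.

```rust
// Toy expression type for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Flatten a tree of ANDs into its conjuncts, left to right.
fn split_conjunction(expr: &Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(left, right) => {
            split_conjunction(left, out);
            split_conjunction(right, out);
        }
        // Anything that is not an AND is a single conjunct.
        other => out.push(other.clone()),
    }
}

/// Build the predicate from the query in this report: `c3 AND c2 = 0.000001`.
fn example_predicate() -> Expr {
    Expr::And(
        Box::new(Expr::Column("test.c3".to_string())),
        Box::new(Expr::Eq(
            Box::new(Expr::Column("test.c2".to_string())),
            Box::new(Expr::Literal("0.000001".to_string())),
        )),
    )
}
```

Splitting like this is reasonable when deciding which predicates can be handed to the table scan. The problem reported here is that the conjuncts left above the scan come back as two separate Filter nodes instead of one.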

To Reproduce

create external table test (
c1 float,
c2 double,
c3 boolean
)
stored as csv
with header row
location '<!!! YOUR PATH !!!>/datafusion/tests/aggregate_simple.csv';
explain select c1, c2 from test where c3 = true and c2 = 0.000001;
+---------------+-------------------------------------------------------------------------------------------------------------------------------------+
| plan_type     | plan                                                                                                                                |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------+
| logical_plan  | Projection: #test.c1, #test.c2                                                                                                      |
|               |   Filter: #test.c3                                                                                                                  |
|               |     Filter: #test.c2 = Float64(0.000001)                                                                                            |
|               |       TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                  |
| physical_plan | ProjectionExec: expr=[c1@0 as c1, c2@1 as c2]                                                                                       |
|               |   CoalesceBatchesExec: target_batch_size=4096                                                                                       |
|               |     FilterExec: c3@2                                                                                                                |
|               |       CoalesceBatchesExec: target_batch_size=4096                                                                                   |
|               |         FilterExec: c2@1 = 0.000001                                                                                                 |
|               |           RepartitionExec: partitioning=RoundRobinBatch(8)                                                                          |
|               |             CsvExec: files=[/home/jakevin/code/arrow-datafusion/datafusion/tests/aggregate_simple.csv], has_header=true, limit=None |
|               |                                                                                                                                     |
+---------------+-------------------------------------------------------------------------------------------------------------------------------------+

Expected behavior
The filter push down rule should not split the combined filter expression into separate Filter nodes.
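
As a rough illustration of the expected shape, the sketch below merges two directly stacked Filter nodes into a single Filter whose predicate is the AND of both. The `Plan` and `Expr` enums and the `combine_adjacent_filters` function are hypothetical toy definitions for this example only; they are not DataFusion's plan types and not the implementation from #2026 or #2039.

```rust
// Toy plan and expression types for illustration only; NOT DataFusion's types.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Pred(String),
    And(Box<Expr>, Box<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Filter { predicate: Expr, input: Box<Plan> },
}

/// Merge directly stacked Filter nodes into one Filter with an AND predicate.
fn combine_adjacent_filters(plan: Plan) -> Plan {
    match plan {
        Plan::Filter { predicate, input } => {
            // Rewrite the child first so a whole chain of filters collapses bottom-up.
            match combine_adjacent_filters(*input) {
                Plan::Filter { predicate: inner, input: child } => Plan::Filter {
                    predicate: Expr::And(Box::new(predicate), Box::new(inner)),
                    input: child,
                },
                other => Plan::Filter { predicate, input: Box::new(other) },
            }
        }
        // Scans (and any other non-filter node) are returned unchanged.
        other => other,
    }
}
```

Applied to the plan in this report, a `Filter: c3` sitting on top of a `Filter: c2 = 0.000001` would become one Filter over the scan. If filter push down then splits that predicate again, the two rules undo each other, which is the conflict described above.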

Additional context
None
