Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix bug the optimizer rule filter push down #2039

Merged
merged 3 commits into from
Mar 20, 2022

Conversation

jackwener
Copy link
Member

Which issue does this PR close?

Closes #2038.

What changes are included in this PR?

Fix the bug due to break loop cause used_columns is wrong.

It will cause the issue_filters get wrong used_columns param

Are there any user-facing changes?

None

@github-actions github-actions bot added the datafusion Changes in the datafusion crate label Mar 19, 2022
@jackwener
Copy link
Member Author

jackwener commented Mar 19, 2022

@liukun4515 @Dandandan @alamb @houqp @yjshen PTAL, thanks!😃❤

@jackwener
Copy link
Member Author

It's found during I add the combine_adjacent_filter rule.

I found that filter already is combined in the /sql/planner.rs.

The filter split is caused by this bug.

@jackwener
Copy link
Member Author

Now it's fixed.

 explain verbose select c1, c2 from test where c3 = true and c2 = 0.000001;
+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| plan_type                                             | plan                                                                                                                            |
+-------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------+
| initial_logical_plan                                  | Projection: #test.c1, #test.c2                                                                                                  |
|                                                       |   Filter: #test.c3 = Boolean(true) AND #test.c2 = Float64(0.000001)                                                             |
|                                                       |     TableScan: test projection=None                                                                                             |
| logical_plan after simplify_expressions               | Projection: #test.c1, #test.c2                                                                                                  |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                  |
|                                                       |     TableScan: test projection=None                                                                                             |
| logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after projection_push_down               | Projection: #test.c1, #test.c2                                                                                                  |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001) AS test.c3 = Boolean(true) AND test.c2 = Float64(0.000001)                  |
|                                                       |     TableScan: test projection=Some([0, 1, 2])                                                                                  |
| logical_plan after filter_push_down                   | Projection: #test.c1, #test.c2                                                                                                  |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001)                                                                             |
|                                                       |     TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]                                |
| logical_plan after limit_push_down                    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after SingleDistinctAggregationToGroupBy | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after ToApproxPerc                       | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after simplify_expressions               | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after eliminate_filter                   | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after common_sub_expression_eliminate    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after eliminate_limit                    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after projection_push_down               | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after filter_push_down                   | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after limit_push_down                    | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after SingleDistinctAggregationToGroupBy | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan after ToApproxPerc                       | SAME TEXT AS ABOVE                                                                                                              |
| logical_plan                                          | Projection: #test.c1, #test.c2                                                                                                  |
|                                                       |   Filter: #test.c3 AND #test.c2 = Float64(0.000001)                                                                             |
|                                                       |     TableScan: test projection=Some([0, 1, 2]), filters=[#test.c3, #test.c2 = Float64(0.000001)]    

@Dandandan
Copy link
Contributor

Dandandan commented Mar 19, 2022

I think the change looks good! Can we add a test to make sure we get the correct plan now without splitting the original expressions into multiple filters?

@houqp houqp requested a review from Dandandan March 19, 2022 19:31
@houqp houqp added the bug Something isn't working label Mar 19, 2022
@houqp
Copy link
Member

houqp commented Mar 19, 2022

LGTM +1 on adding a unit test

@jackwener
Copy link
Member Author

Yes, add unit test is necessary.

@jackwener
Copy link
Member Author

The result of unit test:

Before

let expected ="Projection: #a, #b\
\n  Filter: #a = Int64(10)\
\n    Filter: #b > Int64(11)\
\n      TableScan: test projection=Some([0]), filters=[#a = Int64(10), #b > Int64(11)]";

After

let expected ="Projection: #a, #b\
    \n  Filter: #a = Int64(10) AND #b > Int64(11)\
    \n    TableScan: test projection=Some([0]), filters=[#a = Int64(10), #b > Int64(11)]";

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me -- thank you for persevering @jackwener and finding a good solution. This change seems much more preferable than #2026 to me.

I also double checked by running this test locally without the change and with the change

@alamb alamb merged commit 74bf7ab into apache:master Mar 20, 2022
@jackwener jackwener deleted the filter_push_bugfix branch December 6, 2022 15:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working datafusion Changes in the datafusion crate
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Filter push down rule cause the wrong plan
4 participants