Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rewrite subexpressions of InSubquery in rewrite_expression #2765

Merged
merged 3 commits into from
Jun 22, 2022

Conversation

mrob95
Copy link
Contributor

@mrob95 mrob95 commented Jun 22, 2022

Which issue does this PR close?

Closes #2736.

Rationale for this change

What changes are included in this PR?

Looking at the expressions mentioned in #2725, I think the only one that needs fixing is Expr::InSubquery. Exists and ScalarSubquery both have subqueries but not subexpressions, and it looks like subqueries are currently treated as completely separate plans for the purposes of optimisation. I'm new to working on query engines (and rust, for that matter!) though so may be misinterpreting this.

I've modified rewrite_expression so InSubquery exprs use the subexpression passed in expressions rather than just returning a clone, and added a pushdown test with an InSubquery predicate which uses an aliased column.

Additionally, I noticed while I was creating the test case that InSubquery filters were not being pushed down at all. This was because the subexpressions of InSubquery were not being visited by ExpressionVisitor, so the column in InSubquery predicates was not found. Fixed with a simple modification to expr_visitor.rs. The test case I added covers this as well.

Are there any user-facing changes?

@github-actions github-actions bot added logical-expr Logical plan and expressions optimizer Optimizer rules labels Jun 22, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #2765 (db47b2a) into master (bc00776) will increase coverage by 0.00%.
The diff coverage is 89.47%.

@@           Coverage Diff           @@
##           master    #2765   +/-   ##
=======================================
  Coverage   84.95%   84.96%           
=======================================
  Files         271      271           
  Lines       48053    48071   +18     
=======================================
+ Hits        40824    40842   +18     
  Misses       7229     7229           
Impacted Files Coverage Δ
datafusion/expr/src/expr_visitor.rs 63.75% <0.00%> (ø)
datafusion/optimizer/src/filter_push_down.rs 98.23% <92.85%> (-0.10%) ⬇️
datafusion/optimizer/src/utils.rs 36.08% <100.00%> (+2.39%) ⬆️
datafusion/expr/src/logical_plan/plan.rs 73.71% <0.00%> (-0.20%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update bc00776...db47b2a. Read the comment docs.

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks @mrob95!

Copy link
Contributor

@alamb alamb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it -- this is a great first contribution @mrob95 ❤️

\n TableScan: test projection=None";
assert_eq!(format!("{:?}", plan), expected_before);

// rewrite filter col b to test.a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@mrob95
Copy link
Contributor Author

mrob95 commented Jun 22, 2022

Bonus point for removing expr_expressions and rewrite_expression and instead use an ExprMutator

Thanks, going to see if I can do this ^ as well, but that can be a separate PR :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
logical-expr Logical plan and expressions optimizer Optimizer rules
Projects
None yet
Development

Successfully merging this pull request may close these issues.

rewrite_expression does not properly handle Exists and ScalarSubquery
4 participants