Improve formatting of logical plans containing subqueries #2899

Merged: 7 commits, Jul 13, 2022

Conversation

andygrove
Member

Which issue does this PR close?

Closes #2898

Rationale for this change

Before:

Projection: #employee_csv.id
  Filter: #employee_csv.state IN (Subquery: TableScan: employee_csv projection=[state])
    TableScan: employee_csv projection=[id, state]

After:

Projection: #employee_csv.id
  Filter: #employee_csv.state IN (<subquery>)
    Subquery:
      TableScan: employee_csv projection=[state]
    TableScan: employee_csv projection=[id, state]

What changes are included in this PR?

  • IndentVisitor visits all inputs, including subqueries
  • Subquery expressions no longer display their subqueries
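The second bullet can be illustrated with a minimal sketch. The `Expr` type below is a hypothetical, simplified stand-in and not DataFusion's real `Expr`; the `subquery_plan` field, the `example_predicate` helper, and the `NOT IN` rendering are assumptions made for illustration. The point is only that the expression's `Display` impl prints a `<subquery>` placeholder and leaves the subquery plan to the plan formatter.

```rust
use std::fmt;

// Hypothetical, simplified expression type; not DataFusion's real `Expr`.
#[allow(dead_code)]
enum Expr {
    Column(String),
    InSubquery {
        expr: Box<Expr>,
        // Pre-rendered subquery plan, kept as a string for brevity (assumption).
        subquery_plan: String,
        negated: bool,
    },
}

impl fmt::Display for Expr {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Expr::Column(name) => write!(f, "#{}", name),
            // `subquery_plan` is deliberately ignored here: the plan
            // formatter is expected to print it as a child node instead.
            Expr::InSubquery { expr, negated, .. } => {
                let op = if *negated { "NOT IN" } else { "IN" };
                write!(f, "{} {} (<subquery>)", expr, op)
            }
        }
    }
}

// Builds the filter predicate used in the Before/After example.
fn example_predicate() -> Expr {
    Expr::InSubquery {
        expr: Box::new(Expr::Column("employee_csv.state".to_string())),
        subquery_plan: "TableScan: employee_csv projection=[state]".to_string(),
        negated: false,
    }
}

fn main() {
    println!("{}", example_predicate());
}
```

Keeping the placeholder short means the filter line stays readable no matter how large the subquery plan is.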

Are there any user-facing changes?

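Yes. Logical plans that contain subquery expressions are printed differently: the expression shows <subquery>, and the subquery plan appears as an indented Subquery: node.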

@andygrove andygrove self-assigned this Jul 13, 2022
@github-actions github-actions bot added the logical-expr (Logical plan and expressions) label Jul 13, 2022
@github-actions github-actions bot added the sql (SQL Planner) label Jul 13, 2022
@github-actions github-actions bot added the optimizer (Optimizer rules) label Jul 13, 2022
@andygrove andygrove marked this pull request as ready for review July 13, 2022 14:58
@andygrove andygrove changed the title from "WIP: Improve formatting of logical plans containing subqueries" to "Improve formatting of logical plans containing subqueries" Jul 13, 2022
@andygrove
Member Author

@avantgardnerio This might help simplify some of your tests

Comment on lines +399 to +402
LogicalPlan::Projection(Projection { .. }) => {
    self.visit_all_inputs(visitor)?
}
LogicalPlan::Filter(Filter { .. }) => self.visit_all_inputs(visitor)?,
Member Author

This is the main change. We now visit all inputs, including subqueries. I have only implemented this for Projection and Filter in this PR.
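For readers unfamiliar with the traversal, here is a rough, self-contained sketch of the idea. It borrows the names `IndentVisitor` and `visit_all_inputs` from this PR, but everything else is an assumption: `PlanNode`, the `PlanVisitor` trait shape, `display_indent`, and `example_plan` are hypothetical simplifications, and labels are pre-rendered strings rather than real plan nodes. It is not the actual DataFusion code.

```rust
// Hypothetical, simplified plan node; not DataFusion's `LogicalPlan`.
struct PlanNode {
    label: String,             // pre-rendered one-line description
    subqueries: Vec<PlanNode>, // plans referenced from this node's expressions
    inputs: Vec<PlanNode>,     // regular child plans
}

// Minimal pre/post-order visitor interface (assumed shape).
trait PlanVisitor {
    fn pre_visit(&mut self, plan: &PlanNode);
    fn post_visit(&mut self, plan: &PlanNode);
}

impl PlanNode {
    fn new(label: &str, subqueries: Vec<PlanNode>, inputs: Vec<PlanNode>) -> Self {
        PlanNode { label: label.to_string(), subqueries, inputs }
    }

    // Visit this node, then every child reachable from it.
    fn accept<V: PlanVisitor>(&self, visitor: &mut V) {
        visitor.pre_visit(self);
        self.visit_all_inputs(visitor);
        visitor.post_visit(self);
    }

    // Subqueries are visited first, then the regular inputs.
    fn visit_all_inputs<V: PlanVisitor>(&self, visitor: &mut V) {
        for child in self.subqueries.iter().chain(self.inputs.iter()) {
            child.accept(visitor);
        }
    }
}

// Writes one line per node, indented two spaces per level of depth.
struct IndentVisitor {
    depth: usize,
    out: String,
}

impl PlanVisitor for IndentVisitor {
    fn pre_visit(&mut self, plan: &PlanNode) {
        self.out.push_str(&"  ".repeat(self.depth));
        self.out.push_str(&plan.label);
        self.out.push('\n');
        self.depth += 1;
    }

    fn post_visit(&mut self, _plan: &PlanNode) {
        self.depth -= 1;
    }
}

fn display_indent(plan: &PlanNode) -> String {
    let mut visitor = IndentVisitor { depth: 0, out: String::new() };
    plan.accept(&mut visitor);
    visitor.out
}

// Rebuilds the plan from the "After" example in the PR description.
fn example_plan() -> PlanNode {
    let inner_scan = PlanNode::new("TableScan: employee_csv projection=[state]", vec![], vec![]);
    let subquery = PlanNode::new("Subquery:", vec![], vec![inner_scan]);
    let scan = PlanNode::new("TableScan: employee_csv projection=[id, state]", vec![], vec![]);
    let filter = PlanNode::new(
        "Filter: #employee_csv.state IN (<subquery>)",
        vec![subquery],
        vec![scan],
    );
    PlanNode::new("Projection: #employee_csv.id", vec![], vec![filter])
}

fn main() {
    print!("{}", display_indent(&example_plan()));
}
```

Visiting subqueries before the regular inputs is what places the Subquery: node directly under the Filter line, ahead of the outer TableScan, as in the "After" output.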

@codecov-commenter

Codecov Report

Merging #2899 (7592829) into master (6a5de4f) will decrease coverage by 0.00%.
The diff coverage is 82.25%.

❗ Current head 7592829 differs from pull request most recent head 37ab28a. Consider uploading reports for the commit 37ab28a to get more accurate results

@@            Coverage Diff             @@
##           master    #2899      +/-   ##
==========================================
- Coverage   85.34%   85.34%   -0.01%     
==========================================
  Files         276      276              
  Lines       49294    49322      +28     
==========================================
+ Hits        42071    42092      +21     
- Misses       7223     7230       +7     

Impacted Files                                          Coverage Δ
datafusion/expr/src/expr.rs                             83.60% <0.00%> (-2.50%) ⬇️
datafusion/optimizer/src/filter_push_down.rs            98.23% <ø> (ø)
datafusion/optimizer/src/limit_push_down.rs             99.67% <ø> (ø)
...atafusion/optimizer/src/subquery_filter_to_join.rs   93.90% <ø> (ø)
datafusion/expr/src/logical_plan/plan.rs                76.20% <95.23%> (+1.64%) ⬆️
datafusion/expr/src/logical_plan/builder.rs             89.86% <100.00%> (ø)
datafusion/sql/src/planner.rs                           81.31% <100.00%> (-0.07%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6a5de4f...37ab28a. Read the comment docs.

Contributor

@avantgardnerio avantgardnerio left a comment

This looks very helpful!

Contributor

@alamb alamb left a comment

Nice

@andygrove andygrove merged commit fd64e6f into apache:master Jul 13, 2022
@andygrove andygrove deleted the format-subqueries branch July 13, 2022 20:13
@ursabot

ursabot commented Jul 13, 2022

Benchmark runs are scheduled for baseline = 6a5de4f and contender = fd64e6f. fd64e6f is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Successfully merging this pull request may close these issues.

Improve formatting of logical plans containing subquery expressions