Cache common plan properties to eliminate recursive calls in physical plan #9346

Merged
merged 26 commits into from
Feb 28, 2024

@mustafasrepo mustafasrepo commented Feb 26, 2024

Which issue does this PR close?

Closes #9084 .

Rationale for this change

In a great analysis on issue #9084, @gruuya recognized that stack usage (depth) increases considerably during logical and physical planning. The root causes of the aggressive stack usage are:

  • In logical planning: excessive use of .clone on the LogicalPlan enum.
  • In physical planning: recursive function calls in the getter APIs of Arc<dyn ExecutionPlan>, such as equivalence_properties, output_partitioning, output_ordering, etc.

In PR9084, @gruuya was able to reduce physical plan stack usage by caching equivalence_properties for ProjectionExec.

This PR introduces a new struct, PlanPropertiesCache, to cache plan properties. With this struct, schema, output_partitioning, equivalence_properties, and output_ordering are cached. This caching mechanism removes the recursive calls in the getter methods. Also, once the .cache method is implemented, the default implementations of .output_partitioning, .equivalence_properties, and .output_ordering work out of the box.
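To illustrate the idea, here is a minimal, self-contained sketch. It does not use the real DataFusion types: `PlanProperties`, `Scan`, `Filter`, `partition_count`, and `build_filter_chain` are simplified, hypothetical stand-ins. The point is only that each node computes its properties once at construction time from its direct child, so the getters become field reads instead of walking the whole plan.

```rust
use std::sync::Arc;

// Simplified stand-in for the cached properties (not the real DataFusion struct).
#[derive(Clone, Debug, PartialEq)]
struct PlanProperties {
    output_ordering: Option<Vec<String>>,
    partition_count: usize,
}

trait ExecutionPlan {
    /// Returns the properties computed when the node was built.
    fn cache(&self) -> &PlanProperties;

    // Default getters only read the cache, so they never recurse into children.
    fn output_ordering(&self) -> Option<&[String]> {
        self.cache().output_ordering.as_deref()
    }
    fn partition_count(&self) -> usize {
        self.cache().partition_count
    }
}

struct Scan {
    cache: PlanProperties,
}

impl Scan {
    fn new(output_ordering: Option<Vec<String>>, partition_count: usize) -> Self {
        Self { cache: PlanProperties { output_ordering, partition_count } }
    }
}

impl ExecutionPlan for Scan {
    fn cache(&self) -> &PlanProperties {
        &self.cache
    }
}

struct Filter {
    // Kept to model the plan tree; not needed to answer property queries.
    #[allow(dead_code)]
    input: Arc<dyn ExecutionPlan>,
    cache: PlanProperties,
}

impl Filter {
    fn new(input: Arc<dyn ExecutionPlan>) -> Self {
        // One-level lookup: the child's properties are already cached,
        // so this does not descend further into the plan.
        let cache = input.cache().clone();
        Self { input, cache }
    }
}

impl ExecutionPlan for Filter {
    fn cache(&self) -> &PlanProperties {
        &self.cache
    }
}

// Builds `depth` filters stacked on top of a single scan.
fn build_filter_chain(depth: usize) -> Arc<dyn ExecutionPlan> {
    let mut plan: Arc<dyn ExecutionPlan> =
        Arc::new(Scan::new(Some(vec!["a".to_string()]), 4));
    for _ in 0..depth {
        plan = Arc::new(Filter::new(plan));
    }
    plan
}

fn main() {
    let plan = build_filter_chain(100);
    // Both getters are field reads on the top node, regardless of plan depth.
    println!("{:?} {}", plan.output_ordering(), plan.partition_count());
}
```

In this sketch only `cache` has to be implemented per node type; the other getters come from the trait defaults.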

With these changes, stack depth decreases considerably, since recursive calls are eliminated in the physical plan.
As an example, the flame graph for query 54 changes from the following graph

flamegraph_main_q54

to the following graph

flamegraph_branch_q54

What changes are included in this PR?

Are these changes tested?

Existing tests should work

Are there any user-facing changes?

api change

@github-actions github-actions bot added physical-expr Physical Expressions core Core DataFusion crate sqllogictest SQL Logic Tests (.slt) labels Feb 26, 2024
@ozankabak ozankabak added the api change Changes the API exposed to users of the crate label Feb 26, 2024

ozankabak commented Feb 26, 2024

Super excited! I did an initial review already but will do another deep dive tonight

@ozankabak ozankabak left a comment

Did another round of review and LGTM except a few minor things (whose comments I left inline).

One general comment is to add a succinct, single line docstring comment to create_cache functions. Something like:

This function creates the cache object that stores plan properties such as equivalence properties, output ordering etc.

Looking forward to getting some eyes on this from the community!


alamb commented Feb 27, 2024

I plan to review this PR today

@alamb alamb left a comment

Thank you for this PR @mustafasrepo -- I agree with @ozankabak and @gruuya that the use of the term "cache" is confusing.

I also think we should remove the existing methods like ExecutionPlan::execution_mode to avoid potential bugs due to inconsistent implementations.

In addition to improving the stack depth, I think this PR may also significantly improve planning speed (as it avoids redundant calculations)

Otherwise I think this PR looks good to me

@alamb alamb left a comment

I think this PR is looking quite good now. I'll try and give it another careful look tomorrow. Maybe we can wait to see if @gruuya has any additional comments before merging.

I do think it would be better to make it impossible for the APIs to be inconsistent (aka #9346 (comment)), but I think it would be OK to leave the API as it is currently in this PR.


gruuya commented Feb 28, 2024

No further comments from me, aside from thanks once again @mustafasrepo for handling this!

Perhaps just a side note that even RUST_MIN_STACK=300000 is insufficient for the original source of the issue now (tpcds_physical_q64), and seems to need a couple of orders more. Not sure whether #8837 is still active, but definitely worth tracking somewhere, and eventually un-ignoring the test.
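For context, RUST_MIN_STACK sets the default stack size for threads spawned through the standard library, and an explicit `std::thread::Builder::stack_size` takes precedence over it. The sketch below is not the DataFusion test itself: `recurse` and `run_with_stack` are hypothetical helpers that only show how a deeply recursive job can be run on a thread with an explicitly chosen stack size.

```rust
use std::thread;

// Hypothetical stand-in for a deeply recursive planning step.
fn recurse(n: u64) -> u64 {
    if n == 0 {
        0
    } else {
        1 + recurse(n - 1)
    }
}

// Runs `recurse(depth)` on a new thread whose stack is `stack_bytes` large.
// Note that a real stack overflow aborts the process; it does not surface
// as an `Err` from `join`.
fn run_with_stack(stack_bytes: usize, depth: u64) -> u64 {
    thread::Builder::new()
        .stack_size(stack_bytes)
        .spawn(move || recurse(depth))
        .expect("failed to spawn thread")
        .join()
        .expect("thread panicked")
}

fn main() {
    // 64 MiB is an arbitrary size chosen for this illustration.
    let reached = run_with_stack(64 * 1024 * 1024, 100_000);
    println!("reached depth {reached}");
}
```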

@mustafasrepo

No further comments from me, aside from thanks once again @mustafasrepo for handling this!

Perhaps just a side note that even RUST_MIN_STACK=300000 is insufficient for the original source of the issue now (tpcds_physical_q64), and seems to need a couple of orders more. Not sure whether #8837 is still active, but definitely worth tracking somewhere, and eventually un-ignoring the test.

@metesynnada did some debugging of tpcds_physical_q64 to find out why the stack overflow occurs. It seems that the overflow occurs in the create_initial_plan stage of the planner (the place where the LogicalPlan is converted to the physical plan, not during any physical rule or logical plan creation). Examining the code structure, it seems that create_initial_plan uses recursive futures, calling itself recursively on the LogicalPlan. I am not familiar with the implications, but the problem might be related to this behavior.
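As a rough illustration of what "recursive futures" means here, the sketch below walks a tree with a function that returns a boxed future and awaits itself on each child. It is not the planner code: `Node`, `count_nodes`, `chain`, and the tiny `block_on` executor are hypothetical and exist only to keep the example self-contained. When the outer future is polled, it polls the child's future, which polls its own child, so the nesting of poll calls grows with the depth of the plan.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

// Hypothetical, simplified plan node: it only knows its children.
struct Node {
    children: Vec<Node>,
}

// The recursion goes through a boxed future, because a future cannot
// contain itself without indirection.
fn count_nodes(node: &Node) -> Pin<Box<dyn Future<Output = usize> + '_>> {
    Box::pin(async move {
        let mut total = 1;
        for child in &node.children {
            // Polling this future polls the child's future, and so on,
            // so poll frames nest as deep as the plan.
            total += count_nodes(child).await;
        }
        total
    })
}

// Waker that does nothing; enough for futures that never wait on I/O.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

// Minimal executor for this example only.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = Box::pin(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

// Builds a single-child chain with `depth` nodes above one leaf.
fn chain(depth: usize) -> Node {
    let mut node = Node { children: vec![] };
    for _ in 0..depth {
        node = Node { children: vec![node] };
    }
    node
}

fn main() {
    let plan = chain(50);
    println!("nodes visited: {}", block_on(count_nodes(&plan)));
}
```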


gruuya commented Feb 28, 2024

It seems that overflow occurs in create_initial_plan stage of planner

Yup, agreed. My own brief investigation led me to believe that in the case of tpcds_physical_q64 it happens in LogicalPlan::Join variant handling in particular.

@simonvandel

I was curious about how this affected planning performance. Here are my results comparing 0c46d7f (parent of first commit in this PR) with a8fac85.

Seems like there are quite big regressions in physical planning. But I may have messed up.

Results
mold -run cargo bench --bench sql_planner --profile release-nonlto -- --baseline=0c46d7fa105fddc4a35a4c99e4aa2a063d967abb
     Running benches/sql_planner.rs (/home/svs/code/arrow-datafusion/target/release-nonlto/deps/sql_planner-355e1e15827c5f11)
Gnuplot not found, using plotters backend
logical_select_one_from_700
                        time:   [576.63 µs 577.55 µs 578.47 µs]
                        change: [+1.0656% +1.7066% +2.3733%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low severe
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_select_one_from_700
                        time:   [4.0361 ms 4.0488 ms 4.0629 ms]
                        change: [-1.9349% -1.3951% -0.9091%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

logical_trivial_join_low_numbered_columns
                        time:   [586.18 µs 587.20 µs 588.35 µs]
                        change: [-0.4208% +1.0724% +3.2859%] (p = 0.32 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  10 (10.00%) high severe

logical_trivial_join_high_numbered_columns
                        time:   [627.61 µs 628.75 µs 629.94 µs]
                        change: [-1.5180% -1.0025% -0.5763%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

Benchmarking logical_aggregate_with_join: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.5s, enable flat sampling, or reduce sample count to 60.
logical_aggregate_with_join
                        time:   [1.0842 ms 1.0866 ms 1.0893 ms]
                        change: [-1.5018% -0.9371% -0.4389%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q1   time:   [7.8999 ms 7.9398 ms 7.9805 ms]
                        change: [+1113.8% +1122.0% +1129.1%] (p = 0.00 < 0.05)
                        Performance has regressed.

physical_plan_tpch_q2   time:   [11.743 ms 11.806 ms 11.878 ms]
                        change: [+1622.6% +1632.4% +1644.4%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

physical_plan_tpch_q3   time:   [3.8676 ms 3.8778 ms 3.8886 ms]
                        change: [+625.22% +628.60% +631.62%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 13 outliers among 100 measurements (13.00%)
  9 (9.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_q4   time:   [2.9474 ms 2.9549 ms 2.9635 ms]
                        change: [+442.21% +445.51% +448.46%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q5   time:   [5.8592 ms 5.8881 ms 5.9182 ms]
                        change: [+903.01% +912.33% +920.23%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.3s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q6   time:   [1.8279 ms 1.8300 ms 1.8322 ms]
                        change: [+358.95% +361.52% +365.24%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q7   time:   [8.2906 ms 8.3330 ms 8.3757 ms]
                        change: [+950.61% +957.43% +963.46%] (p = 0.00 < 0.05)
                        Performance has regressed.

physical_plan_tpch_q8   time:   [11.821 ms 11.870 ms 11.920 ms]
                        change: [+1331.7% +1339.7% +1347.4%] (p = 0.00 < 0.05)
                        Performance has regressed.

physical_plan_tpch_q9   time:   [8.8488 ms 8.8885 ms 8.9294 ms]
                        change: [+1161.1% +1169.6% +1177.0%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

physical_plan_tpch_q10  time:   [6.0025 ms 6.0278 ms 6.0543 ms]
                        change: [+829.09% +837.51% +845.36%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  7 (7.00%) high mild

physical_plan_tpch_q11  time:   [4.8230 ms 4.8421 ms 4.8622 ms]
                        change: [+716.64% +721.76% +726.08%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q12  time:   [3.8354 ms 3.8458 ms 3.8573 ms]
                        change: [+553.06% +556.86% +560.04%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  10 (10.00%) high mild
  5 (5.00%) high severe

physical_plan_tpch_q13  time:   [2.4938 ms 2.4991 ms 2.5052 ms]
                        change: [+420.96% +422.92% +424.70%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q14  time:   [3.2362 ms 3.2430 ms 3.2510 ms]
                        change: [+527.89% +530.83% +533.51%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q16  time:   [4.9522 ms 4.9715 ms 4.9919 ms]
                        change: [+840.97% +846.02% +850.48%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high mild

physical_plan_tpch_q17  time:   [4.7371 ms 4.7556 ms 4.7757 ms]
                        change: [+848.11% +853.56% +858.89%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q18  time:   [5.4681 ms 5.5619 ms 5.6655 ms]
                        change: [+851.77% +869.08% +890.15%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  8 (8.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q19  time:   [9.6241 ms 9.6606 ms 9.6985 ms]
                        change: [+1401.0% +1409.1% +1416.4%] (p = 0.00 < 0.05)
                        Performance has regressed.

physical_plan_tpch_q20  time:   [6.1281 ms 6.1519 ms 6.1765 ms]
                        change: [+914.60% +921.14% +926.79%] (p = 0.00 < 0.05)
                        Performance has regressed.

physical_plan_tpch_q21  time:   [9.0428 ms 9.0806 ms 9.1191 ms]
                        change: [+991.69% +997.40% +1003.3%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

physical_plan_tpch_q22  time:   [4.3348 ms 4.3436 ms 4.3534 ms]
                        change: [+532.03% +534.80% +537.16%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_all  time:   [12.755 ms 12.775 ms 12.796 ms]
                        change: [-0.8907% -0.6753% -0.4745%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe


gruuya commented Feb 28, 2024

Seems like there are quite big regressions in physical planning.

Perhaps it's the eager computation of plan properties now, as opposed to lazy computation before (meaning some props were not computed/needed)?
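To make the eager versus lazy distinction concrete, here is a small sketch with hypothetical types (`EagerNode`, `LazyNode`, `compute_ordering`); it is not how DataFusion implements either variant. A counter records how often the "expensive" property is computed: the eager node pays at construction even if the property is never read, while the lazy node pays on first access only.

```rust
use std::cell::{Cell, OnceCell};

// Hypothetical stand-in for an expensive property computation.
// `calls` counts how often it runs.
fn compute_ordering(calls: &Cell<u32>) -> Vec<String> {
    calls.set(calls.get() + 1);
    vec!["a".to_string(), "b".to_string()]
}

// Eager: the property is computed in the constructor, used or not.
struct EagerNode {
    ordering: Vec<String>,
}

impl EagerNode {
    fn new(calls: &Cell<u32>) -> Self {
        Self { ordering: compute_ordering(calls) }
    }

    fn ordering(&self) -> &[String] {
        &self.ordering
    }
}

// Lazy: the property is computed on first access and then reused.
struct LazyNode<'a> {
    calls: &'a Cell<u32>,
    ordering: OnceCell<Vec<String>>,
}

impl<'a> LazyNode<'a> {
    fn new(calls: &'a Cell<u32>) -> Self {
        Self { calls, ordering: OnceCell::new() }
    }

    fn ordering(&self) -> &[String] {
        self.ordering.get_or_init(|| compute_ordering(self.calls))
    }
}

fn main() {
    let eager_calls = Cell::new(0);
    let eager = EagerNode::new(&eager_calls);
    let lazy_calls = Cell::new(0);
    let lazy = LazyNode::new(&lazy_calls);
    println!(
        "after construction: eager={}, lazy={}",
        eager_calls.get(),
        lazy_calls.get()
    );

    // Read the eager property once and the lazy one twice.
    let _ = (eager.ordering(), lazy.ordering(), lazy.ordering());
    println!(
        "after reads: eager={}, lazy={}",
        eager_calls.get(),
        lazy_calls.get()
    );
}
```

If some properties are never requested during planning, the eager variant does work the lazy variant would skip, which is the trade-off raised above.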


mustafasrepo commented Feb 28, 2024

I was curious about how this affected planning performance. Here are my results comparing 0c46d7f (parent of first commit in this PR) with a8fac85.

Seems like there are quite big regressions in physical planning. But I may have messed up.

Results

I re-ran the benchmarks on my machine. The results are below.

Results
Gnuplot not found, using plotters backend
logical_select_one_from_700
                        time:   [482.54 µs 484.25 µs 487.11 µs]
                        change: [-4.6292% -2.1797% +0.0154%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_select_one_from_700
                        time:   [3.0204 ms 3.0227 ms 3.0250 ms]
                        change: [-2.3949% -2.0254% -1.7849%] (p = 0.00 < 0.05)
                        Performance has improved.

logical_trivial_join_low_numbered_columns
                        time:   [452.55 µs 453.30 µs 454.05 µs]
                        change: [-3.4288% -1.6448% -0.6310%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

logical_trivial_join_high_numbered_columns
                        time:   [495.26 µs 498.76 µs 505.52 µs]
                        change: [-0.4885% +0.0379% +0.7595%] (p = 0.93 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  1 (1.00%) high mild
  1 (1.00%) high severe

logical_aggregate_with_join
                        time:   [769.12 µs 769.87 µs 770.62 µs]
                        change: [-18.781% -12.487% -6.5679%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q1   time:   [4.8582 ms 4.8620 ms 4.8663 ms]
                        change: [-17.477% -11.210% -6.1042%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q2   time:   [7.5602 ms 7.5707 ms 7.5822 ms]
                        change: [-30.715% -30.328% -30.007%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q3   time:   [2.5096 ms 2.5143 ms 2.5196 ms]
                        change: [-25.111% -24.050% -23.386%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q4   time:   [1.9527 ms 1.9591 ms 1.9711 ms]
                        change: [-14.387% -14.094% -13.679%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

physical_plan_tpch_q5   time:   [3.8057 ms 3.8091 ms 3.8127 ms]
                        change: [-57.720% -57.662% -57.601%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q6   time:   [1.3063 ms 1.3080 ms 1.3099 ms]
                        change: [-5.6737% -5.1326% -4.7813%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q7   time:   [5.4017 ms 5.4307 ms 5.4840 ms]
                        change: [-44.042% -43.729% -43.180%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q8   time:   [7.9176 ms 7.9289 ms 7.9414 ms]
                        change: [-71.699% -71.653% -71.598%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q9   time:   [5.9427 ms 5.9797 ms 6.0430 ms]
                        change: [-47.161% -46.493% -45.740%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

physical_plan_tpch_q10  time:   [3.8350 ms 3.8388 ms 3.8430 ms]
                        change: [-31.229% -31.025% -30.862%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q11  time:   [2.9252 ms 2.9415 ms 2.9715 ms]
                        change: [-16.985% -15.900% -14.659%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q12  time:   [2.6822 ms 2.6845 ms 2.6870 ms]
                        change: [-13.733% -13.572% -13.424%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q13  time:   [1.5018 ms 1.5033 ms 1.5048 ms]
                        change: [-24.810% -24.629% -24.402%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q14  time:   [2.1145 ms 2.1264 ms 2.1480 ms]
                        change: [-4.5956% -3.8425% -2.7007%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q16  time:   [3.1242 ms 3.1278 ms 3.1319 ms]
                        change: [-17.445% -17.334% -17.217%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q17  time:   [2.8500 ms 2.8705 ms 2.8990 ms]
                        change: [-8.0314% -7.2135% -6.2413%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q18  time:   [3.0984 ms 3.1064 ms 3.1175 ms]
                        change: [-10.790% -10.508% -10.196%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q19  time:   [7.8895 ms 7.9316 ms 8.0083 ms]
                        change: [-2.4917% -1.9490% -0.7408%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q20  time:   [3.7943 ms 3.7987 ms 3.8037 ms]
                        change: [-19.417% -17.933% -16.538%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q21  time:   [5.7239 ms 5.7279 ms 5.7320 ms]
                        change: [-37.177% -36.725% -36.397%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q22  time:   [2.6413 ms 2.6610 ms 2.6942 ms]
                        change: [-14.662% -13.905% -12.725%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_all  time:   [9.3429 ms 9.3899 ms 9.4571 ms]
                        change: [-2.7310% -1.5545% -0.5941%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

According to these results, the new mechanism mostly improves planning performance (especially for complex queries). Since our results contradict each other, I re-ran the same benchmark twice on the main branch and twice on the new branch, then compared the runs of each branch against each other to see how much variance the benchmark produces between runs. The results of comparing different runs on the same branch are below.
Main Run 2 vs Main Run 1
Gnuplot not found, using plotters backend
logical_select_one_from_700
                        time:   [490.30 µs 505.86 µs 525.48 µs]
                        change: [-2.4316% +0.6167% +3.2873%] (p = 0.71 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  5 (5.00%) high severe

physical_select_one_from_700
                        time:   [3.0782 ms 3.0852 ms 3.0962 ms]
                        change: [+1.0064% +1.2946% +1.6436%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

logical_trivial_join_low_numbered_columns
                        time:   [457.67 µs 458.60 µs 459.94 µs]
                        change: [+0.3580% +1.3982% +3.7624%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  4 (4.00%) low mild
  2 (2.00%) high mild
  1 (1.00%) high severe

logical_trivial_join_high_numbered_columns
                        time:   [496.65 µs 497.61 µs 498.80 µs]
                        change: [-0.2179% +0.0540% +0.3247%] (p = 0.70 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) low severe
  1 (1.00%) high mild
  1 (1.00%) high severe

logical_aggregate_with_join
                        time:   [826.96 µs 908.94 µs 1.0125 ms]
                        change: [+6.3560% +14.209% +23.294%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 15 outliers among 100 measurements (15.00%)
  2 (2.00%) high mild
  13 (13.00%) high severe

physical_plan_tpch_q1   time:   [5.1788 ms 5.4759 ms 5.8885 ms]
                        change: [+1.3260% +7.2118% +15.244%] (p = 0.02 < 0.05)
                        Performance has regressed.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

physical_plan_tpch_q2   time:   [10.819 ms 10.866 ms 10.924 ms]
                        change: [-0.1234% +1.1456% +2.1468%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q3   time:   [3.2836 ms 3.3104 ms 3.3553 ms]
                        change: [-2.3241% -0.9046% +0.8137%] (p = 0.29 > 0.05)
                        No change in performance detected.
Found 14 outliers among 100 measurements (14.00%)
  6 (6.00%) high mild
  8 (8.00%) high severe

physical_plan_tpch_q4   time:   [2.2800 ms 2.2830 ms 2.2862 ms]
                        change: [-1.4797% -1.0937% -0.7281%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q5   time:   [8.9876 ms 8.9969 ms 9.0067 ms]
                        change: [-1.4174% -0.7961% -0.2552%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q6   time:   [1.3748 ms 1.3814 ms 1.3940 ms]
                        change: [-9.3756% -5.6928% -2.5053%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q7   time:   [9.6410 ms 9.6510 ms 9.6617 ms]
                        change: [-0.9565% -0.4781% -0.0663%] (p = 0.03 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q8   time:   [27.948 ms 27.971 ms 27.993 ms]
                        change: [-0.2163% +0.0701% +0.3352%] (p = 0.62 > 0.05)
                        No change in performance detected.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

physical_plan_tpch_q9   time:   [11.114 ms 11.176 ms 11.285 ms]
                        change: [-0.2915% +0.4000% +1.4525%] (p = 0.49 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high severe

physical_plan_tpch_q10  time:   [5.5536 ms 5.5655 ms 5.5812 ms]
                        change: [-0.6752% -0.3443% +0.0157%] (p = 0.04 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q11  time:   [3.4787 ms 3.4976 ms 3.5313 ms]
                        change: [-1.3654% -0.6526% +0.4886%] (p = 0.18 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  3 (3.00%) high mild
  6 (6.00%) high severe

physical_plan_tpch_q12  time:   [3.1018 ms 3.1061 ms 3.1110 ms]
                        change: [-0.7425% -0.4918% -0.2454%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q13  time:   [1.9962 ms 1.9986 ms 2.0010 ms]
                        change: [-0.4561% -0.2516% -0.0440%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q14  time:   [2.2037 ms 2.2113 ms 2.2203 ms]
                        change: [-0.4389% +0.0583% +0.5657%] (p = 0.82 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  5 (5.00%) high mild
  7 (7.00%) high severe

physical_plan_tpch_q16  time:   [3.7811 ms 3.7837 ms 3.7864 ms]
                        change: [-0.4859% -0.3382% -0.1973%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q17  time:   [3.0798 ms 3.0936 ms 3.1076 ms]
                        change: [+0.5410% +1.5036% +2.2740%] (p = 0.00 < 0.05)
                        Change within noise threshold.

physical_plan_tpch_q18  time:   [3.4659 ms 3.4712 ms 3.4769 ms]
                        change: [-0.1817% +0.0928% +0.3386%] (p = 0.50 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q19  time:   [8.0817 ms 8.0892 ms 8.0969 ms]
                        change: [-0.7482% -0.4924% -0.2533%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  3 (3.00%) high mild

physical_plan_tpch_q20  time:   [4.5514 ms 4.6288 ms 4.7138 ms]
                        change: [+5.5386% +7.4224% +9.4528%] (p = 0.00 < 0.05)
                        Performance has regressed.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_q21  time:   [9.0060 ms 9.0524 ms 9.1166 ms]
                        change: [-0.0386% +0.5447% +1.3691%] (p = 0.12 > 0.05)
                        No change in performance detected.
Found 17 outliers among 100 measurements (17.00%)
  11 (11.00%) high mild
  6 (6.00%) high severe

physical_plan_tpch_q22  time:   [3.0796 ms 3.0908 ms 3.1033 ms]
                        change: [-0.0510% +0.3826% +0.7615%] (p = 0.08 > 0.05)
                        No change in performance detected.
Found 12 outliers among 100 measurements (12.00%)
  2 (2.00%) high mild
  10 (10.00%) high severe

physical_plan_tpch_all  time:   [9.4791 ms 9.5382 ms 9.6367 ms]
                        change: [+0.6728% +1.7514% +2.8703%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 6 outliers among 100 measurements (6.00%)
  1 (1.00%) low mild
  3 (3.00%) high mild
  2 (2.00%) high severe

Branch Run 2 vs Branch Run 1
Gnuplot not found, using plotters backend
logical_select_one_from_700
                        time:   [482.54 µs 484.25 µs 487.16 µs]
                        change: [+0.1199% +1.0534% +2.4753%] (p = 0.06 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) low severe
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_select_one_from_700
                        time:   [3.0204 ms 3.0227 ms 3.0250 ms]
                        change: [-1.9446% -1.8220% -1.6973%] (p = 0.00 < 0.05)
                        Performance has improved.

logical_trivial_join_low_numbered_columns
                        time:   [452.55 µs 453.30 µs 454.05 µs]
                        change: [-1.7532% -1.1174% -0.6057%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  1 (1.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

logical_trivial_join_high_numbered_columns
                        time:   [495.25 µs 498.76 µs 505.57 µs]
                        change: [-1.0315% -0.4986% +0.3472%] (p = 0.15 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) low severe
  1 (1.00%) high mild
  1 (1.00%) high severe

logical_aggregate_with_join
                        time:   [769.12 µs 769.87 µs 770.63 µs]
                        change: [-0.7240% -0.4668% -0.2220%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q1   time:   [4.8582 ms 4.8620 ms 4.8663 ms]
                        change: [-1.5189% -1.3731% -1.2333%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q2   time:   [7.5602 ms 7.5707 ms 7.5822 ms]
                        change: [-1.4182% -1.1973% -0.9798%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q3   time:   [2.5095 ms 2.5143 ms 2.5196 ms]
                        change: [-1.1784% -0.7929% -0.4292%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 12 outliers among 100 measurements (12.00%)
  9 (9.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q4   time:   [1.9527 ms 1.9591 ms 1.9711 ms]
                        change: [-20.849% -15.072% -9.8504%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  2 (2.00%) high mild
  5 (5.00%) high severe

physical_plan_tpch_q5   time:   [3.8057 ms 3.8091 ms 3.8127 ms]
                        change: [-1.1569% -0.9889% -0.8191%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q6   time:   [1.3063 ms 1.3080 ms 1.3099 ms]
                        change: [-3.3001% -2.3030% -1.4936%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  3 (3.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q7   time:   [5.4017 ms 5.4307 ms 5.4840 ms]
                        change: [-0.8280% -0.1612% +0.8696%] (p = 0.81 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  6 (6.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q8   time:   [7.9177 ms 7.9289 ms 7.9413 ms]
                        change: [-1.5994% -1.2327% -0.9253%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q9   time:   [5.9426 ms 5.9797 ms 6.0429 ms]
                        change: [-0.3560% +0.2891% +1.2648%] (p = 0.67 > 0.05)
                        No change in performance detected.
Found 10 outliers among 100 measurements (10.00%)
  3 (3.00%) high mild
  7 (7.00%) high severe

physical_plan_tpch_q10  time:   [3.8351 ms 3.8388 ms 3.8430 ms]
                        change: [-0.5106% -0.3235% -0.1468%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q11  time:   [2.9252 ms 2.9415 ms 2.9717 ms]
                        change: [-2.8743% -1.1814% +0.3357%] (p = 0.17 > 0.05)
                        No change in performance detected.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q12  time:   [2.6821 ms 2.6845 ms 2.6870 ms]
                        change: [-0.4900% -0.3172% -0.1498%] (p = 0.00 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q13  time:   [1.5018 ms 1.5033 ms 1.5048 ms]
                        change: [-2.7994% -1.2002% -0.2305%] (p = 0.07 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  1 (1.00%) low mild
  5 (5.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q14  time:   [2.1146 ms 2.1264 ms 2.1480 ms]
                        change: [-0.6216% +0.0295% +1.0359%] (p = 0.96 > 0.05)
                        No change in performance detected.
Found 3 outliers among 100 measurements (3.00%)
  1 (1.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q16  time:   [3.1242 ms 3.1278 ms 3.1318 ms]
                        change: [-0.7674% -0.3833% -0.0720%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q17  time:   [2.8501 ms 2.8705 ms 2.8994 ms]
                        change: [-0.1687% +0.6001% +1.6077%] (p = 0.22 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  3 (3.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q18  time:   [3.0984 ms 3.1064 ms 3.1174 ms]
                        change: [-0.3335% -0.0200% +0.3375%] (p = 0.92 > 0.05)
                        No change in performance detected.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q19  time:   [7.8894 ms 7.9316 ms 8.0084 ms]
                        change: [-0.1952% +0.3957% +1.3828%] (p = 0.54 > 0.05)
                        No change in performance detected.
Found 7 outliers among 100 measurements (7.00%)
  6 (6.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q20  time:   [3.7943 ms 3.7987 ms 3.8037 ms]
                        change: [-0.4696% -0.2517% -0.0323%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q21  time:   [5.7238 ms 5.7279 ms 5.7321 ms]
                        change: [-0.3685% -0.2011% -0.0428%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 4 outliers among 100 measurements (4.00%)
  4 (4.00%) high mild

physical_plan_tpch_q22  time:   [2.6413 ms 2.6610 ms 2.6939 ms]
                        change: [-0.2031% +0.5955% +1.8973%] (p = 0.34 > 0.05)
                        No change in performance detected.
Found 9 outliers among 100 measurements (9.00%)
  5 (5.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_all  time:   [9.3429 ms 9.3899 ms 9.4570 ms]
                        change: [-0.5264% +0.0285% +0.7099%] (p = 0.95 > 0.05)
                        No change in performance detected.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) low mild
  2 (2.00%) high mild
  2 (2.00%) high severe

According to these results, there is clearly some noise between runs, although the differences are generally within 10%.
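For anyone reading the verdict lines in the logs above, here is a rough sketch of how the four messages relate to the reported change interval and p-value. It assumes Criterion's default significance level of 0.05 and default noise threshold of 1%, which is consistent with every entry in these logs; `Verdict` and `classify` are illustrative names, not part of Criterion's API.

```rust
/// The four verdicts that appear in the benchmark output.
#[derive(Debug, PartialEq)]
enum Verdict {
    NoChange,    // "No change in performance detected."
    WithinNoise, // "Change within noise threshold."
    Improved,    // "Performance has improved."
    Regressed,   // "Performance has regressed."
}

/// Classify a result from the bounds of the `change:` interval (in percent)
/// and the reported p-value.
fn classify(lower_pct: f64, upper_pct: f64, p: f64) -> Verdict {
    // Assumed defaults, not read from any configuration.
    const SIGNIFICANCE_LEVEL: f64 = 0.05;
    const NOISE_THRESHOLD_PCT: f64 = 1.0;

    if p >= SIGNIFICANCE_LEVEL {
        // The difference is not statistically significant.
        return Verdict::NoChange;
    }
    if upper_pct < -NOISE_THRESHOLD_PCT {
        // The whole interval lies below -1%.
        Verdict::Improved
    } else if lower_pct > NOISE_THRESHOLD_PCT {
        // The whole interval lies above +1%.
        Verdict::Regressed
    } else {
        // Significant, but the interval overlaps the +/-1% band.
        Verdict::WithinNoise
    }
}
```

Under these assumptions, the q20 entry `[+5.5386% ... +9.4528%]` is a regression, while the q2 entry `[-1.4182% ... -0.9798%]` stays within noise because its upper bound does not clear -1%.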

@simonvandel
Contributor

> I re-ran benchmarks in my machine. The results are below

Sorry for the noise, I must have somehow messed up. I also get green numbers now.

I wanted to merge main into this PR to get a direct comparison, but there are some conflicts.
So this is only comparing the PR head with the main head.
Baseline: b220f03
This PR commit: a8fac85

I guess it would be good to do a final run when this PR merges cleanly with main.

Results
logical_select_one_from_700
                        time:   [571.99 µs 573.71 µs 575.87 µs]
                        change: [-2.3965% -1.9565% -1.5211%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 10 outliers among 100 measurements (10.00%)
  1 (1.00%) low mild
  4 (4.00%) high mild
  5 (5.00%) high severe

physical_select_one_from_700
                        time:   [4.0660 ms 4.0824 ms 4.1013 ms]
                        change: [-2.8819% -1.9879% -1.1440%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  4 (4.00%) high mild
  1 (1.00%) high severe

logical_trivial_join_low_numbered_columns
                        time:   [591.08 µs 593.13 µs 595.76 µs]
                        change: [-6.5247% -2.7512% -0.3801%] (p = 0.09 > 0.05)
                        No change in performance detected.
Found 8 outliers among 100 measurements (8.00%)
  2 (2.00%) high mild
  6 (6.00%) high severe

logical_trivial_join_high_numbered_columns
                        time:   [633.10 µs 634.01 µs 634.98 µs]
                        change: [-3.4749% -2.6612% -1.9606%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) low mild
  1 (1.00%) high mild
  2 (2.00%) high severe

Benchmarking logical_aggregate_with_join: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.5s, enable flat sampling, or reduce sample count to 60.
logical_aggregate_with_join
                        time:   [1.0910 ms 1.0924 ms 1.0939 ms]
                        change: [-3.0224% -2.1454% -1.3335%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  2 (2.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_q1   time:   [7.8388 ms 7.8798 ms 7.9215 ms]
                        change: [-5.4862% -4.6508% -3.8017%] (p = 0.00 < 0.05)
                        Performance has improved.

physical_plan_tpch_q2   time:   [11.599 ms 11.660 ms 11.732 ms]
                        change: [-29.491% -28.931% -28.370%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high severe

physical_plan_tpch_q3   time:   [3.8482 ms 3.8557 ms 3.8645 ms]
                        change: [-22.953% -22.571% -22.194%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  1 (1.00%) high mild
  3 (3.00%) high severe

physical_plan_tpch_q4   time:   [2.9394 ms 2.9436 ms 2.9481 ms]
                        change: [-13.579% -13.302% -13.037%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  3 (3.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q5   time:   [5.7565 ms 5.7768 ms 5.7990 ms]
                        change: [-58.072% -57.831% -57.580%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  4 (4.00%) high mild
  4 (4.00%) high severe

Benchmarking physical_plan_tpch_q6: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.3s, enable flat sampling, or reduce sample count to 50.
physical_plan_tpch_q6   time:   [1.8326 ms 1.8348 ms 1.8373 ms]
                        change: [-4.3461% -3.8854% -3.3835%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  3 (3.00%) high mild
  5 (5.00%) high severe

physical_plan_tpch_q7   time:   [8.1415 ms 8.1731 ms 8.2058 ms]
                        change: [-45.628% -45.283% -44.930%] (p = 0.00 < 0.05)
                        Performance has improved.

physical_plan_tpch_q8   time:   [11.600 ms 11.647 ms 11.697 ms]
                        change: [-72.389% -72.217% -72.048%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 1 outliers among 100 measurements (1.00%)
  1 (1.00%) high mild

physical_plan_tpch_q9   time:   [8.7531 ms 8.7829 ms 8.8135 ms]
                        change: [-47.833% -47.538% -47.232%] (p = 0.00 < 0.05)
                        Performance has improved.

physical_plan_tpch_q10  time:   [5.9565 ms 5.9761 ms 5.9972 ms]
                        change: [-31.327% -30.949% -30.571%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  5 (5.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q11  time:   [4.7788 ms 4.7900 ms 4.8030 ms]
                        change: [-23.407% -20.882% -18.549%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  3 (3.00%) high mild
  4 (4.00%) high severe

physical_plan_tpch_q12  time:   [3.8070 ms 3.8132 ms 3.8200 ms]
                        change: [-13.715% -13.448% -13.212%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 6 outliers among 100 measurements (6.00%)
  4 (4.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q13  time:   [2.4866 ms 2.4904 ms 2.4944 ms]
                        change: [-21.541% -21.256% -20.973%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high mild

physical_plan_tpch_q14  time:   [3.2541 ms 3.2599 ms 3.2664 ms]
                        change: [-2.6579% -2.3979% -2.1429%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 7 outliers among 100 measurements (7.00%)
  5 (5.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q16  time:   [4.9112 ms 4.9275 ms 4.9454 ms]
                        change: [-18.165% -17.709% -17.242%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 12 outliers among 100 measurements (12.00%)
  6 (6.00%) high mild
  6 (6.00%) high severe

physical_plan_tpch_q17  time:   [4.7187 ms 4.7405 ms 4.7678 ms]
                        change: [-6.2345% -5.6606% -5.0112%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 9 outliers among 100 measurements (9.00%)
  4 (4.00%) high mild
  5 (5.00%) high severe

physical_plan_tpch_q18  time:   [5.1662 ms 5.1835 ms 5.2024 ms]
                        change: [-10.562% -10.086% -9.5753%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 4 outliers among 100 measurements (4.00%)
  2 (2.00%) high mild
  2 (2.00%) high severe

physical_plan_tpch_q19  time:   [9.6546 ms 9.6977 ms 9.7416 ms]
                        change: [-3.6676% -3.0457% -2.4012%] (p = 0.00 < 0.05)
                        Performance has improved.

physical_plan_tpch_q20  time:   [6.1156 ms 6.1346 ms 6.1554 ms]
                        change: [-12.986% -12.532% -12.057%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_q21  time:   [9.0939 ms 9.1482 ms 9.2128 ms]
                        change: [-34.273% -33.805% -33.250%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 2 outliers among 100 measurements (2.00%)
  2 (2.00%) high severe

physical_plan_tpch_q22  time:   [4.3380 ms 4.3443 ms 4.3512 ms]
                        change: [-14.344% -14.025% -13.729%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 3 outliers among 100 measurements (3.00%)
  2 (2.00%) high mild
  1 (1.00%) high severe

physical_plan_tpch_all  time:   [12.851 ms 12.870 ms 12.890 ms]
                        change: [-0.5789% -0.3165% -0.0670%] (p = 0.02 < 0.05)
                        Change within noise threshold.
Found 2 outliers among 100 measurements (2.00%)
  1 (1.00%) high mild
  1 (1.00%) high severe

# Conflicts:
#	datafusion/core/src/physical_optimizer/join_selection.rs
#	datafusion/physical-plan/src/aggregates/mod.rs
#	datafusion/physical-plan/src/coalesce_partitions.rs
#	datafusion/physical-plan/src/joins/cross_join.rs
#	datafusion/physical-plan/src/lib.rs
#	datafusion/physical-plan/src/sorts/sort_preserving_merge.rs
#	datafusion/physical-plan/src/unnest.rs
#	datafusion/physical-plan/src/windows/window_agg_exec.rs
#	datafusion/physical-plan/src/work_table.rs
@ozankabak
Contributor

@alamb, I did one final review to make sure all looks good. Since this will attract a lot of merge conflicts, it'd be a good idea to merge this sooner rather than later.

Given the benchmark confusion is out of the way too, are you OK with merging?

@alamb
Contributor

alamb commented Feb 28, 2024

I am running the benchmarks now.

Contributor

@alamb alamb left a comment


I compared this branch with b220f03 (the merge base)

git checkout b220f03fffda22c70f03fa84e244cf04f0e6644c
cargo bench --bench sql_planner
git checkout feature/remove_default_cache 
cargo bench --bench sql_planner

Performance appears to be the same or faster across all benchmarks. Results: results.txt

Thanks again everyone

@alamb
Contributor

alamb commented Feb 28, 2024

I have a follow-on PR proposal, #9389, to improve the documentation.

@ozankabak
Contributor

Thanks everyone for the awesome collaboration!

@ozankabak ozankabak merged commit d5b6359 into apache:main Feb 28, 2024
23 checks passed
@mustafasrepo mustafasrepo deleted the feature/remove_default_cache branch March 27, 2024 06:35
Michael-J-Ward added a commit to Michael-J-Ward/datafusion-python that referenced this pull request May 8, 2024
andygrove pushed a commit to apache/datafusion-python that referenced this pull request May 8, 2024
* deps: upgrade datafusion to 37.1.0

* feat: re-implement SessionContext::tables

The method was removed upstream but is used in many tests for `datafusion-python`.

Ref: apache/datafusion#9627

* feat: upgrade dataframe write_parquet and write_json

The options to write_parquet changed.

write_json has a new argument that I defaulted to None. We can expose that config later.

Ref: apache/datafusion#9382

* feat: impl new ExecutionPlanProperties for DatasetExec

Ref: apache/datafusion#9346

* feat: add upstream variant and method params

- `WindowFunction` and `AggregateFunction` have `null_treatment` options.
- `ScalarValue` and `DataType` have new variants
- `SchemaProvider::table` now returns a `Result`

* lint: allow(deprecated) for make_scalar_function

* feat: migrate functions.rs

`datafusion` completed an Epic that ported many of the `BuiltInFunctions` enum to `ScalarUDF`.

I created new macros to simplify the port, and used these macros to refactor a few existing functions.

Ref: apache/datafusion#9285

* fixme: commented out last failing test

This is a bug upstream in datafusion

FAILED datafusion/tests/test_functions.py::test_array_functions - pyo3_runtime.PanicException: range end index 9 out of range for slice of length 8

* chore: update Cargo.toml package info
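
The `ExecutionPlanProperties` item in the commit message above is a downstream consequence of the properties cache introduced in this PR. As a rough illustration of why caching removes the recursive getter calls, here is a toy sketch. `ToyPlan`, `ToyProperties`, `ToyScan`, `ToyFilter` and `build_chain` are all hypothetical names made up for this example and are not DataFusion's actual API.

```rust
use std::sync::Arc;

/// Stand-in for the cached plan properties (hypothetical, heavily simplified).
#[derive(Clone, Debug, PartialEq)]
struct ToyProperties {
    partitions: usize,
    ordered: bool,
}

/// Stand-in for an execution plan node: the getter only returns a reference.
trait ToyPlan {
    fn properties(&self) -> &ToyProperties;
}

/// Leaf node: its properties are fixed at construction.
struct ToyScan {
    cache: ToyProperties,
}

impl ToyScan {
    fn new(partitions: usize) -> Self {
        Self { cache: ToyProperties { partitions, ordered: false } }
    }
}

impl ToyPlan for ToyScan {
    fn properties(&self) -> &ToyProperties {
        &self.cache
    }
}

/// Unary node that passes its input's properties through unchanged.
struct ToyFilter {
    #[allow(dead_code)]
    input: Arc<dyn ToyPlan>,
    cache: ToyProperties,
}

impl ToyFilter {
    fn new(input: Arc<dyn ToyPlan>) -> Self {
        // Computed once, from the child's already-cached value. This is a
        // single non-recursive call, however deep the plan below is.
        let cache = input.properties().clone();
        Self { input, cache }
    }
}

impl ToyPlan for ToyFilter {
    fn properties(&self) -> &ToyProperties {
        // No call into `input` here, so the getter never walks the tree.
        &self.cache
    }
}

/// Build a scan wrapped in `depth` filters.
fn build_chain(depth: usize) -> Arc<dyn ToyPlan> {
    let mut plan: Arc<dyn ToyPlan> = Arc::new(ToyScan::new(4));
    for _ in 0..depth {
        plan = Arc::new(ToyFilter::new(plan));
    }
    plan
}
```

With this shape, calling `properties()` on the root of `build_chain(100)` is a single field access, whereas a getter that delegates to its input on every call would descend through all 100 nodes each time.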
Labels
api change: Changes the API exposed to users of the crate
core: Core DataFusion crate
physical-expr: Physical Expressions
sqllogictest: SQL Logic Tests (.slt)
Development

Successfully merging this pull request may close these issues.

Exponential nature of ExecutionPlans output_partitioning and equivalence_properties
6 participants