-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Description
Describe the bug
This is a followup to #13748 which was largely addressed by #16217 however I've noticed in my testing that with large enough # of columns planning time is still growing exponentially. Below is the output from a modified version of the sql_planner using a branch from DF 49:
physical_sorted_union_order_by_10
time: [12.089 ms 12.105 ms 12.125 ms]
change: [+6.4475% +6.6567% +6.8895%] (p = 0.00 < 0.05)
Performance has regressed.
physical_sorted_union_order_by_50
time: [323.92 ms 324.75 ms 325.53 ms]
change: [+6.5657% +6.8547% +7.1319%] (p = 0.00 < 0.05)
Performance has regressed.
Benchmarking physical_sorted_union_order_by_100: Warming up for 30.000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 18.5s.
physical_sorted_union_order_by_100
time: [1.7079 s 1.7143 s 1.7221 s]
change: [+6.5417% +6.9412% +7.5811%] (p = 0.00 < 0.05)
Performance has regressed.
Found 1 outliers among 10 measurements (10.00%)
1 (10.00%) high mild
Benchmarking physical_sorted_union_order_by_200: Warming up for 30.000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 118.6s.
physical_sorted_union_order_by_200
time: [10.271 s 10.321 s 10.400 s]
change: [+4.3930% +4.9196% +5.7641%] (p = 0.00 < 0.05)
Performance has regressed.
Found 3 outliers among 10 measurements (30.00%)
2 (20.00%) low mild
1 (10.00%) high severe
Benchmarking physical_sorted_union_order_by_500: Warming up for 30.000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 1350.0s.
physical_sorted_union_order_by_500
time: [129.75 s 131.00 s 132.82 s]
Ignored the regressed messages - tests are +-10% on my machine between runs. Note the time for the 500 columns.
Below is the output from one of the benchmark runs with 500 columns where I instrumented the logical plan optimize and physical plan optimizer to output timings:
Benchmarking physical_sorted_union_order_by_500: Collecting 10 samples in estimated 1202.6 s (10 iterations)[2025-08-20T14:03:18Z INFO datafusion_optimizer::analyzer] Analyzer::execute_and_check rule resolve_grouping_function took 46 ms
[2025-08-20T14:03:20Z INFO datafusion_optimizer::analyzer] Analyzer::execute_and_check rule type_coercion took 2007 ms
[2025-08-20T14:03:20Z INFO datafusion_optimizer::analyzer] Analyzer::execute_and_check rule processing took 2054 ms
[2025-08-20T14:03:20Z INFO datafusion_optimizer::analyzer] Analyzer::execute_and_check took 2129 ms
[2025-08-20T14:03:21Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_nested_union took > 50ms: 1544 ms
[2025-08-20T14:03:22Z INFO datafusion_optimizer::optimizer] Optimization for rule simplify_expressions took > 50ms: 823 ms
[2025-08-20T14:03:22Z INFO datafusion_optimizer::optimizer] Optimization for rule replace_distinct_aggregate took > 50ms: 66 ms
[2025-08-20T14:03:22Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_join took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule decorrelate_predicate_subquery took > 50ms: 130 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule scalar_subquery_to_join took > 50ms: 86 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule decorrelate_lateral_join took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule extract_equijoin_predicate took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_duplicated_expr took > 50ms: 68 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_filter took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_limit took > 50ms: 66 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule propagate_empty_relation took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_one_union took > 50ms: 68 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule filter_null_join_keys took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_outer_join took > 50ms: 68 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule push_down_limit took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule push_down_filter took > 50ms: 67 ms
[2025-08-20T14:03:23Z INFO datafusion_optimizer::optimizer] Optimization for rule single_distinct_aggregation_to_group_by took > 50ms: 67 ms
[2025-08-20T14:03:24Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_group_by_constant took > 50ms: 68 ms
[2025-08-20T14:03:24Z INFO datafusion_optimizer::optimizer] Optimization for rule common_sub_expression_eliminate took > 50ms: 138 ms
[2025-08-20T14:03:25Z INFO datafusion_optimizer::optimizer] Optimization for rule optimize_projections took > 50ms: 1721 ms
[2025-08-20T14:03:26Z INFO datafusion_optimizer::optimizer] Processing time for pass 0 was 5561ms
[2025-08-20T14:03:27Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_nested_union took > 50ms: 1648 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule simplify_expressions took > 50ms: 804 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule replace_distinct_aggregate took > 50ms: 72 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_join took > 50ms: 75 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule decorrelate_predicate_subquery took > 50ms: 135 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule scalar_subquery_to_join took > 50ms: 91 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule decorrelate_lateral_join took > 50ms: 76 ms
[2025-08-20T14:03:28Z INFO datafusion_optimizer::optimizer] Optimization for rule extract_equijoin_predicate took > 50ms: 72 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_duplicated_expr took > 50ms: 75 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_filter took > 50ms: 72 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_limit took > 50ms: 74 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule propagate_empty_relation took > 50ms: 72 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_one_union took > 50ms: 74 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule filter_null_join_keys took > 50ms: 72 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_outer_join took > 50ms: 74 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule push_down_limit took > 50ms: 71 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule push_down_filter took > 50ms: 75 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule single_distinct_aggregation_to_group_by took > 50ms: 73 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule eliminate_group_by_constant took > 50ms: 75 ms
[2025-08-20T14:03:29Z INFO datafusion_optimizer::optimizer] Optimization for rule common_sub_expression_eliminate took > 50ms: 175 ms
[2025-08-20T14:03:31Z INFO datafusion_optimizer::optimizer] Optimization for rule optimize_projections took > 50ms: 1745 ms
[2025-08-20T14:03:31Z INFO datafusion_optimizer::optimizer] optimizer pass 1 did not make changes. Processing time: 5826 ms
[2025-08-20T14:03:31Z INFO datafusion_optimizer::optimizer] Processing time for 3 passes was 11473ms
[2025-08-20T14:03:31Z INFO datafusion_optimizer::optimizer] Optimizer took 11559 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Creating initial physical plan took 15814 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Physical plan optimizer OutputRequirements took 0 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Physical plan optimizer aggregate_statistics took 0 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Physical plan optimizer join_selection took 0 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Physical plan optimizer LimitedDistinctAggregation took 0 ms
[2025-08-20T14:03:47Z INFO datafusion::physical_planner] Physical plan optimizer FilterPushdown took 18 ms
[2025-08-20T14:04:52Z INFO datafusion::physical_planner] Physical plan optimizer EnforceDistribution took 64265 ms
[2025-08-20T14:04:52Z INFO datafusion::physical_planner] Physical plan optimizer CombinePartialFinalAggregate took 0 ms
[2025-08-20T14:05:02Z INFO datafusion::physical_planner] Physical plan optimizer EnforceSorting took 10527 ms
[2025-08-20T14:05:02Z INFO datafusion::physical_planner] Physical plan optimizer OptimizeAggregateOrder took 0 ms
[2025-08-20T14:05:04Z INFO datafusion::physical_planner] Physical plan optimizer ProjectionPushdown took 2373 ms
[2025-08-20T14:05:04Z INFO datafusion::physical_planner] Physical plan optimizer coalesce_batches took 3 ms
[2025-08-20T14:05:04Z INFO datafusion::physical_planner] Physical plan optimizer coalesce_async_exec_input took 0 ms
[2025-08-20T14:05:04Z INFO datafusion::physical_planner] Physical plan optimizer OutputRequirements took 0 ms
[2025-08-20T14:05:04Z INFO datafusion::physical_planner] Physical plan optimizer LimitAggregation took 0 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Physical plan optimizer LimitPushdown took 2122 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Physical plan optimizer ProjectionPushdown took 5 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Physical plan optimizer EnsureCooperative took 0 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Physical plan optimizer FilterPushdown(Post) took 79 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Physical plan optimizer SanityCheckPlan took 63 ms
[2025-08-20T14:05:07Z INFO datafusion::physical_planner] Optimize physical plan took 79736 ms
Note the times for the logical plan optimizer eliminate_nested_union
and the physical plan initial plan create_initial_plan
and optimizers EnforceDistribution
, EnforceSorting
, ProjectionPushdown
and LimitPushdown
.
To Reproduce
No response
Expected behavior
Query planning should be much faster.
Additional context
No response