Col-Waltz
Contributor

Which issue does this PR close?

Rationale for this change

The `common_sub_expression_eliminate` rule failed with the error
`SchemaError(FieldNotFound {field: <name>}, valid_fields: []})`
because the second application of `find_common_exprs` changed the schema.

As far as I understand, the problem comes from calling `find_common_exprs`
twice in sequence. The first call returns the original names as `aggr_expr`
and the changed names as `new_aggr_expr`. The second call takes only
`new_aggr_expr` into account, so if the first call has already changed the
names, the second call returns those changed names as `aggr_expr` (as if they
were the original ones) and puts them into the Projection logic.

What changes are included in this PR?

I used the existing `NamePreserver` mechanism to restore the original schema names and
to generate a Projection with the original names at the end of the aggregate optimization.
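
The save-and-restore idea can be sketched with a toy model. This is only an illustration under stated assumptions: `ToyExpr`, `ToySavedName`, and `extract_common` are hypothetical names invented here, not DataFusion's `Expr`, `NamePreserver`, or its CSE rewrite. The sketch shows that replacing a sub-expression changes the output name, and that wrapping the rewritten expression in an alias carrying the saved name brings the original name back.

```rust
// Toy expression type for illustration only; NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum ToyExpr {
    Column(String),
    Binary(Box<ToyExpr>, &'static str, Box<ToyExpr>),
    Alias(Box<ToyExpr>, String),
}

impl ToyExpr {
    /// Name this expression would get as an output column.
    fn schema_name(&self) -> String {
        match self {
            ToyExpr::Column(name) => name.clone(),
            ToyExpr::Binary(left, op, right) => {
                format!("{} {} {}", left.schema_name(), op, right.schema_name())
            }
            ToyExpr::Alias(_, name) => name.clone(),
        }
    }
}

/// Hypothetical helper: remembers an output name before a rewrite.
struct ToySavedName(String);

impl ToySavedName {
    fn save(expr: &ToyExpr) -> Self {
        ToySavedName(expr.schema_name())
    }

    /// Re-attaches the saved name as an alias if the rewrite changed it.
    fn restore(self, rewritten: ToyExpr) -> ToyExpr {
        if rewritten.schema_name() == self.0 {
            rewritten
        } else {
            ToyExpr::Alias(Box::new(rewritten), self.0)
        }
    }
}

/// Hypothetical rewrite: replaces every occurrence of `target` with a
/// reference to a column called `name`.
fn extract_common(expr: &ToyExpr, target: &ToyExpr, name: &str) -> ToyExpr {
    if expr == target {
        return ToyExpr::Column(name.to_string());
    }
    match expr {
        ToyExpr::Binary(left, op, right) => ToyExpr::Binary(
            Box::new(extract_common(left, target, name)),
            *op,
            Box::new(extract_common(right, target, name)),
        ),
        ToyExpr::Alias(inner, alias) => ToyExpr::Alias(
            Box::new(extract_common(inner, target, name)),
            alias.clone(),
        ),
        other => other.clone(),
    }
}
```

In this model, a projection built from the restored expressions exposes the same column names as before the rewrite, so lookups by the original name keep working.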

Are these changes tested?

Yes, these changes are tested, and I added a test in this commit.
The error does not occur only in PREPARE requests; I wrote a SELECT request that
causes the same error.

Are there any user-facing changes?

No, the change only fixes the optimization rule.

@github-actions github-actions bot added optimizer Optimizer rules sqllogictest SQL Logic Tests (.slt) labels May 16, 2025
.map(|expr| Some(name_preserver.save(expr)))
.collect::<Vec<_>>()
} else {
new_aggr_expr
Contributor

Is this the same as a vec of all None?

`vec![None; new_agg_expr.len()]`

Contributor Author

@Col-Waltz Col-Waltz May 16, 2025

Yes, it is, thanks for the comment. But this will not work, because `vec!` requires the value to implement the `Clone` trait, and `Option<SavedName>` does not. It seems easier to me to do this with `map` than to add the trait to `SavedName`.
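
The `Clone` constraint discussed here can be shown with a small standalone sketch. `Saved` is a hypothetical stand-in for a type without a `Clone` implementation (as the reply above says of `SavedName`); the two functions build a vector of `None` values without cloning anything.

```rust
// Hypothetical stand-in for a type that does not implement `Clone`.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct Saved(String);

/// Builds `n` `None` values by mapping over a range; no `Clone` bound needed.
fn nones_via_map(n: usize) -> Vec<Option<Saved>> {
    (0..n).map(|_| None).collect()
}

/// Same result with `std::iter::repeat_with`, which calls the closure once
/// per element instead of cloning a template value.
fn nones_via_repeat_with(n: usize) -> Vec<Option<Saved>> {
    std::iter::repeat_with(|| None).take(n).collect()
}

// This would not compile, because `vec![elem; n]` clones `elem` and
// `Option<Saved>` is only `Clone` when `Saved` is:
// let v: Vec<Option<Saved>> = vec![None; 3];
```

Either form avoids adding a `Clone` implementation just to construct a placeholder vector.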

Thank you for your contribution. Unfortunately, this pull request is stale because it has been open 60 days with no activity. Please remove the stale label or comment or this will be closed in 7 days.

@github-actions github-actions bot added the Stale PR has not had any activity for some time label Jul 18, 2025
@Col-Waltz
Contributor Author

Col-Waltz commented Jul 22, 2025

Hi @alamb, thanks again for your comment; please have a look at this.
I tried to make the code more compact, but that leads to additional trait implementations in other parts of the project, so I assumed it would be better to fix the problem this way until `find_common_exprs()` is improved as mentioned in the TODO.

Please let me know if this problem has already been solved, or remove the stale label from my PR.

@github-actions github-actions bot removed the Stale PR has not had any activity for some time label Jul 23, 2025
(2.0, 20, -5),
(3.0, 20, 4);

# https://github.com/apache/datafusion/issues/15291
Contributor

for some reason this query is failing for me

> WITH s AS (
    SELECT
        COUNT(a) FILTER (WHERE (b * b) - 3600 <= b),
        COUNT(a) FILTER (WHERE (b * b) - 3000 <= b AND (c >= 0)),
        COUNT(a) FILTER (WHERE (b * b) - 3000 <= b AND (c >= 0) AND (c >= 0))
    FROM t
) SELECT * FROM s;
🤔 Invalid statement: SQL error: ParserError("Expected: ), found: ( at Line: 3, Column: 25")

Contributor

Nm, I can get it to reproduce like this

> set datafusion.sql_parser.dialect = 'postgres';
0 row(s) fetched.
Elapsed 0.000 seconds.

> CREATE TABLE t (
  a DOUBLE,
  b BIGINT,
  c INT
) AS VALUES
(1.0, 10, -5),
(2.0, 20, -5),
(3.0, 20, 4);
0 row(s) fetched.
Elapsed 0.002 seconds.

> WITH s AS (
    SELECT
        COUNT(a) FILTER (WHERE (b * b) - 3600 <= b),
        COUNT(a) FILTER (WHERE (b * b) - 3000 <= b AND (c >= 0)),
        COUNT(a) FILTER (WHERE (b * b) - 3000 <= b AND (c >= 0) AND (c >= 0))
    FROM t
) SELECT * FROM s
;
Optimizer rule 'common_sub_expression_eliminate' failed
caused by
Schema error: No field named "count(t.a) FILTER (WHERE t.b * t.b - Int64(3600) <= t.b)". Did you mean 'count(t.a) FILTER (WHERE __common_expr_1 AS t.b * t.b - Int64(3600) <= t.b)'?.

@alamb
Contributor

alamb commented Jul 28, 2025

I am not sure, I merged up to main and started the CI again.

@Col-Waltz
Contributor Author

Hi @alamb, just a kind reminder: if all checks have passed and my test reproduces the issue, could we merge the changes, or are there still problems with my fix?

@alamb
Contributor

alamb commented Aug 26, 2025

Sorry for the delay -- I lost track of this PR. I merged up again from main and I will give it a more thorough review tomorrow

@alamb
Contributor

alamb commented Aug 26, 2025

🤖 ./gh_compare_branch_bench.sh Benchmark Script Running
Linux aal-dev 6.14.0-1014-gcp #15~24.04.1-Ubuntu SMP Fri Jul 25 23:26:08 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
Comparing fix_common_subexpression_eliminate (416e585) to 1756692 diff
BENCH_NAME=sql_planner
BENCH_COMMAND=cargo bench --bench sql_planner
BENCH_FILTER=
BENCH_BRANCH_NAME=fix_common_subexpression_eliminate
Results will be posted here when complete

@alamb
Contributor

alamb commented Aug 27, 2025

🤖: Benchmark completed

Details

group                                         fix_common_subexpression_eliminate     main
-----                                         ----------------------------------     ----
logical_aggregate_with_join                   1.01    638.4±4.71µs        ? ?/sec    1.00    634.3±3.86µs        ? ?/sec
logical_plan_optimize                         1.02     178.2±3.15s        ? ?/sec    1.00     175.4±1.57s        ? ?/sec
logical_select_all_from_1000                  1.00     11.5±0.09ms        ? ?/sec    1.00     11.4±0.06ms        ? ?/sec
logical_select_one_from_700                   1.01    420.4±2.35µs        ? ?/sec    1.00    417.6±1.38µs        ? ?/sec
logical_trivial_join_high_numbered_columns    1.01    379.5±5.09µs        ? ?/sec    1.00    376.3±6.60µs        ? ?/sec
logical_trivial_join_low_numbered_columns     1.01    365.5±2.58µs        ? ?/sec    1.00    362.0±2.66µs        ? ?/sec
physical_intersection                         1.01    845.5±3.64µs        ? ?/sec    1.00    840.3±2.90µs        ? ?/sec
physical_join_consider_sort                   1.01  1408.3±20.83µs        ? ?/sec    1.00   1396.1±8.94µs        ? ?/sec
physical_join_distinct                        1.01    355.5±4.82µs        ? ?/sec    1.00    352.4±1.13µs        ? ?/sec
physical_many_self_joins                      1.01     10.4±0.05ms        ? ?/sec    1.00     10.3±0.04ms        ? ?/sec
physical_plan_clickbench_all                  1.02    200.4±2.72ms        ? ?/sec    1.00    196.5±2.93ms        ? ?/sec
physical_plan_clickbench_q1                   1.00      2.6±0.06ms        ? ?/sec    1.00      2.6±0.20ms        ? ?/sec
physical_plan_clickbench_q10                  1.00      3.5±0.05ms        ? ?/sec    1.01      3.5±0.06ms        ? ?/sec
physical_plan_clickbench_q11                  1.00      3.7±0.06ms        ? ?/sec    1.02      3.8±0.05ms        ? ?/sec
physical_plan_clickbench_q12                  1.00      3.8±0.05ms        ? ?/sec    1.01      3.9±0.07ms        ? ?/sec
physical_plan_clickbench_q13                  1.00      3.5±0.04ms        ? ?/sec    1.01      3.5±0.05ms        ? ?/sec
physical_plan_clickbench_q14                  1.01      3.8±0.09ms        ? ?/sec    1.00      3.8±0.09ms        ? ?/sec
physical_plan_clickbench_q15                  1.00      3.6±0.06ms        ? ?/sec    1.03      3.7±0.07ms        ? ?/sec
physical_plan_clickbench_q16                  1.00      3.5±0.05ms        ? ?/sec    1.00      3.4±0.06ms        ? ?/sec
physical_plan_clickbench_q17                  1.01      3.6±0.05ms        ? ?/sec    1.00      3.5±0.05ms        ? ?/sec
physical_plan_clickbench_q18                  1.01      3.0±0.05ms        ? ?/sec    1.00      3.0±0.05ms        ? ?/sec
physical_plan_clickbench_q19                  1.00      4.0±0.09ms        ? ?/sec    1.01      4.0±0.10ms        ? ?/sec
physical_plan_clickbench_q2                   1.00      3.1±0.06ms        ? ?/sec    1.00      3.1±0.06ms        ? ?/sec
physical_plan_clickbench_q20                  1.02      2.7±0.05ms        ? ?/sec    1.00      2.7±0.04ms        ? ?/sec
physical_plan_clickbench_q21                  1.00      3.1±0.06ms        ? ?/sec    1.00      3.1±0.05ms        ? ?/sec
physical_plan_clickbench_q22                  1.00      3.8±0.11ms        ? ?/sec    1.00      3.8±0.08ms        ? ?/sec
physical_plan_clickbench_q23                  1.00      4.1±0.08ms        ? ?/sec    1.01      4.1±0.06ms        ? ?/sec
physical_plan_clickbench_q24                  1.01      4.7±0.09ms        ? ?/sec    1.00      4.6±0.09ms        ? ?/sec
physical_plan_clickbench_q25                  1.00      3.3±0.05ms        ? ?/sec    1.01      3.3±0.06ms        ? ?/sec
physical_plan_clickbench_q26                  1.00      3.0±0.05ms        ? ?/sec    1.00      3.0±0.05ms        ? ?/sec
physical_plan_clickbench_q27                  1.01      3.3±0.06ms        ? ?/sec    1.00      3.3±0.07ms        ? ?/sec
physical_plan_clickbench_q28                  1.01      4.1±0.09ms        ? ?/sec    1.00      4.1±0.10ms        ? ?/sec
physical_plan_clickbench_q29                  1.00      4.8±0.10ms        ? ?/sec    1.01      4.9±0.12ms        ? ?/sec
physical_plan_clickbench_q3                   1.00      3.0±0.10ms        ? ?/sec    1.00      3.0±0.05ms        ? ?/sec
physical_plan_clickbench_q30                  1.01     14.2±0.32ms        ? ?/sec    1.00     14.1±0.36ms        ? ?/sec
physical_plan_clickbench_q31                  1.03      4.2±0.11ms        ? ?/sec    1.00      4.0±0.09ms        ? ?/sec
physical_plan_clickbench_q32                  1.03      4.3±0.26ms        ? ?/sec    1.00      4.1±0.12ms        ? ?/sec
physical_plan_clickbench_q33                  1.00      3.5±0.07ms        ? ?/sec    1.01      3.5±0.08ms        ? ?/sec
physical_plan_clickbench_q34                  1.00      3.2±0.07ms        ? ?/sec    1.01      3.2±0.13ms        ? ?/sec
physical_plan_clickbench_q35                  1.00      3.3±0.07ms        ? ?/sec    1.01      3.3±0.07ms        ? ?/sec
physical_plan_clickbench_q36                  1.00      4.1±0.10ms        ? ?/sec    1.00      4.1±0.09ms        ? ?/sec
physical_plan_clickbench_q37                  1.01      4.2±0.10ms        ? ?/sec    1.00      4.1±0.10ms        ? ?/sec
physical_plan_clickbench_q38                  1.03      4.3±0.12ms        ? ?/sec    1.00      4.1±0.13ms        ? ?/sec
physical_plan_clickbench_q39                  1.01      4.1±0.11ms        ? ?/sec    1.00      4.0±0.11ms        ? ?/sec
physical_plan_clickbench_q4                   1.00      2.7±0.04ms        ? ?/sec    1.00      2.7±0.05ms        ? ?/sec
physical_plan_clickbench_q40                  1.04      4.8±0.18ms        ? ?/sec    1.00      4.6±0.11ms        ? ?/sec
physical_plan_clickbench_q41                  1.01      4.2±0.10ms        ? ?/sec    1.00      4.2±0.09ms        ? ?/sec
physical_plan_clickbench_q42                  1.02      4.1±0.08ms        ? ?/sec    1.00      4.0±0.08ms        ? ?/sec
physical_plan_clickbench_q43                  1.03      4.4±0.08ms        ? ?/sec    1.00      4.3±0.11ms        ? ?/sec
physical_plan_clickbench_q44                  1.00      2.8±0.06ms        ? ?/sec    1.00      2.8±0.05ms        ? ?/sec
physical_plan_clickbench_q45                  1.00      2.8±0.05ms        ? ?/sec    1.00      2.8±0.06ms        ? ?/sec
physical_plan_clickbench_q46                  1.00      3.3±0.05ms        ? ?/sec    1.00      3.3±0.04ms        ? ?/sec
physical_plan_clickbench_q47                  1.02      4.0±0.08ms        ? ?/sec    1.00      3.9±0.06ms        ? ?/sec
physical_plan_clickbench_q48                  1.05      4.8±0.23ms        ? ?/sec    1.00      4.6±0.06ms        ? ?/sec
physical_plan_clickbench_q49                  1.02      5.1±0.13ms        ? ?/sec    1.00      5.0±0.17ms        ? ?/sec
physical_plan_clickbench_q5                   1.00      2.9±0.04ms        ? ?/sec    1.00      2.9±0.04ms        ? ?/sec
physical_plan_clickbench_q50                  1.05      4.6±0.12ms        ? ?/sec    1.00      4.4±0.08ms        ? ?/sec
physical_plan_clickbench_q51                  1.00      3.4±0.06ms        ? ?/sec    1.01      3.4±0.15ms        ? ?/sec
physical_plan_clickbench_q6                   1.00      2.9±0.04ms        ? ?/sec    1.03      3.0±0.09ms        ? ?/sec
physical_plan_clickbench_q7                   1.00      2.5±0.03ms        ? ?/sec    1.04      2.6±0.03ms        ? ?/sec
physical_plan_clickbench_q8                   1.00      3.6±0.07ms        ? ?/sec    1.02      3.6±0.08ms        ? ?/sec
physical_plan_clickbench_q9                   1.00      3.4±0.06ms        ? ?/sec    1.01      3.4±0.06ms        ? ?/sec
physical_plan_tpcds_all                       1.00   1039.3±3.31ms        ? ?/sec    1.00   1044.3±3.04ms        ? ?/sec
physical_plan_tpch_all                        1.00     62.9±0.26ms        ? ?/sec    1.01     63.3±0.22ms        ? ?/sec
physical_plan_tpch_q1                         1.00      2.1±0.00ms        ? ?/sec    1.00      2.0±0.00ms        ? ?/sec
physical_plan_tpch_q10                        1.00      3.9±0.02ms        ? ?/sec    1.00      3.9±0.02ms        ? ?/sec
physical_plan_tpch_q11                        1.00      3.3±0.01ms        ? ?/sec    1.00      3.3±0.01ms        ? ?/sec
physical_plan_tpch_q12                        1.00  1830.3±10.37µs        ? ?/sec    1.01   1841.9±9.27µs        ? ?/sec
physical_plan_tpch_q13                        1.00   1462.2±7.82µs        ? ?/sec    1.01   1472.6±8.85µs        ? ?/sec
physical_plan_tpch_q14                        1.00   1977.6±8.28µs        ? ?/sec    1.00   1982.0±4.56µs        ? ?/sec
physical_plan_tpch_q16                        1.00      2.5±0.01ms        ? ?/sec    1.01      2.5±0.01ms        ? ?/sec
physical_plan_tpch_q17                        1.01      2.4±0.01ms        ? ?/sec    1.00      2.4±0.01ms        ? ?/sec
physical_plan_tpch_q18                        1.02      2.7±0.01ms        ? ?/sec    1.00      2.7±0.01ms        ? ?/sec
physical_plan_tpch_q19                        1.01      3.3±0.02ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
physical_plan_tpch_q2                         1.00      5.5±0.04ms        ? ?/sec    1.00      5.5±0.01ms        ? ?/sec
physical_plan_tpch_q20                        1.00      3.1±0.01ms        ? ?/sec    1.01      3.1±0.01ms        ? ?/sec
physical_plan_tpch_q21                        1.00      4.1±0.02ms        ? ?/sec    1.00      4.1±0.01ms        ? ?/sec
physical_plan_tpch_q22                        1.00      2.7±0.01ms        ? ?/sec    1.00      2.7±0.02ms        ? ?/sec
physical_plan_tpch_q3                         1.00      2.6±0.01ms        ? ?/sec    1.00      2.6±0.01ms        ? ?/sec
physical_plan_tpch_q4                         1.00   1536.1±9.42µs        ? ?/sec    1.00   1533.0±4.73µs        ? ?/sec
physical_plan_tpch_q5                         1.00      3.2±0.01ms        ? ?/sec    1.00      3.2±0.01ms        ? ?/sec
physical_plan_tpch_q6                         1.00    872.2±4.68µs        ? ?/sec    1.00    870.1±3.44µs        ? ?/sec
physical_plan_tpch_q7                         1.00      4.3±0.02ms        ? ?/sec    1.00      4.3±0.02ms        ? ?/sec
physical_plan_tpch_q8                         1.00      5.2±0.02ms        ? ?/sec    1.00      5.2±0.02ms        ? ?/sec
physical_plan_tpch_q9                         1.00      4.1±0.02ms        ? ?/sec    1.00      4.1±0.01ms        ? ?/sec
physical_select_aggregates_from_200           1.01     17.1±0.10ms        ? ?/sec    1.00     17.0±0.07ms        ? ?/sec
physical_select_all_from_1000                 1.00     25.0±0.08ms        ? ?/sec    1.01     25.1±0.12ms        ? ?/sec
physical_select_one_from_700                  1.01  1095.0±16.03µs        ? ?/sec    1.00   1085.4±7.55µs        ? ?/sec
physical_sorted_union_order_by_10             1.00     13.3±0.07ms        ? ?/sec    1.00     13.3±0.09ms        ? ?/sec
physical_sorted_union_order_by_100            1.01       2.1±0.03s        ? ?/sec    1.00       2.1±0.02s        ? ?/sec
physical_sorted_union_order_by_200            1.01      12.9±0.07s        ? ?/sec    1.00      12.8±0.06s        ? ?/sec
physical_sorted_union_order_by_300            1.01      39.1±0.17s        ? ?/sec    1.00      38.9±0.28s        ? ?/sec
physical_sorted_union_order_by_50             1.01    393.3±2.59ms        ? ?/sec    1.00    390.8±2.31ms        ? ?/sec
physical_theta_join_consider_sort             1.01   1763.6±4.99µs        ? ?/sec    1.00   1744.6±7.48µs        ? ?/sec
physical_unnest_to_join                       1.00  1320.2±19.10µs        ? ?/sec    1.00   1315.7±6.12µs        ? ?/sec
with_param_values_many_columns                1.00    144.6±2.18µs        ? ?/sec    1.00    144.1±2.33µs        ? ?/sec

Contributor

@Jefffrey Jefffrey left a comment

Looks in line with other parts of CSE:

// If there were common expressions extracted, then we need to
// make sure we restore the original column names.
// TODO: Although `find_common_exprs()` inserts aliases around
//       extracted common expressions this doesn't mean that the
//       original column names (schema) are preserved due to the
//       inserted aliases are not always at the top of the
//       expression.
//       Let's consider improving `find_common_exprs()` to always
//       keep column names and get rid of additional name
//       preserving logic here.
if let Some(aggr_expr) = aggr_expr {
    let name_preserver = NamePreserver::new_for_projection();
    let saved_names = aggr_expr
        .iter()
        .map(|expr| name_preserver.save(expr))
        .collect::<Vec<_>>();
    let new_aggr_expr = rewritten_aggr_expr
        .into_iter()
        .zip(saved_names)
        .map(|(new_expr, saved_name)| {
            saved_name.restore(new_expr)
        })
        .collect::<Vec<Expr>>();

// If there were common expressions extracted, then we need to make sure
// we restore the original column names.
// TODO: Although `find_common_exprs()` inserts aliases around extracted
//       common expressions this doesn't mean that the original column names
//       (schema) are preserved due to the inserted aliases are not always at
//       the top of the expression.
//       Let's consider improving `find_common_exprs()` to always keep column
//       names and get rid of additional name preserving logic here.
if let Some(window_expr_list) = window_expr_list {
    let name_preserver = NamePreserver::new_for_projection();
    let saved_names = window_expr_list
        .iter()
        .map(|exprs| {
            exprs
                .iter()
                .map(|expr| name_preserver.save(expr))
                .collect::<Vec<_>>()
        })
        .collect::<Vec<_>>();

So I think this is good to merge; perhaps copy those comments above in with this fix for visibility.

(btw what does find_common_exprs() refer to now; I can't grep that function 🤔 )

@alamb
Contributor

alamb commented Sep 25, 2025

I merged up from main and plan to merge this PR once CI passes. Thank you for your patience @Col-Waltz and for the review @Jefffrey

@alamb alamb added this pull request to the merge queue Sep 28, 2025
Merged via the queue into apache:main with commit 948f6b8 Sep 28, 2025
28 checks passed
Successfully merging this pull request may close these issues.

Failed optimizations with Int64 type
3 participants