Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect results with expression resolution #10413

Closed
alamb opened this issue May 7, 2024 · 4 comments · Fixed by #10459
Closed

Incorrect results with expression resolution #10413

alamb opened this issue May 7, 2024 · 4 comments · Fixed by #10459
Assignees
Labels
bug Something isn't working

Comments

@alamb
Copy link
Contributor

alamb commented May 7, 2024

Describe the bug

DataFusion will sometimes resolve expressions incorrectly when the alias shadows an expression

To Reproduce

DataFusion CLI v37.1.0
> select a + b from (select 1 as a, 2 as b, 1 as "a + b");
+-------+
| a + b |
+-------+
| 1     | <- Should be 3
+-------+
1 row(s) fetched.
Elapsed 0.009 seconds.

Expected behavior

Result should be 3

Additional context

This was found by @peter-toth in #10396

@alamb alamb added the bug Something isn't working label May 7, 2024
@alamb alamb changed the title Incorrect results with common subexpression elimination Incorrect results with expression resolution May 7, 2024
@alamb
Copy link
Contributor Author

alamb commented May 7, 2024

As @peter-toth mentions in #10396 (comment) this appears to be related to name resolution (not CSE). I checked with EXPLAIN VERBOSE and indeed the initial plan appears to be incorret

DataFusion CLI v37.1.0
> explain verbose select a + b from (select 1 as a, 2 as b, 1 as "a + b");
+------------------------------------------------------------+--------------------------------------------------------------------------------------------+
| plan_type                                                  | plan                                                                                       |
+------------------------------------------------------------+--------------------------------------------------------------------------------------------+
| initial_logical_plan                                       | Projection: a + b                                                                          |
|                                                            |   Projection: Int64(1) AS a, Int64(2) AS b, Int64(1) AS a + b                              |
|                                                            |     EmptyRelation                                                                          |
| logical_plan after apply_function_rewrites                 | SAME TEXT AS ABOVE                                                                         |
| logical_plan after inline_table_scan                       | SAME TEXT AS ABOVE                                                                         |
| logical_plan after type_coercion                           | SAME TEXT AS ABOVE                                                                         |
| logical_plan after count_wildcard_rule                     | SAME TEXT AS ABOVE                                                                         |
| analyzed_logical_plan                                      | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_nested_union                  | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                                                                         |
| logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                                                                         |
| logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                                                                         |
| logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_one_union                     | SAME TEXT AS ABOVE                                                                         |
| logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                                                                         |
| logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after optimize_projections                    | Projection: Int64(1) AS a + b                                                              |
|                                                            |   EmptyRelation                                                                            |
| logical_plan after eliminate_nested_union                  | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after replace_distinct_aggregate              | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_join                          | SAME TEXT AS ABOVE                                                                         |
| logical_plan after decorrelate_predicate_subquery          | SAME TEXT AS ABOVE                                                                         |
| logical_plan after scalar_subquery_to_join                 | SAME TEXT AS ABOVE                                                                         |
| logical_plan after extract_equijoin_predicate              | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after rewrite_disjunctive_predicate           | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_duplicated_expr               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_filter                        | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_cross_join                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_limit                         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after propagate_empty_relation                | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_one_union                     | SAME TEXT AS ABOVE                                                                         |
| logical_plan after filter_null_join_keys                   | SAME TEXT AS ABOVE                                                                         |
| logical_plan after eliminate_outer_join                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after push_down_limit                         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after push_down_filter                        | SAME TEXT AS ABOVE                                                                         |
| logical_plan after single_distinct_aggregation_to_group_by | SAME TEXT AS ABOVE                                                                         |
| logical_plan after simplify_expressions                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan after unwrap_cast_in_comparison               | SAME TEXT AS ABOVE                                                                         |
| logical_plan after common_sub_expression_eliminate         | SAME TEXT AS ABOVE                                                                         |
| logical_plan after optimize_projections                    | SAME TEXT AS ABOVE                                                                         |
| logical_plan                                               | Projection: Int64(1) AS a + b                                                              |
|                                                            |   EmptyRelation                                                                            |
| initial_physical_plan                                      | ProjectionExec: expr=[1 as a + b]                                                          |
|                                                            |   PlaceholderRowExec                                                                       |
|                                                            |                                                                                            |
| initial_physical_plan_with_stats                           | ProjectionExec: expr=[1 as a + b], statistics=[Rows=Exact(1), Bytes=Exact(8), [(Col[0]:)]] |
|                                                            |   PlaceholderRowExec, statistics=[Rows=Exact(1), Bytes=Exact(0), []]                       |
|                                                            |                                                                                            |
| physical_plan after OutputRequirements                     | OutputRequirementExec                                                                      |
|                                                            |   ProjectionExec: expr=[1 as a + b]                                                        |
|                                                            |     PlaceholderRowExec                                                                     |
|                                                            |                                                                                            |
| physical_plan after aggregate_statistics                   | SAME TEXT AS ABOVE                                                                         |
| physical_plan after join_selection                         | SAME TEXT AS ABOVE                                                                         |
| physical_plan after LimitedDistinctAggregation             | SAME TEXT AS ABOVE                                                                         |
| physical_plan after EnforceDistribution                    | SAME TEXT AS ABOVE                                                                         |
| physical_plan after CombinePartialFinalAggregate           | SAME TEXT AS ABOVE                                                                         |
| physical_plan after EnforceSorting                         | SAME TEXT AS ABOVE                                                                         |
| physical_plan after OptimizeAggregateOrder                 | SAME TEXT AS ABOVE                                                                         |
| physical_plan after ProjectionPushdown                     | SAME TEXT AS ABOVE                                                                         |
| physical_plan after coalesce_batches                       | SAME TEXT AS ABOVE                                                                         |
| physical_plan after OutputRequirements                     | ProjectionExec: expr=[1 as a + b]                                                          |
|                                                            |   PlaceholderRowExec                                                                       |
|                                                            |                                                                                            |
| physical_plan after PipelineChecker                        | SAME TEXT AS ABOVE                                                                         |
| physical_plan after LimitAggregation                       | SAME TEXT AS ABOVE                                                                         |
| physical_plan after ProjectionPushdown                     | SAME TEXT AS ABOVE                                                                         |
| physical_plan                                              | ProjectionExec: expr=[1 as a + b]                                                          |
|                                                            |   PlaceholderRowExec                                                                       |
|                                                            |                                                                                            |
| physical_plan_with_stats                                   | ProjectionExec: expr=[1 as a + b], statistics=[Rows=Exact(1), Bytes=Exact(8), [(Col[0]:)]] |
|                                                            |   PlaceholderRowExec, statistics=[Rows=Exact(1), Bytes=Exact(0), []]                       |
|                                                            |                                                                                            |
+------------------------------------------------------------+--------------------------------------------------------------------------------------------+
77 row(s) fetched.
Elapsed 0.032 seconds.

@jonahgao
Copy link
Member

It is caused by the columnize_expr function. Based on the comments, I guess columnize_expr is for special needs related to aggregation, and it was introduced in PR #55.

Maybe we can reimplement columnize_expr by directly comparing the aggregated output expressions instead of their display names. Let me try to fix this issue.

@jonahgao
Copy link
Member

take

@alamb
Copy link
Contributor Author

alamb commented May 10, 2024

Wow #55 is a very old one. Thank you @jonahgao -- this will be a great fix

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants