Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix ambiguous reference when aliasing in combination with ORDER BY #8425

Merged
merged 41 commits into from
Dec 6, 2023
Merged
Show file tree
Hide file tree
Changes from 39 commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
7afeb8b
Minor: Improve the document format of JoinHashMap
Asura7969 Nov 8, 2023
6332bec
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 10, 2023
cc5e0c7
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 10, 2023
a114310
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 11, 2023
928c811
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 11, 2023
839093e
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 12, 2023
a836cde
Merge remote-tracking branch 'origin/main'
Asura7969 Nov 13, 2023
5648dc7
Merge branch 'apache:main' into main
Asura7969 Nov 13, 2023
a670409
Merge branch 'apache:main' into main
Asura7969 Nov 14, 2023
22894a3
Merge branch 'apache:main' into main
Asura7969 Nov 14, 2023
73a59d2
Merge branch 'apache:main' into main
Asura7969 Nov 15, 2023
46409c2
Merge branch 'apache:main' into main
Asura7969 Nov 16, 2023
8a86a4c
Merge branch 'apache:main' into main
Asura7969 Nov 17, 2023
cf5c584
Merge branch 'apache:main' into main
Asura7969 Nov 17, 2023
62ae9b9
Merge branch 'apache:main' into main
Asura7969 Nov 19, 2023
da02fa2
Merge branch 'apache:main' into main
Asura7969 Nov 20, 2023
d98eb2e
Merge branch 'apache:main' into main
Asura7969 Nov 21, 2023
79e7216
Merge branch 'apache:main' into main
Asura7969 Nov 21, 2023
ba51abd
Merge branch 'apache:main' into main
Asura7969 Nov 23, 2023
2468f52
Merge branch 'apache:main' into main
Asura7969 Nov 23, 2023
180c303
Merge branch 'apache:main' into main
Asura7969 Nov 24, 2023
68980ba
Merge branch 'apache:main' into main
Asura7969 Nov 27, 2023
9411940
Merge branch 'apache:main' into main
Asura7969 Nov 27, 2023
ba28346
Merge branch 'apache:main' into main
Asura7969 Nov 28, 2023
df0942f
Merge branch 'apache:main' into main
Asura7969 Nov 29, 2023
edccb66
Merge branch 'apache:main' into main
Asura7969 Nov 29, 2023
fb74b99
Merge branch 'apache:main' into main
Asura7969 Nov 30, 2023
767b004
Merge branch 'apache:main' into main
Asura7969 Dec 1, 2023
2e0eef5
Merge branch 'apache:main' into main
Asura7969 Dec 2, 2023
749e0c8
Merge branch 'apache:main' into main
Asura7969 Dec 3, 2023
9e65482
ambiguous reference
Asura7969 Dec 4, 2023
443595c
ignore aliases
Asura7969 Dec 5, 2023
5d43a94
Merge branch 'apache:main' into main
Asura7969 Dec 5, 2023
b6d85d7
Merge branch 'main' into ambiguous_reference
Asura7969 Dec 5, 2023
e6f144e
Merge remote-tracking branch 'origin/ambiguous_reference' into ambigu…
Asura7969 Dec 5, 2023
dd5af6d
fix
Asura7969 Dec 5, 2023
d755ce6
fix
Asura7969 Dec 5, 2023
a338ac2
add test
Asura7969 Dec 6, 2023
47b81e6
add test
Asura7969 Dec 6, 2023
2f39667
add test
Asura7969 Dec 6, 2023
0f65aca
add test
Asura7969 Dec 6, 2023
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
9 changes: 4 additions & 5 deletions datafusion/optimizer/tests/optimizer_integration.rs
Original file line number Diff line number Diff line change
Expand Up @@ -324,11 +324,10 @@ fn push_down_filter_groupby_expr_contains_alias() {
fn test_same_name_but_not_ambiguous() {
let sql = "SELECT t1.col_int32 AS col_int32 FROM test t1 intersect SELECT col_int32 FROM test t2";
let plan = test_sql(sql).unwrap();
let expected = "LeftSemi Join: col_int32 = t2.col_int32\
\n Aggregate: groupBy=[[col_int32]], aggr=[[]]\
\n Projection: t1.col_int32 AS col_int32\
\n SubqueryAlias: t1\
\n TableScan: test projection=[col_int32]\
let expected = "LeftSemi Join: t1.col_int32 = t2.col_int32\
\n Aggregate: groupBy=[[t1.col_int32]], aggr=[[]]\
\n SubqueryAlias: t1\
\n TableScan: test projection=[col_int32]\
\n SubqueryAlias: t2\
\n TableScan: test projection=[col_int32]";
assert_eq!(expected, format!("{plan:?}"));
Expand Down
7 changes: 6 additions & 1 deletion datafusion/sql/src/select.rs
Original file line number Diff line number Diff line change
Expand Up @@ -384,7 +384,12 @@ impl<'a, S: ContextProvider> SqlToRel<'a, S> {
&[&[plan.schema()]],
&plan.using_columns()?,
)?;
let expr = col.alias(self.normalizer.normalize(alias));
let name = self.normalizer.normalize(alias);
// avoiding adding an alias if the column name is the same.
let expr = match &col {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add a comment here about what this is doing? I think it is avoiding adding an alias if the column name is the same.

Do we have to worry about relation names here too (like what if column.relation is non null?)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand, but I didn't expect the scenario where column.relation is inconsistent🤔

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking of a query like

EXPLAIN select a as a FROM table1 CROSS JOIN table2 order by a

Where both table1 and table2 have a column named a. I think we should add a test for this case too -- I'll try and do so later today

I would expect the test to fail

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you expected

DataFusion CLI v33.0.0
❯ CREATE TABLE table1 (a int) as values(1);
0 rows in set. Query took 0.037 seconds.

❯ CREATE TABLE table2 (a int) as values(1);
0 rows in set. Query took 0.012 seconds.

❯ EXPLAIN select a as a FROM table1 CROSS JOIN table2 order by a;
Schema error: Ambiguous reference to unqualified field a

Expr::Column(column) if column.name.eq(&name) => col,
_ => col.alias(name),
};
Ok(vec![expr])
}
SelectItem::Wildcard(options) => {
Expand Down
17 changes: 14 additions & 3 deletions datafusion/sql/tests/sql_integration.rs
Original file line number Diff line number Diff line change
Expand Up @@ -3546,13 +3546,24 @@ fn test_select_unsupported_syntax_errors(#[case] sql: &str, #[case] error: &str)
fn select_order_by_with_cast() {
let sql =
"SELECT first_name AS first_name FROM (SELECT first_name AS first_name FROM person) ORDER BY CAST(first_name as INT)";
let expected = "Sort: CAST(first_name AS first_name AS Int32) ASC NULLS LAST\
\n Projection: first_name AS first_name\
\n Projection: person.first_name AS first_name\
let expected = "Sort: CAST(person.first_name AS Int32) ASC NULLS LAST\
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this certainly looks better

\n Projection: person.first_name\
\n Projection: person.first_name\
\n TableScan: person";
quick_test(sql, expected);
}

#[test]
fn test_avoid_add_alias() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ensure logical consistency

// avoiding adding an alias if the column name is the same.
// plan1 = plan2
let sql = "select person.id as id from person order by person.id";
let plan1 = logical_plan(sql).unwrap();
let sql = "select id from person order by id";
let plan2 = logical_plan(sql).unwrap();
assert_eq!(format!("{plan1:?}"), format!("{plan2:?}"));
}

#[test]
fn test_duplicated_left_join_key_inner_join() {
// person.id * 2 happen twice in left side.
Expand Down
17 changes: 17 additions & 0 deletions datafusion/sqllogictest/test_files/select.slt
Original file line number Diff line number Diff line change
Expand Up @@ -868,6 +868,23 @@ statement error DataFusion error: Error during planning: EXCLUDE or EXCEPT conta
SELECT * EXCLUDE(d, b, c, a, a, b, c, d)
FROM table1

# avoiding adding an alias if the column name is the same
query I
select a as a FROM table1 order by a
----
1
2

query TT
EXPLAIN select a as a FROM table1 order by a
----
logical_plan
Sort: table1.a ASC NULLS LAST
--TableScan: table1 projection=[a]
physical_plan
SortExec: expr=[a@0 ASC NULLS LAST]
--MemoryExec: partitions=1, partition_sizes=[1]

# run below query in multi partitions
statement ok
set datafusion.execution.target_partitions = 2;
Expand Down