Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix: fix tpcds_logical_q8 ambiguous name. #5335

Merged
merged 2 commits into from
Feb 22, 2023
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
29 changes: 23 additions & 6 deletions datafusion/common/src/dfschema.rs
Original file line number Diff line number Diff line change
Expand Up @@ -284,12 +284,29 @@ impl DFSchema {
match matches.len() {
0 => Err(field_not_found(None, name, self)),
1 => Ok(matches[0]),
_ => Err(DataFusionError::SchemaError(
SchemaError::AmbiguousReference {
qualifier: None,
name: name.to_string(),
},
)),
_ => {
// When `matches` size > 1, it doesn't necessarily mean an `ambiguous name` problem.
// Because name may generate from Alias/... . It means that it don't own qualifier.
// For example:
// Join on id = b.id
// Project a.id as id TableScan b id
// In this case, there isn't `ambiguous name` problem. When `matches` just contains
// one field without qualifier, we should return it.
let fields_without_qualifier = matches
.iter()
.filter(|f| f.qualifier.is_none())
.collect::<Vec<_>>();
if fields_without_qualifier.len() == 1 {
Ok(fields_without_qualifier[0])
Comment on lines +295 to +300
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might be able to write this using find instead

Something like

Suggested change
let fields_without_qualifier = matches
.iter()
.filter(|f| f.qualifier.is_none())
.collect::<Vec<_>>();
if fields_without_qualifier.len() == 1 {
Ok(fields_without_qualifier[0])
if let Some(fields_without_qualifier) = matches
.iter()
.find(|f| f.qualifier.is_none()) {
Ok(fields_without_qualifier[0])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When fields_without_qualifier size > 1, we should still return a Error.
So we shouldn't use find

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense 👍

} else {
Err(DataFusionError::SchemaError(
SchemaError::AmbiguousReference {
qualifier: None,
name: name.to_string(),
},
))
}
}
}
}

Expand Down
2 changes: 1 addition & 1 deletion datafusion/optimizer/src/utils.rs
Original file line number Diff line number Diff line change
Expand Up @@ -416,7 +416,7 @@ pub fn only_or_err<T>(slice: &[T]) -> Result<&T> {
/// Rewrites `expr` using `rewriter`, ensuring that the output has the
/// same name as `expr` prior to rewrite, adding an alias if necessary.
///
/// This is important when optimzing plans to ensure the the output
/// This is important when optimizing plans to ensure the the output
/// schema of plan nodes don't change after optimization
pub fn rewrite_preserving_name<R>(expr: Expr, rewriter: &mut R) -> Result<Expr>
where
Expand Down
21 changes: 17 additions & 4 deletions datafusion/optimizer/tests/integration-test.rs
Original file line number Diff line number Diff line change
Expand Up @@ -320,6 +320,22 @@ fn push_down_filter_groupby_expr_contains_alias() {
assert_eq!(expected, format!("{plan:?}"));
}

#[test]
// issue: https://github.com/apache/arrow-datafusion/issues/5334
fn test_same_name_but_not_ambiguous() {
let sql = "SELECT t1.col_int32 AS col_int32 FROM test t1 intersect SELECT col_int32 FROM test t2";
let plan = test_sql(sql).unwrap();
let expected = "LeftSemi Join: col_int32 = t2.col_int32\
\n Distinct:\
\n Projection: t1.col_int32 AS col_int32\
\n SubqueryAlias: t1\
\n TableScan: test projection=[col_int32]\
\n Projection: t2.col_int32\
\n SubqueryAlias: t2\
\n TableScan: test projection=[col_int32]";
assert_eq!(expected, format!("{plan:?}"));
}

fn test_sql(sql: &str) -> Result<LogicalPlan> {
// parse the SQL
let dialect = GenericDialect {}; // or AnsiDialect, or your own dialect ...
Expand Down Expand Up @@ -348,10 +364,7 @@ struct MySchemaProvider {
}

impl ContextProvider for MySchemaProvider {
fn get_table_provider(
&self,
name: TableReference,
) -> datafusion_common::Result<Arc<dyn TableSource>> {
fn get_table_provider(&self, name: TableReference) -> Result<Arc<dyn TableSource>> {
let table_name = name.table();
if table_name.starts_with("test") {
let schema = Schema::new_with_metadata(
Expand Down