Fix the parsing result for the special double number #1621
Merged
+410
−265
Changes from 5 commits
Commits
14 commits
7befdca handle struct access chain by CompoundFieldAccess (goldmedal)
22b0746 scope unquoted hyphenated ident for specific dialect (goldmedal)
d9d0d06 add some documents (goldmedal)
0ed3a4a fix clippy (goldmedal)
e9c3305 add issue link (goldmedal)
b1d02f2 handle unquoted_hyphenated_identifier in the parser layer (goldmedal)
4cf2521 remove the support_unquoted_hyphenated_identifiers from tokenizer (goldmedal)
1fb4503 improve the doc (goldmedal)
b5c877b remove unused argument (goldmedal)
69385ea fix the doc (goldmedal)
08b89d5 fix fmt and clippy (goldmedal)
21cf922 Merge branch 'main' into fix/1619-fix-number-value (goldmedal)
6cbcfda fix conflict (goldmedal)
548e032 remove outdated document (goldmedal)
@@ -2964,6 +2964,115 @@ fn test_compound_expr() {
    }
}

#[test]
fn test_double_value() {
    // TODO: support double value for dialect that supports unquoted hyphenated identifiers
    // see issue: https://github.com/apache/datafusion-sqlparser-rs/issues/1622
    let dialects = all_dialects_where(|dialect| !dialect.support_unquoted_hyphenated_identifiers());
    let test_cases = vec![
        gen_number_case_with_sign("0."),
        gen_number_case_with_sign("0.0"),
        gen_number_case_with_sign("0000."),
        gen_number_case_with_sign("0000.00"),
        gen_number_case_with_sign(".0"),
        gen_number_case_with_sign(".00"),
        gen_number_case_with_sign("0e0"),
        gen_number_case_with_sign("0e+0"),
        gen_number_case_with_sign("0e-0"),
        gen_number_case_with_sign("0.e-0"),
        gen_number_case_with_sign("0.e+0"),
        gen_number_case_with_sign(".0e-0"),
        gen_number_case_with_sign(".0e+0"),
        gen_number_case_with_sign("00.0e+0"),
        gen_number_case_with_sign("00.0e-0"),
    ];

    for (input, expected) in test_cases {
        for (i, expr) in input.iter().enumerate() {
            if let Statement::Query(query) =
                dialects.one_statement_parses_to(&format!("SELECT {}", expr), "")
            {
                if let SetExpr::Select(select) = *query.body {
                    assert_eq!(expected[i], select.projection[0]);
                } else {
                    panic!("Expected a SELECT statement");
                }
            } else {
                panic!("Expected a SELECT statement");
            }
        }
    }
}

fn gen_number_case(value: &str) -> (Vec<String>, Vec<SelectItem>) {
    let input = vec![
        value.to_string(),
        format!("{} col_alias", value),
        format!("{} AS col_alias", value),
    ];
    let expected = vec![
        SelectItem::UnnamedExpr(Expr::Value(number(value))),
        SelectItem::ExprWithAlias {
            expr: Expr::Value(number(value)),
            alias: Ident::new("col_alias"),
        },
        SelectItem::ExprWithAlias {
            expr: Expr::Value(number(value)),
            alias: Ident::new("col_alias"),
        },
    ];
    (input, expected)
}

fn gen_sign_number_case(value: &str, op: UnaryOperator) -> (Vec<String>, Vec<SelectItem>) {
    match op {
        UnaryOperator::Plus | UnaryOperator::Minus => {}
        _ => panic!("Invalid sign"),
    }

    let input = vec![
        format!("{}{}", op, value),
        format!("{}{} col_alias", op, value),
        format!("{}{} AS col_alias", op, value),
    ];
    let expected = vec![
        SelectItem::UnnamedExpr(Expr::UnaryOp {
            op,
            expr: Box::new(Expr::Value(number(value))),
        }),
        SelectItem::ExprWithAlias {
            expr: Expr::UnaryOp {
                op,
                expr: Box::new(Expr::Value(number(value))),
            },
            alias: Ident::new("col_alias"),
        },
        SelectItem::ExprWithAlias {
            expr: Expr::UnaryOp {
                op,
                expr: Box::new(Expr::Value(number(value))),
            },
            alias: Ident::new("col_alias"),
        },
    ];
    (input, expected)
}

/// generate the test cases for signed and unsigned numbers
/// For example, given "0.0", the test cases will be:
/// - "0.0"
/// - "+0.0"
/// - "-0.0"
fn gen_number_case_with_sign(number: &str) -> (Vec<String>, Vec<SelectItem>) {
    let (mut input, mut expected) = gen_number_case(number);
    for op in [UnaryOperator::Plus, UnaryOperator::Minus] {
        let (input_sign, expected_sign) = gen_sign_number_case(number, op);
        input.extend(input_sign);
        expected.extend(expected_sign);
    }
    (input, expected)
}

#[test]
fn parse_negative_value() {
    let sql1 = "SELECT -1";

@@ -12470,6 +12579,41 @@ fn parse_composite_access_expr() {
    all_dialects_where(|d| d.supports_struct_literal()).verified_stmt(
        "SELECT * FROM t WHERE STRUCT(STRUCT(1 AS a, NULL AS b) AS c, NULL AS d).c.a IS NOT NULL",
    );
    let support_struct = all_dialects_where(|d| d.supports_struct_literal());
    let stmt = support_struct
        .verified_only_select("SELECT STRUCT(STRUCT(1 AS a, NULL AS b) AS c, NULL AS d).c.a");
    let expected = SelectItem::UnnamedExpr(Expr::CompoundFieldAccess {
        root: Box::new(Expr::Struct {
            values: vec![
                Expr::Named {
                    name: Ident::new("c"),
                    expr: Box::new(Expr::Struct {
                        values: vec![
                            Expr::Named {
                                name: Ident::new("a"),
                                expr: Box::new(Expr::Value(Number("1".parse().unwrap(), false))),
                            },
                            Expr::Named {
                                name: Ident::new("b"),
                                expr: Box::new(Expr::Value(Value::Null)),
                            },
                        ],
                        fields: vec![],
                    }),
                },
                Expr::Named {
                    name: Ident::new("d"),
                    expr: Box::new(Expr::Value(Value::Null)),
                },
            ],
            fields: vec![],
        }),
        access_chain: vec![
            AccessExpr::Dot(Expr::Identifier(Ident::new("c"))),
            AccessExpr::Dot(Expr::Identifier(Ident::new("a"))),
        ],
    });
    assert_eq!(stmt.projection[0], expected);
}

#[test]

Comment on lines +12580 to +12583:
The access chain for the struct literal is represented by
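To illustrate the shape asserted in the hunk above (a root expression plus a flat access chain), here is a minimal, self-contained sketch. `MiniExpr` and `to_sql` are hypothetical names invented for this illustration; they are not sqlparser's real `Expr` or its `Display` implementation, and the struct literal is kept as opaque SQL text for brevity.

```rust
/// Hypothetical, heavily simplified expression type for illustration only.
#[derive(Debug, Clone, PartialEq)]
enum MiniExpr {
    /// A bare identifier such as `c`.
    Identifier(String),
    /// Opaque stand-in for a struct literal, stored as its SQL text.
    StructLiteral(String),
    /// A root expression followed by a flat list of `.field` accesses.
    CompoundFieldAccess {
        root: Box<MiniExpr>,
        access_chain: Vec<MiniExpr>,
    },
}

/// Render the expression back to SQL text: the root first, then one
/// `.field` segment per entry in the access chain.
fn to_sql(expr: &MiniExpr) -> String {
    match expr {
        MiniExpr::Identifier(name) => name.clone(),
        MiniExpr::StructLiteral(sql) => sql.clone(),
        MiniExpr::CompoundFieldAccess { root, access_chain } => {
            let mut out = to_sql(root);
            for access in access_chain {
                out.push('.');
                out.push_str(&to_sql(access));
            }
            out
        }
    }
}
```

In this sketch the chain is a flat vector, so `.c.a` is two sibling entries on one root rather than two nested access nodes, which mirrors the `access_chain: vec![...]` in the expected value above.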
@goldmedal @ayman-sigma
I'm wondering if it would make sense to move this special-case handling out of the tokenizer and into the parser. Since this is a BigQuery-specific feature that only applies in a specific context, I feel it would be more complicated to pull off within the tokenizer without affecting other use cases/dialects. In the parser we have more context on when this syntax is valid, and a lot of that work was done in #1109.
(I imagine that, since the tokenizer would be left to parse decimals as usual, the consideration in #1622 would no longer apply?)
Essentially, I'm wondering if we can extend that work to cover this double number scenario. Using the example from #1598, it would mean that e.g. the following token stream
[Word("foo"), Minus, Number("123."), Word("bar")]
gets combined into
ObjectName(vec![Word("foo-123"), Word("bar")])
Would something like that be feasible?
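A rough sketch of that idea, under stated assumptions: the `Tok` enum and `parse_hyphenated_object_name` below are hypothetical stand-ins invented for illustration, not sqlparser's real `Token` type or parser API. The tokenizer output is left untouched (so `123.` is still a number token), and the merging happens only where an object name is expected.

```rust
/// Simplified stand-in for tokenizer output (not sqlparser's `Token`).
#[derive(Debug, Clone, PartialEq)]
enum Tok {
    Word(String),
    Minus,
    Number(String),
    Period,
}

/// Combine a token stream into the parts of a dotted object name,
/// merging `word - word` and `word - digits` into one hyphenated part.
/// A number token with a trailing period (e.g. "123.") closes the
/// current part, because the period is really the name separator.
/// Returns `None` if the tokens do not form such a name.
fn parse_hyphenated_object_name(tokens: &[Tok]) -> Option<Vec<String>> {
    let mut iter = tokens.iter();
    let mut parts: Vec<String> = Vec::new();

    // An object name must start with a word.
    let mut current = match iter.next()? {
        Tok::Word(w) => w.clone(),
        _ => return None,
    };

    while let Some(tok) = iter.next() {
        match tok {
            Tok::Minus => {
                current.push('-');
                match iter.next()? {
                    Tok::Word(w) => current.push_str(w),
                    Tok::Number(n) => {
                        let (digits, ends_part) = match n.strip_suffix('.') {
                            Some(d) => (d, true),
                            None => (n.as_str(), false),
                        };
                        // Only plain digit runs may appear inside an identifier.
                        if digits.is_empty() || !digits.chars().all(|c| c.is_ascii_digit()) {
                            return None;
                        }
                        current.push_str(digits);
                        if ends_part {
                            parts.push(std::mem::take(&mut current));
                            match iter.next()? {
                                Tok::Word(w) => current = w.clone(),
                                _ => return None,
                            }
                        }
                    }
                    _ => return None,
                }
            }
            Tok::Period => {
                parts.push(std::mem::take(&mut current));
                match iter.next()? {
                    Tok::Word(w) => current = w.clone(),
                    _ => return None,
                }
            }
            _ => return None,
        }
    }
    parts.push(current);
    Some(parts)
}
```

Because the merge only runs in a context where the parser already expects an object name, a `SELECT 0.` style literal elsewhere in the statement would keep going through the ordinary number path in this sketch.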
It's a good point. "Keeping the tokenizer parsing decimals as usual" makes sense to me.
I'll try it. Thanks.
Addressed it in b1d02f2
The double-value tests are enabled for BigQuery.
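For reference, the literal shapes exercised by the double-value tests (`0.`, `.0`, `0e+0`, `0.e-0`, and so on) can be summarized by a small recognizer. This is only a sketch: `is_number_literal` is a hypothetical helper written for this note, not the tokenizer's actual code, and accepting an uppercase `E` is an assumption.

```rust
/// Returns true when `s` is an unsigned numeric literal of the form
/// `digits* [ "." digits* ] [ ("e" | "E") ["+" | "-"] digits+ ]`
/// with at least one digit before the exponent.
fn is_number_literal(s: &str) -> bool {
    let bytes = s.as_bytes();
    let mut i = 0;

    // Integer part: zero or more digits.
    while i < bytes.len() && bytes[i].is_ascii_digit() {
        i += 1;
    }
    let int_digits = i;

    // Optional fraction: a period followed by zero or more digits.
    let mut frac_digits = 0;
    if i < bytes.len() && bytes[i] == b'.' {
        i += 1;
        let frac_start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        frac_digits = i - frac_start;
    }

    // A lone "." (or an empty string) is not a number.
    if int_digits + frac_digits == 0 {
        return false;
    }

    // Optional exponent: marker, optional sign, then at least one digit.
    if i < bytes.len() && (bytes[i] == b'e' || bytes[i] == b'E') {
        i += 1;
        if i < bytes.len() && (bytes[i] == b'+' || bytes[i] == b'-') {
            i += 1;
        }
        let exp_start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        if i == exp_start {
            return false;
        }
    }

    // Everything must have been consumed.
    i == bytes.len()
}
```

The leading `+` or `-` is deliberately not handled here, matching the tests, where a sign is expected to parse as a unary operator wrapped around the number value.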