-
Notifications
You must be signed in to change notification settings - Fork 554
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix the parsing result for the special double number #1621
Conversation
let support_struct = all_dialects_where(|d| d.supports_struct_literal()); | ||
let stmt = support_struct | ||
.verified_only_select("SELECT STRUCT(STRUCT(1 AS a, NULL AS b) AS c, NULL AS d).c.a"); | ||
let expected = SelectItem::UnnamedExpr(Expr::CompoundFieldAccess { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The access chain for the struct literal is represented by CompoundFieldAccess
currently. See #1551 for the details.
cc @ayman-sigma
Tested with DataFusion by goldmedal/datafusion#2 |
src/tokenizer.rs
Outdated
.map(|c| c.is_ascii_digit()) | ||
.unwrap_or(false) | ||
{ | ||
if self.dialect.support_unquoted_hyphenated_identifiers() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@goldmedal @ayman-sigma
I'm wondering if it would make sense to move this special case handling from the tokenizer and into the parser. Since this is a BigQuery specific feature and only applies in a specific context, I feel like it would be more complicated to pull off within the tokenizer without affecting other use cases/dialects. Thinking in the parser we have more context on when this syntax is valid and a lot of that work was done in #1109
(I'm imagining since the tokenizer would be left to parse decimals as usual, we won't have the consideration in #1622 ?)
Essentially I'm wondering if we can extend the work to cover this scenario doublle number scenario. Using the example from #1598 it would essentially mean that e.g. the following token stream
[Word("foo"), Minus, Number("123."), Word("bar")]
gets combined into
ObjectName(vec![Word("foo-123"), Word("bar")])
would something like that be feasible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Essentially I'm wondering if we can extend the work to cover this scenario doublle number scenario. Using the example from #1598 it would essentially mean that e.g. the following token stream
[Word("foo"), Minus, Number("123."), Word("bar")]
gets combined into
ObjectName(vec![Word("foo-123"), Word("bar")])
would something like that be feasible?
It's a good point. "Keeping tokenizer parse decimals as usual" makes sense to me.
I'll try it. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed it in b1d02f2
The double-value tests are enabled for BigQuery.
/// similar table clause. Currently, this is used only to support unquoted hyphenated identifiers in | ||
// this context on BigQuery. | ||
pub fn parse_identifier(&mut self, in_table_clause: bool) -> Result<Ident, ParserError> { | ||
pub fn parse_identifier(&mut self) -> Result<Ident, ParserError> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be a breaking change for the downstream project using this public method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Thanks for fixing this @goldmedal!
cc @alamb
Thanks @iffyio 👍 |
🚀 |
close #1619
close #1622
Changed
CompoundFieldAccess
Introducesupport_unquoted_hyphenated_identifiers
to scope the change from Fix BigQuery hyphenated ObjectName with numbers #1598 to BigQuery only. See previous discussions for the details.Breaking change
Parser::parse_identifier
.