Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MySQL datatype discrepancy between DDL and CAST #1589

Open
mvzink opened this issue Dec 11, 2024 · 0 comments
Open

MySQL datatype discrepancy between DDL and CAST #1589

mvzink opened this issue Dec 11, 2024 · 0 comments

Comments

@mvzink
Copy link
Contributor

mvzink commented Dec 11, 2024

In MySQL, creating a column expects integer unsigned (as an "attribute" on the integer type, see docs), while CAST expects unsigned integer or even just unsigned (see docs).

This can be seen in this transcript:

mysql> create table foo (x unsigned integer);
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'unsigned integer)' at line 1

mysql> create table foo (x integer unsigned);
Query OK, 0 rows affected (0.03 sec)

mysql> select cast(x as unsigned integer) from foo;
Empty set (0.01 sec)

mysql> select cast(x as integer unsigned) from foo;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'integer unsigned) from foo' at line 1

sqlparser just has one parse_data_type procedure used in both contexts, which currently parses according to the column datatype rules, which means it fails to parse CAST correctly.

Incorrectly failing to parse unsigned integer:

echo 'select cast(x as unsigned integer) from foo;' | cargo run --example cli - --mysql
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/examples/cli - --mysql`
Parsing from stdin using MySqlDialect
2024-12-11T19:37:28.945Z DEBUG [sqlparser::parser] Parsing sql 'select cast(x as unsigned integer) from foo;
'...
2024-12-11T19:37:28.945Z DEBUG [sqlparser::parser] parsing expr
2024-12-11T19:37:28.945Z DEBUG [sqlparser::parser] parsing expr
2024-12-11T19:37:28.946Z DEBUG [sqlparser::parser] prefix: Identifier(Ident { value: "x", quote_style: None, span: Span(Location(1,13)..Location(1,14)) })
2024-12-11T19:37:28.946Z DEBUG [sqlparser::dialect] get_next_precedence_full() TokenWithSpan { token: Word(Word { value: "as", quote_style: None, keyword: AS }), span: Span(Location(1,15)..Location(1,17)) }
2024-12-11T19:37:28.946Z DEBUG [sqlparser::parser] next precedence: 0
2024-12-11T19:37:28.946Z DEBUG [sqlparser::parser] parsing expr
2024-12-11T19:37:28.946Z DEBUG [sqlparser::parser] prefix: Identifier(Ident { value: "x", quote_style: None, span: Span(Location(1,13)..Location(1,14)) })
2024-12-11T19:37:28.946Z DEBUG [sqlparser::dialect] get_next_precedence_full() TokenWithSpan { token: Word(Word { value: "as", quote_style: None, keyword: AS }), span: Span(Location(1,15)..Location(1,17)) }
2024-12-11T19:37:28.946Z DEBUG [sqlparser::parser] next precedence: 0
Error during parsing: ParserError("Expected: ), found: integer at Line: 1, Column: 27")

Incorrectly succeeding in parsing integer unsigned:

echo 'select cast(x as integer unsigned) from foo;' | cargo run --example cli - --mysql
    Finished `dev` profile [unoptimized + debuginfo] target(s) in 0.01s
     Running `target/debug/examples/cli - --mysql`
Parsing from stdin using MySqlDialect
2024-12-11T19:38:23.359Z DEBUG [sqlparser::parser] Parsing sql 'select cast(x as integer unsigned) from foo;
'...
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] parsing expr
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] parsing expr
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] prefix: Identifier(Ident { value: "x", quote_style: None, span: Span(Location(1,13)..Location(1,14)) })
2024-12-11T19:38:23.360Z DEBUG [sqlparser::dialect] get_next_precedence_full() TokenWithSpan { token: Word(Word { value: "as", quote_style: None, keyword: AS }), span: Span(Location(1,15)..Location(1,17)) }
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] next precedence: 0
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] prefix: Cast { kind: Cast, expr: Identifier(Ident { value: "x", quote_style: None, span: Span(Location(1,13)..Location(1,14)) }), data_type: UnsignedInteger(None), format: None }
2024-12-11T19:38:23.360Z DEBUG [sqlparser::dialect] get_next_precedence_full() TokenWithSpan { token: Word(Word { value: "from", quote_style: None, keyword: FROM }), span: Span(Location(1,36)..Location(1,40)) }
2024-12-11T19:38:23.360Z DEBUG [sqlparser::parser] next precedence: 0
Round-trip:
'SELECT CAST(x AS INTEGER UNSIGNED) FROM foo'
Parse results:
[
    Query(
        Query {
            with: None,
            body: Select(
                Select {
                    select_token: TokenWithSpan {
                        token: Word(
                            Word {
                                value: "select",
                                quote_style: None,
                                keyword: SELECT,
                            },
                        ),
                        span: Span(Location(1,1)..Location(1,7)),
                    },
                    distinct: None,
                    top: None,
                    top_before_distinct: false,
                    projection: [
                        UnnamedExpr(
                            Cast {
                                kind: Cast,
                                expr: Identifier(
                                    Ident {
                                        value: "x",
                                        quote_style: None,
                                        span: Span(Location(1,13)..Location(1,14)),
                                    },
                                ),
                                data_type: UnsignedInteger(
                                    None,
                                ),
                                format: None,
                            },
                        ),
                    ],
                    into: None,
                    from: [
                        TableWithJoins {
                            relation: Table {
                                name: ObjectName(
                                    [
                                        Ident {
                                            value: "foo",
                                            quote_style: None,
                                            span: Span(Location(1,41)..Location(1,44)),
                                        },
                                    ],
                                ),
                                alias: None,
                                args: None,
                                with_hints: [],
                                version: None,
                                with_ordinality: false,
                                partitions: [],
                                json_path: None,
                            },
                            joins: [],
                        },
                    ],
                    lateral_views: [],
                    prewhere: None,
                    selection: None,
                    group_by: Expressions(
                        [],
                        [],
                    ),
                    cluster_by: [],
                    distribute_by: [],
                    sort_by: [],
                    having: None,
                    named_window: [],
                    qualify: None,
                    window_before_qualify: false,
                    value_table_mode: None,
                    connect_by: None,
                },
            ),
            order_by: None,
            limit: None,
            limit_by: [],
            offset: None,
            fetch: None,
            locks: [],
            for_clause: None,
            settings: None,
            format_clause: None,
        },
    ),
]

I don't know if it would be better to have a second, somewhat redundant parse_data_type_for_cast which we enter for mysql-alikes, or add some special casing in the existing parse_data_type. That may depend on whether there are other discrepancies between create and cast that need to be addressed. Not sure when I'll have time to address this, but any feedback on approach would be welcome.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant