Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parser does not fail on invalid query #8057

Closed
shlomi-noach opened this issue May 6, 2021 · 12 comments
Closed

Parser does not fail on invalid query #8057

shlomi-noach opened this issue May 6, 2021 · 12 comments

Comments

@shlomi-noach
Copy link
Contributor

Based on #8055

The following query:

alter table corder zzzz zzzz zzzz

should fail parsing. However, it parses without error, and on top of that, claims isFullyParsed() == true.

@shlomi-noach
Copy link
Contributor Author

On vitess:

mysql> set @@ddl_strategy='direct';
Query OK, 0 rows affected (0.00 sec)

mysql> alter table corder zzzz zzzz zzzz;
Query OK, 0 rows affected (0.01 sec)

On MySQL:

mysql> alter table demo zzzz zzzz zzzz;
ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'zzzz zzzz zzzz' at line 1

@shlomi-noach
Copy link
Contributor Author

I traced the problem down to this path:

vitess/go/vt/sqlparser/sql.y

Lines 1869 to 1871 in 9cc166f

{
$$ = nil
}

@shlomi-noach
Copy link
Contributor Author

alter_commands_list allows just about anything. I'm not sure why that happens. The following query is also valid:

alter table corder zzzz zzzz zzzz remove partitioning;

in this path:

vitess/go/vt/sqlparser/sql.y

Lines 2052 to 2058 in 9cc166f

| alter_table_prefix alter_commands_list REMOVE PARTITIONING
{
$1.FullyParsed = true
$1.AlterOptions = $2
$1.PartitionSpec = &PartitionSpec{Action:RemoveAction}
$$ = $1
}

@shlomi-noach
Copy link
Contributor Author

I also notice this line gets executed on such invalid SQL:

log.Warningf("ignoring error parsing DDL '%s': %v", sql, tokenizer.LastError)

Other samples of wrongly accepted syntax:

alter table corder engine=innodb zzzz zzzz;
alter table corder drop column c zzzz zzzz;

@GuptaManan100
Copy link
Member

GuptaManan100 commented May 10, 2021

alter_command_list does not allow anything, it allows empty values. In the input alter table corder zzzz zzzz zzzz only alter table corder part is used in parsing and the remaining zzzz zzzz zzzz is not. alter_commands_list takes an empty value.

Similarily, in the input alter table corder zzzz zzzz zzzz remove partitioning; only the initial part is used, the remaining parts zzzz zzzz zzzz remove partioning is not used. This can be seen from the AST node produced in the end, had it gone down the path

vitess/go/vt/sqlparser/sql.y

Lines 2052 to 2058 in 9cc166f

| alter_table_prefix alter_commands_list REMOVE PARTITIONING
{
$1.FullyParsed = true
$1.AlterOptions = $2
$1.PartitionSpec = &PartitionSpec{Action:RemoveAction}
$$ = $1
}
then the PartitionSpec field would not be null, but it is.

The reason that this behavior happens is that when the parser sees the zzzz token, it does not match a keyword and there is no rule to accept an id. It keeps on reducing all the way to the top of the stack and the remaining input is left unused i.e the parsing is finished but there are tokens left. This can be seen from the lasttoken field of the tokenizer which is set to zzzz instead of empty string.

Since only the initial tokens are used in parsing, the query becomes valid and therefore FullyParsed is set to true

@GuptaManan100
Copy link
Member

I think the error lies in the lines

if typ, val := tokenizer.Scan(); typ != 0 {
return nil, nil, fmt.Errorf("extra characters encountered after end of DDL: '%s'", string(val))
}
. reading the error message, it seems that these lines should be catching the case where there is input which has not been used. The lastError field of the tokenizer indead shows that there is a syntax error at position 24 near zzzz. These lines do not catch this error and the errors are ignored in the subsequent lines. As to why tokenizer.Scan() returns 0 in these lines is still left to investigate

@shlomi-noach
Copy link
Contributor Author

Since only the initial tokens are used in parsing, the query becomes valid and therefore FullyParsed is set to true

I think "only the initial tokens are used in parsing" necessarily implies that FullyParsed should be set to false. What do you think?

@GuptaManan100
Copy link
Member

Yes, I agree with you, in this place where we are catching the errors, we should be setting the FullyParsed value to false.

if typ, val := tokenizer.Scan(); typ != 0 {
return nil, nil, fmt.Errorf("extra characters encountered after end of DDL: '%s'", string(val))
}
log.Warningf("ignoring error parsing DDL '%s': %v", sql, tokenizer.LastError)
tokenizer.ParseTree = tokenizer.partialDDL
. WDYT?

@shlomi-noach
Copy link
Contributor Author

Yes, that makes sense to me!

@shlomi-noach
Copy link
Contributor Author

It's funny though that Parse2 is not specific to DDLs though.

@GuptaManan100
Copy link
Member

Yes, it isn't but we set the PartialDDLs only for DDLs, so it should be okay. I'll get the PR ready ASAP

@shlomi-noach
Copy link
Contributor Author

Fixed by #8094

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants