Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

parser: fix the compatibility problem of UNION statement #6335

Merged
merged 29 commits into from
May 14, 2018

Conversation

spongedu
Copy link
Contributor

Fix issue #6092

@sre-bot
Copy link
Contributor

sre-bot commented Apr 22, 2018

Hi contributor, thanks for your PR.

This patch needs to be approved by someone of admins. They should reply with "/ok-to-test" to accept this PR for running test automatically.

parser/parser.y Outdated
@@ -4548,16 +4548,104 @@ SelectLockOpt:
$$ = ast.SelectLockInShareMode
}

// See https://dev.mysql.com/doc/refman/5.7/en/union.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why remove this comment?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By mistake..

parser/parser.y Outdated
{
sel := &ast.SelectStmt {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we wrap these code into a function, and reuse it in select statement and here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a better way, I'm working on this. But the modification in selectStmt bring in some
shift/reduce, reduce/reduce conflicts, I'm trying to figure them out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From MySQL document https://dev.mysql.com/doc/refman/5.7/en/union.html:

To use an ORDER BY or LIMIT clause to sort or limit the entire UNION result, parenthesize the individual SELECT statements and place the ORDER BY or LIMIT after the last one. The following example uses both clauses:

(SELECT a FROM t1 WHERE a=10 AND B=1) UNION (SELECT a FROM t2 WHERE a=11 AND B=2) ORDER BY a LIMIT 10;

A statement without parentheses is equivalent to one parenthesized as just shown.

Maybe we should also consider the LIMIT clause.

Copy link
Contributor Author

@spongedu spongedu Apr 24, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@zz-jason

  1. LIMIT was done in this pr, too.
  2. In MySQL, LIMIT, ORDER BY is not allowed in the first few SelectStmts. Maybe we should handle this too ?
mysql> CREATE TABLE IF NOT EXISTS `t` (
    ->   `id` int(6) unsigned NOT NULL,
    ->   `tn` int(3) unsigned NOT NULL,
    ->   `type` varchar(200) NOT NULL,
    ->   PRIMARY KEY (`id`,`tn`)
    -> ) DEFAULT CHARSET=utf8;
Query OK, 0 rows affected (0.05 sec)

mysql> INSERT INTO `t` (`id`, `tn`, `type`) VALUES
    ->   ('1', '5', 'cat'),
    ->   ('2', '6', 'dog'),
    ->   ('3', '7', 'duck'),
    ->   ('4', '8', 'chicken');
Query OK, 4 rows affected (0.02 sec)
Records: 4  Duplicates: 0  Warnings: 0
mysql> select * from t limit 1 offset 2 union select 1,2,3 from dual;
ERROR 1221 (HY000): Incorrect usage of UNION and LIMIT
mysql> select * from t order by id  union select 1,2,3 from dual;
ERROR 1221 (HY000): Incorrect usage of UNION and ORDER BY

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure! But may be we can dealing with issue in the next PR since this one is already large enough?

@XuHuaiyu XuHuaiyu added the contribution This PR is from a community contributor. label Apr 23, 2018
@zz-jason zz-jason changed the title Parser: fix issue-6092 parser: fix the compatibility problem of UNION statement Apr 23, 2018
@spongedu
Copy link
Contributor Author

@winkyao @zz-jason PTAL

@zz-jason
Copy link
Member

/run-all-tests

Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@zz-jason
Copy link
Member

zz-jason commented May 2, 2018

/run-all-tests

parser/parser.y Outdated
@@ -699,6 +699,9 @@ import (
SelectStmtFieldList "SELECT statement field list"
SelectStmtLimit "SELECT statement optional LIMIT clause"
SelectStmtOpts "Select statement options"
SelectStmtPart1 "SELECT statement part 1"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the difference between part1/2/3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are syntax segments that abstracted from SelectStmt that can be reused in UnionStmt, each of them corresponds to a syntax branch of SelectStmt.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing to a better name?
SelectStmtPart1 SelectStmtPart2 SelectStmtPart3 is not informative, maybe we can use
SelectStmtBasic SelectStmtFromDual SelectStmtFromTable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiancaiamao agree

@shenli shenli added the status/LGT1 Indicates that a PR has LGTM 1. label May 3, 2018
@shenli
Copy link
Member

shenli commented May 3, 2018

/run-all-tests

1 similar comment
@shenli
Copy link
Member

shenli commented May 3, 2018

/run-all-tests

if $6 != nil {
union.Limit = $6.(*ast.Limit)
}
$$ = union
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

$7 is not used?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes. Since the order byclause and limit clause belong to the union statement, the SelectLockOpt belongs to the union statement, which means nothing according to SelectLockOpt definition in https://dev.mysql.com/doc/refman/5.7/en/innodb-locking-reads.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add SelectLockOpt on both side of the union in this case.
What do you think? @coocood

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tiancaiamao what if the left SelectStmt have already set SelectLockOpt ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

begin txn1
select a from t where union (xxx whatever) for update

begin txn2
update a xxx
commit txn2

commit txn1 (If for update works on union, this operation should fail in TiDB)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should promote to lock mode either SelectLock exists on left SelectStmt or UnionStmt.

The select lock have something to do with transaction isolation level.
TiDB's Snapshot Isolation read operation will not block write operation, but if for update is used, read operation is more strict and you can thought it use a lock to prevent write operation.
@spongedu

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should promote to lock mode either SelectLock exists on left SelectStmt or UnionStmt.

@tiancaiamao what this mean ? Fail to get your point..... @.@

BTW, I test these cases in MySQL 5.7.
In MySQL:
First, execute this in the first tx:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t  union select * from t2 for update;
+----+----+---------+
| id | tn | type    |
+----+----+---------+
|  1 |  5 | cat     |
|  2 |  6 | dog     |
|  3 |  7 | duck    |
|  4 | 29 | chicken |
|  4 | 99 | chicken |
+----+----+---------+
5 rows in set (0.01 sec)

Then, execute this in the second tx:

mysql> update t  set tn = 31 where id = 4;                                                                                                     Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update t2  set tn = 31 where id = 4;
ERROR 1205 (HY000): Lock wait timeout exceeded; try restarting transaction

the FOR UPDATE statement here affect only the second SelectStmt.

Another condition:
First, execute this in the first tx:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t  union select * from t2 limit 100  for update;                                                                          +----+-----+---------+
| id | tn  | type    |
+----+-----+---------+
|  1 |   5 | cat     |
|  2 |   6 | dog     |
|  3 |   7 | duck    |
|  4 | 187 | chicken |
+----+-----+---------+
4 rows in set (0.01 sec)

Then, execute this in the second tx:

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> update t2 set tn = 1 where id = 1;                                                                                                      Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

mysql> update t set tn = 1 where id = 1;
Query OK, 1 row affected (0.00 sec)
Rows matched: 1  Changed: 1  Warnings: 0

Both Update works. The FOR UPDATE statement here seems work on neither SelectStmt

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, there is some subtle difference about the transaction isolation between TiDB and MySQL.
It's hard to explain here. I'll update the implementation myself after this PR is merged.

parser/parser.y Outdated
@@ -699,6 +699,9 @@ import (
SelectStmtFieldList "SELECT statement field list"
SelectStmtLimit "SELECT statement optional LIMIT clause"
SelectStmtOpts "Select statement options"
SelectStmtPart1 "SELECT statement part 1"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about changing to a better name?
SelectStmtPart1 SelectStmtPart2 SelectStmtPart3 is not informative, maybe we can use
SelectStmtBasic SelectStmtFromDual SelectStmtFromTable

parser/parser.y Outdated
@@ -4591,14 +4605,53 @@ SelectLockOpt:

// See https://dev.mysql.com/doc/refman/5.7/en/union.html
UnionStmt:
UnionClauseList "UNION" UnionOpt SelectStmt
UnionClauseList "UNION" UnionOpt SelectStmtPart1 OrderByOptional SelectStmtLimit SelectLockOpt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we extract a common SelectStmtPart which is SelectStmtPart1 | SelectStmtPart2 | SelectStmtPart3 ?
Then we can make UnionStmt more simpler here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SelectStmtPart1, SelectStmtPart2 and SelectStmtPart3 are parts of selectStmt. combining them into a new rule SelectStmtPart may cause reduce/reduce conflicts on rule SelectStmtPart and SelectStmt, I think. Besides, SelectStmtPart2 is significant different from the other two and can't be combined.

@@ -949,6 +949,8 @@ func (s *testSuite) TestUnion(c *C) {
tk.MustExec("create table t(a int)")
tk.MustExec("insert into t value(0),(0)")
tk.MustQuery("select 1 from (select a from t union all select a from t) tmp").Check(testkit.Rows("1", "1", "1", "1"))
tk.MustQuery("select 10 as a from dual union select a from t order by a desc limit 1 ").Check(testkit.Rows("10"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The bug this PR fixed is that, order by a desc limit 1 belongs to union statement, but our old code thought
it belongs to the second select statement.

Copy link
Contributor Author

@spongedu spongedu May 8, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the last order by a desc limit 1 is expected to belong to the whole union statement.

@zz-jason
Copy link
Member

zz-jason commented May 9, 2018

@shenli
Copy link
Member

shenli commented May 11, 2018

/run-all-tests

@shenli shenli added status/LGT2 Indicates that a PR has LGTM 2. status/LGT1 Indicates that a PR has LGTM 1. and removed status/LGT1 Indicates that a PR has LGTM 1. status/LGT2 Indicates that a PR has LGTM 2. labels May 11, 2018
@@ -949,6 +949,8 @@ func (s *testSuite) TestUnion(c *C) {
tk.MustExec("create table t(a int)")
tk.MustExec("insert into t value(0),(0)")
tk.MustQuery("select 1 from (select a from t union all select a from t) tmp").Check(testkit.Rows("1", "1", "1", "1"))
tk.MustQuery("select 10 as a from dual union select a from t order by a desc limit 1 ").Check(testkit.Rows("10"))
tk.MustQuery("select -10 as a from dual union select a from t order by a limit 1 ").Check(testkit.Rows("-10"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will cause a CI error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ciscoxll The test failure seems not caused by this test. I can pass test on my dev machine using make dev

@tiancaiamao
Copy link
Contributor

LGTM

@tiancaiamao
Copy link
Contributor

@zz-jason

@tiancaiamao tiancaiamao added status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels May 14, 2018
Copy link
Member

@zz-jason zz-jason left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contribution This PR is from a community contributor. status/LGT2 Indicates that a PR has LGTM 2.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants