Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect results using a prefix index with OR condition #21234

Closed
kolbe opened this issue Nov 24, 2020 · 8 comments · Fixed by #21251
Closed

Incorrect results using a prefix index with OR condition #21234

kolbe opened this issue Nov 24, 2020 · 8 comments · Fixed by #21251
Assignees
Labels
severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Milestone

Comments

@kolbe
Copy link
Contributor

kolbe commented Nov 24, 2020

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

create table t1 (id int unsigned not null auto_increment primary key, t text, index(t(8)));
insert into t1 (t) values (md5('a')),(md5('b'));
select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';

2. What did you expect to see? (Required)

mysql> select * from t1;
+----+----------------------------------+
| id | t                                |
+----+----------------------------------+
|  1 | 0cc175b9c0f1b6a831c399e269772661 |
|  2 | 92eb5ffee6ae2fec3ad71c777531578f |
+----+----------------------------------+
2 rows in set (0.00 sec)

mysql> select /*+ IGNORE_INDEX(t1,t) */ * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
+----+----------------------------------+
| id | t                                |
+----+----------------------------------+
|  1 | 0cc175b9c0f1b6a831c399e269772661 |
|  2 | 92eb5ffee6ae2fec3ad71c777531578f |
+----+----------------------------------+
2 rows in set (0.00 sec)

3. What did you see instead (Required)

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.00 sec)

4. What is your TiDB version? (Required)

tidb_version(): Release Version: v4.0.0-beta.2-1607-g93c3e6bec
Edition: Community
Git Commit Hash: 93c3e6becd5929cc37901d78d1f679cbc5d86342
Git Branch: HEAD
UTC Build Time: 2020-11-22 19:53:19
GoVersion: go1.13.4
Race Enabled: false
TiKV Min Version: v3.0.0-60965b006877ca7234adaced7890d7b029ed1306
Check Table Before Drop: false
1 row in set (0.00 sec)
@kolbe kolbe added the type/bug The issue is confirmed as a bug. label Nov 24, 2020
@kolbe kolbe changed the title Incorrect results using a prefix index Incorrect results using a prefix index on TEXT column Nov 24, 2020
@kolbe kolbe changed the title Incorrect results using a prefix index on TEXT column Incorrect results using a prefix index Nov 24, 2020
@kolbe
Copy link
Contributor Author

kolbe commented Nov 24, 2020

I originally spotted this in 4.0.8 and was able to reproduce it in master.

@kolbe
Copy link
Contributor Author

kolbe commented Nov 24, 2020

mysql> truncate t1;
Query OK, 0 rows affected (0.07 sec)

mysql> begin;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into t1 (t) values (md5('a')),(md5('b'));
Query OK, 2 rows affected (0.00 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.00 sec)

mysql> commit;
Query OK, 0 rows affected (0.00 sec)

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.00 sec)

mysql> analyze table t1;
Query OK, 0 rows affected (0.03 sec)

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
+----+----------------------------------+
| id | t                                |
+----+----------------------------------+
|  1 | 0cc175b9c0f1b6a831c399e269772661 |
|  2 | 92eb5ffee6ae2fec3ad71c777531578f |
+----+----------------------------------+
2 rows in set (0.00 sec)

@kolbe kolbe changed the title Incorrect results using a prefix index Incorrect results using a prefix index with OR condition Nov 24, 2020
@zz-jason
Copy link
Member

the execution plan is:

explain select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';TiDB(root@127.0.0.1:test) > explain select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
+-------------------------------+---------+-----------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                            | estRows | task      | access object        | operator info                                                                                                                                                                          |
+-------------------------------+---------+-----------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| IndexLookUp_11                | 16.00   | root      |                      |                                                                                                                                                                                        |
| ├─IndexRangeScan_8(Build)     | 20.00   | cop[tikv] | table:t1, index:t(t) | range:["0cc175b9c0f1b6a831c399e269772661","0cc175b9c0f1b6a831c399e269772661"], ["92eb5ffee6ae2fec3ad71c777531578f","92eb5ffee6ae2fec3ad71c777531578f"], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 16.00   | cop[tikv] |                      | or(eq(test.t1.t, "0cc175b9c0f1b6a831c399e269772661"), eq(test.t1.t, "92eb5ffee6ae2fec3ad71c777531578f"))                                                                               |
|   └─TableRowIDScan_9          | 20.00   | cop[tikv] | table:t1             | keep order:false, stats:pseudo                                                                                                                                                         |
+-------------------------------+---------+-----------+----------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

I think the range is incorrect for the prefix index index(t(8))

@kolbe
Copy link
Contributor Author

kolbe commented Nov 24, 2020

The result is correct after the statistics for the table are updated, either because of explicit analyze table or because statistics have automatically been updated.

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.01 sec)

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.00 sec)

[...time passes...]

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
+----+----------------------------------+
| id | t                                |
+----+----------------------------------+
|  1 | 0cc175b9c0f1b6a831c399e269772661 |
|  2 | 92eb5ffee6ae2fec3ad71c777531578f |
+----+----------------------------------+
2 rows in set (0.00 sec)
mysql> truncate t1;
Query OK, 0 rows affected (0.09 sec)

mysql> insert into t1 (t) values (md5('a')),(md5('b'));
Query OK, 2 rows affected (0.01 sec)
Records: 2  Duplicates: 0  Warnings: 0

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
Empty set (0.00 sec)

mysql> analyze table t1;
Query OK, 0 rows affected (0.04 sec)

mysql> select * from t1 where t='0cc175b9c0f1b6a831c399e269772661' or t='92eb5ffee6ae2fec3ad71c777531578f';
+----+----------------------------------+
| id | t                                |
+----+----------------------------------+
|  1 | 0cc175b9c0f1b6a831c399e269772661 |
|  2 | 92eb5ffee6ae2fec3ad71c777531578f |
+----+----------------------------------+
2 rows in set (0.00 sec)

@shuke987 shuke987 added this to the v4.0.9 milestone Nov 24, 2020
@lzmhhh123
Copy link
Contributor

/assign @Reminiscent

@kolbe
Copy link
Contributor Author

kolbe commented Nov 24, 2020

Hm, analyze table did not make any difference for a more complex test environment on v4.0.8.

mysql> analyze table t2;
Query OK, 0 rows affected (1.23 sec)

mysql> select count(*) from t2 where a='40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114' or a='some other value';
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (0.01 sec)

mysql> select count(*) from t2 where a='40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114';
+----------+
| count(*) |
+----------+
|    87382 |
+----------+
1 row in set (0.28 sec)

mysql> select count(*) from t2 where a='some other value';
+----------+
| count(*) |
+----------+
|        2 |
+----------+
1 row in set (0.00 sec)
mysql> explain select count(*) from t2 where a='40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114' or a='some other value';
+----------------------------------+---------+-----------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id                               | estRows | task      | access object             | operator info                                                                                                                                                                                                                                                                                                                            |
+----------------------------------+---------+-----------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| StreamAgg_10                     | 1.00    | root      |                           | funcs:count(1)->Column#5                                                                                                                                                                                                                                                                                                                 |
| └─IndexLookUp_33                 | 0.00    | root      |                           |                                                                                                                                                                                                                                                                                                                                          |
|   ├─IndexRangeScan_30(Build)     | 0.00    | cop[tikv] | table:t2, index:a_2(a, c) | range:["40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114","40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114"], ["some other value","some other value"], keep order:false |
|   └─Selection_32(Probe)          | 0.00    | cop[tikv] |                           | or(eq(test.t2.a, "40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114"), eq(test.t2.a, "some other value"))                                                                                                                                                 |
|     └─TableRowIDScan_31          | 0.00    | cop[tikv] | table:t2                  | keep order:false                                                                                                                                                                                                                                                                                                                         |
+----------------------------------+---------+-----------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
5 rows in set (0.00 sec)

mysql> explain select count(*) from t2 where a='40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114';
+----------------------------------+----------+-----------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| id                               | estRows  | task      | access object             | operator info                                                                                                                                     |
+----------------------------------+----------+-----------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
| HashAgg_21                       | 1.00     | root      |                           | funcs:count(Column#12)->Column#5                                                                                                                  |
| └─IndexLookUp_22                 | 1.00     | root      |                           |                                                                                                                                                   |
|   ├─IndexRangeScan_18(Build)     | 87469.00 | cop[tikv] | table:t2, index:a_2(a, c) | range:["40b24411","40b24411"], keep order:false                                                                                                   |
|   └─HashAgg_7(Probe)             | 1.00     | cop[tikv] |                           | funcs:count(1)->Column#12                                                                                                                         |
|     └─Selection_20               | 87469.00 | cop[tikv] |                           | eq(test.t2.a, "40b244112641dd78dd4f93b6c9190dd46e0099194d5a44257b7efad6ef9ff4683da1eda0244448cb343aa688f5d3efd7314dafe580ac0bcbf115aeca9e8dc114") |
|       └─TableRowIDScan_19        | 87469.00 | cop[tikv] | table:t2                  | keep order:false                                                                                                                                  |
+----------------------------------+----------+-----------+---------------------------+---------------------------------------------------------------------------------------------------------------------------------------------------+
6 rows in set (0.00 sec)

@Reminiscent
Copy link
Contributor

Do some experiments

mysql> drop table t2;
Query OK, 0 rows affected (0.02 sec)

mysql> create table t2 (id int unsigned not null auto_increment primary key, t text, index(t(3)));
Query OK, 0 rows affected (0.01 sec)

mysql> insert into t2 (t) values ('aaaa'),('bbbb'),('a');
Query OK, 3 rows affected (0.01 sec)
Records: 3  Duplicates: 0  Warnings: 0

mysql> select * from t2 where t='aaaa';
+----+------+
| id | t    |
+----+------+
|  1 | aaaa |
+----+------+
1 row in set (0.00 sec)

mysql> explain select * from t2 where t='aaaa';
+-------------------------------+---------+-----------+----------------------+-----------------------------------------------------+
| id                            | estRows | task      | access object        | operator info                                       |
+-------------------------------+---------+-----------+----------------------+-----------------------------------------------------+
| IndexLookUp_11                | 10.00   | root      |                      |                                                     |
| ├─IndexRangeScan_8(Build)     | 10.00   | cop[tikv] | table:t2, index:t(t) | range:["aaa","aaa"], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 10.00   | cop[tikv] |                      | eq(test.t2.t, "aaaa")                               |
|   └─TableRowIDScan_9          | 10.00   | cop[tikv] | table:t2             | keep order:false, stats:pseudo                      |
+-------------------------------+---------+-----------+----------------------+-----------------------------------------------------+
4 rows in set (0.00 sec)

mysql> select * from t2 where t='aaaa' or t = 'a';
+----+------+
| id | t    |
+----+------+
|  3 | a    |
+----+------+
1 row in set (0.00 sec)

mysql> explain select * from t2 where t='aaaa' or t = 'a';
+-------------------------------+---------+-----------+----------------------+------------------------------------------------------------------+
| id                            | estRows | task      | access object        | operator info                                                    |
+-------------------------------+---------+-----------+----------------------+------------------------------------------------------------------+
| IndexLookUp_11                | 16.00   | root      |                      |                                                                  |
| ├─IndexRangeScan_8(Build)     | 20.00   | cop[tikv] | table:t2, index:t(t) | range:["a","a"], ["aaaa","aaaa"], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 16.00   | cop[tikv] |                      | or(eq(test.t2.t, "aaaa"), eq(test.t2.t, "a"))                    |
|   └─TableRowIDScan_9          | 20.00   | cop[tikv] | table:t2             | keep order:false, stats:pseudo                                   |
+-------------------------------+---------+-----------+----------------------+------------------------------------------------------------------+
4 rows in set (0.00 sec)

mysql> analyze table t2;
Query OK, 0 rows affected (0.06 sec)

mysql> explain select * from t2 where t='aaaa';
+--------------------------+---------+-----------+---------------+----------------------------------+
| id                       | estRows | task      | access object | operator info                    |
+--------------------------+---------+-----------+---------------+----------------------------------+
| TableReader_7            | 0.00    | root      |               | data:Selection_6                 |
| └─Selection_6            | 0.00    | cop[tikv] |               | eq(test.t2.t, "aaaa")            |
|   └─TableRangeScan_5     | 3.00    | cop[tikv] | table:t2      | range:[0,+inf], keep order:false |
+--------------------------+---------+-----------+---------------+----------------------------------+
3 rows in set (0.00 sec)

mysql> explain select * from t2 where t='aaaa' or t = 'a';
+--------------------------+---------+-----------+---------------+-----------------------------------------------+
| id                       | estRows | task      | access object | operator info                                 |
+--------------------------+---------+-----------+---------------+-----------------------------------------------+
| TableReader_7            | 0.80    | root      |               | data:Selection_6                              |
| └─Selection_6            | 0.80    | cop[tikv] |               | or(eq(test.t2.t, "aaaa"), eq(test.t2.t, "a")) |
|   └─TableRangeScan_5     | 3.00    | cop[tikv] | table:t2      | range:[0,+inf], keep order:false              |
+--------------------------+---------+-----------+---------------+-----------------------------------------------+
3 rows in set (0.00 sec)

@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
severity/critical sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants