Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Return more results than expected when executing SQL with prefix indexes #24356

Closed
saup007 opened this issue Apr 28, 2021 · 3 comments · Fixed by #24590
Closed

Return more results than expected when executing SQL with prefix indexes #24356

saup007 opened this issue Apr 28, 2021 · 3 comments · Fixed by #24590
Assignees
Labels
cherry-pick-approved Cherry pick PR approved by release team. severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.

Comments

@saup007
Copy link

saup007 commented Apr 28, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

TiDB

CREATE TABLE `t_prefix_index` (
  `tid` bigint(20) NOT NULL AUTO_INCREMENT,
  `user_name` varchar(100) DEFAULT NULL,
  `n_status` int(11) NOT NULL DEFAULT 0,
  PRIMARY KEY (`tid`),
  KEY `idx_name` (`user_name`(10),`n_status`));


insert into t_prefix_index(user_name) values('Jonexxxxxxxx');
insert into t_prefix_index(user_name) values('Jonexxxxxxii');
insert into t_prefix_index(user_name) values('Jonexxxxxxpp');
insert into t_prefix_index(user_name) values('Jonexxxxxxuu');
insert into t_prefix_index(user_name) values('Jonexxxxxxqq');
insert into t_prefix_index(user_name) values('Jonexxxxxxww');
insert into t_prefix_index(user_name) values('Jonexxxxxxee');
insert into t_prefix_index(user_name) values('Jonexxxxxxrr');
insert into t_prefix_index(user_name) values('Jonexxxxxxtt');
insert into t_prefix_index(user_name) values('Jonexxxxxxyy');
insert into t_prefix_index(user_name) values('Joneyyyyyyyyy');
insert into t_prefix_index(user_name) values('Jonexxxxxxuu');
insert into t_prefix_index(user_name) values('Jonexxxxxxii');
insert into t_prefix_index(user_name) values('Jonexxxxxxoo');
insert into t_prefix_index(user_name) values('Jonexxxxxxpp');
insert into t_prefix_index(user_name) values('Jonexxxxxxaa');
insert into t_prefix_index(user_name) values('Jonexxxxxxss');
insert into t_prefix_index(user_name) values('Jonexxxxxxdd');
insert into t_prefix_index(user_name) values('Jonexxxxxxffghhhhhhhhhh');
insert into t_prefix_index(user_name) values('Jonexxxxxxiiiuyysfdsf');
insert into t_prefix_index(user_name) values('Jonexxxxxxhhyuisradfs');

root@127.0.0.1:(test) 18:35:30>select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') ;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
3 rows in set (0.05 sec)

root@127.0.0.1:(test) 18:36:13>select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
6 rows in set (0.00 sec)

root@127.0.0.1:(test) 18:36:19>explain select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') ;
+-------------------------------+---------+-----------+-----------------------------------------------------------+-------------------------------------------------------------------+
| id                            | estRows | task      | access object                                             | operator info                                                     |
+-------------------------------+---------+-----------+-----------------------------------------------------------+-------------------------------------------------------------------+
| IndexLookUp_11                | 0.04    | root      |                                                           |                                                                   |
| ├─IndexRangeScan_8(Build)     | 0.05    | cop[tikv] | table:t_prefix_index, index:idx_name(user_name, n_status) | range:["Jonexxxxxx","Jonexxxxxx"], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 0.04    | cop[tikv] |                                                           | in(test.t_prefix_index.user_name, "Jonexxxxxxxx", "Jonexxxxxxpp") |
|   └─TableRowIDScan_9          | 0.05    | cop[tikv] | table:t_prefix_index                                      | keep order:false, stats:pseudo                                    |
+-------------------------------+---------+-----------+-----------------------------------------------------------+-------------------------------------------------------------------+
4 rows in set (0.00 sec)

root@127.0.0.1:(test) 18:37:03>explain select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
+-------------------------------+---------+-----------+-----------------------------------------------------------+--------------------------------------------------------------------------------------------------------+
| id                            | estRows | task      | access object                                             | operator info                                                                                          |
+-------------------------------+---------+-----------+-----------------------------------------------------------+--------------------------------------------------------------------------------------------------------+
| IndexLookUp_11                | 0.00    | root      |                                                           |                                                                                                        |
| ├─IndexRangeScan_8(Build)     | 0.00    | cop[tikv] | table:t_prefix_index, index:idx_name(user_name, n_status) | range:["Jonexxxxxx" 0,"Jonexxxxxx" 0], ["Jonexxxxxx" 0,"Jonexxxxxx" 0], keep order:false, stats:pseudo |
| └─Selection_10(Probe)         | 0.00    | cop[tikv] |                                                           | in(test.t_prefix_index.user_name, "Jonexxxxxxxx", "Jonexxxxxxpp")                                      |
|   └─TableRowIDScan_9          | 0.00    | cop[tikv] | table:t_prefix_index                                      | keep order:false, stats:pseudo                                                                         |
+-------------------------------+---------+-----------+-----------------------------------------------------------+--------------------------------------------------------------------------------------------------------+
4 rows in set (0.00 sec)

root@127.0.0.1:(test) 18:38:54>alter table t_prefix_index drop index idx_name;
Query OK, 0 rows affected (0.61 sec)

root@127.0.0.1:(test) 18:39:05>alter table t_prefix_index add index idx_name(`user_name`(10));
Query OK, 0 rows affected (0.50 sec)

root@127.0.0.1:(test) 18:39:55>select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
3 rows in set (0.04 sec)


root@127.0.0.1:(test) 18:40:01>alter table t_prefix_index drop index idx_name;
Query OK, 0 rows affected (0.38 sec)

root@127.0.0.1:(test) 18:40:27>alter table t_prefix_index add index idx_name(n_status, `user_name`(10));
Query OK, 0 rows affected (0.41 sec)

root@127.0.0.1:(test) 18:40:34>select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
3 rows in set (0.00 sec)

MySQL

root@localhost:(test) 18:41:59>explain select * from t_prefix_index force index(idx_name) where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') ;
+----+-------------+----------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
| id | select_type | table          | partitions | type  | possible_keys | key      | key_len | ref  | rows | filtered | Extra       |
+----+-------------+----------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
|  1 | SIMPLE      | t_prefix_index | NULL       | range | idx_name      | idx_name | 43      | NULL |   20 |   100.00 | Using where |
+----+-------------+----------------+------------+-------+---------------+----------+---------+------+------+----------+-------------+
1 row in set, 1 warning (0.00 sec)

root@localhost:(test) 18:42:12>select * from t_prefix_index force index(idx_name) where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') ;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
3 rows in set (0.00 sec)

2. What did you expect to see? (Required)

a. Separate prefix index results are correct

idx_name(user_name(10))
select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;

b. The prefix index is not in the first position of the combined index, the result is also correct

idx_name(n_status, user_name(10))
select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;

c. The result is correct even if the prefix index is in the first position of the combined index, but the where only has the prefix index column

idx_name (user_name(10),n_status))
select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') ;

d. The prefix index must be in the first position of the combined index, and the where condition includes all combined index columns, the result is more than expected

idx_name (user_name(10),n_status))
select * from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
Return more results than expected

Occasionally, errors occur in this case

ERROR 1105 (HY000): other error: Request range exceeds bound, request range:[748000000000000CFFC35F698000000000FF0000040147525F43FF53435F30FF305F00FF0000000000F90380FF0000000000000000FE, end:748000000000000CFFC35F698000000000FF0000040147525F43FF53435F30FF305F00FF0000000000F90380FF0000000000000100FE), physical bound:[748000000000000CFFC35F698000000000FF0000040147525F43FF53435F30FF305F00FF0000000000F90380FF0000000000000003FF8000000000798906FF0000000000000000F7, 748000000000000CFFC35F728000000000FF1F5F8D0000000000FA)

3. What did you see instead (Required)

4. What is your TiDB version? (Required)

3.0.16、3.0.19、4.0.8、4.0.10、4.0.12、5.0.0

@saup007 saup007 added the type/bug The issue is confirmed as a bug. label Apr 28, 2021
@tiancaiamao
Copy link
Contributor

The range calculation is incorrect for prefix index plus n_status = 0 condition.

range:["Jonexxxxxx" 0,"Jonexxxxxx" 0], ["Jonexxxxxx" 0,"Jonexxxxxx" 0]

The range repeats twice so the result is also doubled.

@winoros

@saup007
Copy link
Author

saup007 commented Apr 28, 2021

distinct

root@127.0.0.1:(test) 22:01:49>select distinct tid, user_name, n_status from t_prefix_index where user_name in ('Jonexxxxxxxx', 'Jonexxxxxxpp') and n_status = 0;
+-----+--------------+----------+
| tid | user_name    | n_status |
+-----+--------------+----------+
|   1 | Jonexxxxxxxx |        0 |
|   1 | Jonexxxxxxxx |        0 |
|   3 | Jonexxxxxxpp |        0 |
|   3 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
|  15 | Jonexxxxxxpp |        0 |
+-----+--------------+----------+
6 rows in set (0.01 sec)

@ti-srebot
Copy link
Contributor

Please edit this comment or add a new comment to complete the following information

Not a bug

  1. Remove the 'type/bug' label
  2. Add notes to indicate why it is not a bug

Duplicate bug

  1. Add the 'type/duplicate' label
  2. Add the link to the original bug

Bug

Note: Make Sure that 'component', and 'severity' labels are added
Example for how to fill out the template: #20100

1. Root Cause Analysis (RCA) (optional)

2. Symptom (optional)

3. All Trigger Conditions (optional)

4. Workaround (optional)

5. Affected versions

6. Fixed versions

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cherry-pick-approved Cherry pick PR approved by release team. severity/critical sig/planner SIG: Planner type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants