Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

executor: select with inl_hash_join return inconsistency with MySQL #24997

Open
AilinKid opened this issue May 31, 2021 · 3 comments
Open

executor: select with inl_hash_join return inconsistency with MySQL #24997

AilinKid opened this issue May 31, 2021 · 3 comments
Assignees
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 affects-6.5 affects-6.6 affects-7.0 affects-7.1 affects-7.5 affects-8.1 affects-8.5 component/executor severity/major sig/execution SIG execution type/bug The issue is confirmed as a bug.

Comments

@AilinKid
Copy link
Contributor

AilinKid commented May 31, 2021

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

table

CREATE TABLE `tbl_2` (
  `col_10` time NOT NULL DEFAULT '05:44:48',
  `col_11` time DEFAULT '05:53:29',
  `col_12` date DEFAULT '2031-07-19',
  `col_13` date DEFAULT NULL,
  `col_14` date NOT NULL,
  PRIMARY KEY (`col_10`),
  UNIQUE KEY `idx_18` (`col_10`,`col_11`),
  UNIQUE KEY `idx_20` (`col_10`,`col_11`,`col_12`),
  UNIQUE KEY `idx_21` (`col_10`,`col_11`),
  UNIQUE KEY `idx_22` (`col_10`,`col_11`,`col_12`,`col_13`,`col_14`),
  KEY `idx_17` (`col_10`,`col_11`,`col_12`,`col_13`),
  KEY `idx_19` (`col_10`,`col_11`),
  KEY `idx_23` (`col_10`,`col_11`,`col_12`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

CREATE TABLE `tbl_14` (
  `col_80` datetime DEFAULT NULL,
  `col_81` bit(1) DEFAULT b'0',
  `col_82` tinyint(1) DEFAULT NULL,
  `col_83` datetime NOT NULL,
  `col_84` tinyblob,
  `col_85` enum('Alice','Bob','Charlie','David') COLLATE utf8_unicode_ci DEFAULT NULL,
  `col_86` varchar(86) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT '',
  `col_87` varbinary(305) DEFAULT NULL,
  `col_88` int DEFAULT '155374992',
  UNIQUE KEY `idx_71` (`col_80`,`col_81`,`col_82`,`col_83`,`col_88`),
  UNIQUE KEY `idx_72` (`col_80`,`col_81`,`col_82`,`col_83`,`col_88`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci

data
tbl2.txt
tbl14.txt

query

select  /*+ inl_hash_join( tbl_14 , tbl_2 ) */ tbl_14.col_86, tbl_14.col_87, tbl_14.col_82 , tbl_2.col_10, tbl_2.col_11, tbl_2.col_12, tbl_2.col_13 from tbl_14 left join tbl_2 on tbl_14.col_87 <= tbl_2.col_13 or tbl_14.col_88 <> tbl_2.col_14 and tbl_14.col_80 > tbl_2.col_10 and tbl_14.col_88 >= tbl_2.col_10 and tbl_14.col_88 <= tbl_2.col_11 where not( tbl_14.col_87 between 'TdsEpUvPMz' and '' ) 

2. What did you expect to see? (Required)

result shoul d be the same since meta and data is the same

3. What did you see instead (Required)

ingest comparison fail

4. What is your TiDB version? (Required)

master

@AilinKid AilinKid added type/bug The issue is confirmed as a bug. component/executor sig/execution SIG execution labels May 31, 2021
@aytrack
Copy link
Contributor

aytrack commented Jul 22, 2021

execute following sql on echo release,

drop table if exists tbl_2, tbl_14;
CREATE TABLE `tbl_2` (
  `col_10` time NOT NULL DEFAULT '05:44:48',
  `col_11` time DEFAULT '05:53:29',
  `col_12` date DEFAULT '2031-07-19',
  `col_13` date DEFAULT NULL,
  `col_14` date NOT NULL,
  PRIMARY KEY (`col_10`),
  UNIQUE KEY `idx_18` (`col_10`,`col_11`),
  UNIQUE KEY `idx_20` (`col_10`,`col_11`,`col_12`),
  UNIQUE KEY `idx_21` (`col_10`,`col_11`),
  UNIQUE KEY `idx_22` (`col_10`,`col_11`,`col_12`,`col_13`,`col_14`),
  KEY `idx_17` (`col_10`,`col_11`,`col_12`,`col_13`),
  KEY `idx_19` (`col_10`,`col_11`),
  KEY `idx_23` (`col_10`,`col_11`,`col_12`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

CREATE TABLE `tbl_14` (
  `col_80` datetime DEFAULT NULL,
  `col_81` bit(1) DEFAULT b'0',
  `col_82` tinyint(1) DEFAULT NULL,
  `col_83` datetime NOT NULL,
  `col_84` tinyblob,
  `col_85` enum('Alice','Bob','Charlie','David') COLLATE utf8_unicode_ci DEFAULT NULL,
  `col_86` varchar(86) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci DEFAULT '',
  `col_87` varbinary(305) DEFAULT NULL,
  `col_88` int DEFAULT '155374992',
  UNIQUE KEY `idx_71` (`col_80`,`col_81`,`col_82`,`col_83`,`col_88`),
  UNIQUE KEY `idx_72` (`col_80`,`col_81`,`col_82`,`col_83`,`col_88`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;

 insert into tbl_14 values ("2035-10-07 00:00:00",null, 0,"2021-09-17 00:00:00", "TqHLIMQoecpLZ","Alice","NHDzbXXyaQc","e",1500362009);
insert into tbl_2 values ("00:59:59", "08:34:23", "2032-03-13", "2019-11-10", "2033-07-16");
insert into tbl_2 values ("04:39:58", "07:53:31", "2014-04-24", "1976-04-24", "2015-07-28");
insert into tbl_2 values ("05:14:17", "21:57:25", "2011-09-28", "2014-03-01", "1972-12-08");
insert into tbl_2 values ("06:00:56", "01:29:24", "1977-01-16", "1981-08-05", "2018-02-17");
insert into tbl_2 values ("06:12:55", "07:37:16", "1986-04-27", "2010-08-15", "2020-02-01");
insert into tbl_2 values ("07:33:45", "05:10:52", "1990-04-06", "1990-12-02", "1991-10-29");
insert into tbl_2 values ("08:05:24", "18:30:10", "2002-09-03", "1986-05-27", "2008-02-13");
insert into tbl_2 values ("11:43:02", "21:52:34", "1980-01-29", "2034-02-10", "1988-02-25");
insert into tbl_2 values ("14:53:25", "17:06:03", "1990-11-03", "1997-04-11", "1997-12-18");
insert into tbl_2 values ("20:01:46", "20:32:19", "1971-01-17", "2004-08-07", "2019-02-05");
insert into tbl_2 values ("09:48:21", "19:29:30", NULL, "2007-10-04","2003-10-07");
insert into tbl_2 values ("02:20:21", "10:08:16", "2001-02-20", NULL,"1973-04-14");

select  /*+ inl_hash_join( tbl_14 , tbl_2 ) */ tbl_14.col_86, tbl_14.col_87, tbl_14.col_82 , tbl_2.col_10, tbl_2.col_11, tbl_2.col_12, tbl_2.col_13 from tbl_14 left join tbl_2 on tbl_14.col_87 <= tbl_2.col_13 or tbl_14.col_88 <> tbl_2.col_14 and tbl_14.col_80 > tbl_2.col_10 and tbl_14.col_88 >= tbl_2.col_10 and tbl_14.col_88 <= tbl_2.col_11 where not( tbl_14.col_87 between 'TdsEpUvPMz' and '' ) 

tidb :

MySQL root@127.0.0.1:test> select  /*+ inl_hash_join( tbl_14 , tbl_2 ) */ tbl_14.col_86, tbl_14.col_87, tbl_14.col_82 , tbl_2.col_10, tbl_2.col_11, tbl_2.col_12, tbl_2.col_13 from tbl_14 left join tbl_2 on tbl_14.col_87 <
                       ->  = tbl_2.col_13 or tbl_14.col_88 <> tbl_2.col_14 and tbl_14.col_80 > tbl_2.col_10 and tbl_14.col_88 >= tbl_2.col_10 and tbl_14.col_88 <= tbl_2.col_11 where not( tbl_14.col_87 between 'TdsEpUvPMz'
                       ->   and '' )
                       ->
+-------------+--------+--------+--------+--------+--------+--------+
| col_86      | col_87 | col_82 | col_10 | col_11 | col_12 | col_13 |
+-------------+--------+--------+--------+--------+--------+--------+
| NHDzbXXyaQc | e      | 0      | <null> | <null> | <null> | <null> |
+-------------+--------+--------+--------+--------+--------+--------+

mysql 8.0.23:

MySQL root@172.16.5.103:hchwang> select  /*+ inl_hash_join( tbl_14 , tbl_2 ) */ tbl_14.col_86, tbl_14.col_87, tbl_14.col_82 , tbl_2.col_10, tbl_2.col_11, tbl_2.col_12, tbl_2.col_13 from tbl_14 left join tbl_2 on tbl_14.co
                             ->  l_87 <= tbl_2.col_13 or tbl_14.col_88 <> tbl_2.col_14 and tbl_14.col_80 > tbl_2.col_10 and tbl_14.col_88 >= tbl_2.col_10 and tbl_14.col_88 <= tbl_2.col_11 where not( tbl_14.col_87 between
                             ->  'TdsEpUvPMz' and '' )
                             ->
+-------------+--------+--------+----------+----------+------------+------------+
| col_86      | col_87 | col_82 | col_10   | col_11   | col_12     | col_13     |
+-------------+--------+--------+----------+----------+------------+------------+
| NHDzbXXyaQc | e      | 0      | 20:01:46 | 20:32:19 | 1971-01-17 | 2004-08-07 |
| NHDzbXXyaQc | e      | 0      | 14:53:25 | 17:06:03 | 1990-11-03 | 1997-04-11 |
| NHDzbXXyaQc | e      | 0      | 11:43:02 | 21:52:34 | 1980-01-29 | 2034-02-10 |
| NHDzbXXyaQc | e      | 0      | 9:48:21  | 19:29:30 | <null>     | 2007-10-04 |
| NHDzbXXyaQc | e      | 0      | 8:05:24  | 18:30:10 | 2002-09-03 | 1986-05-27 |
| NHDzbXXyaQc | e      | 0      | 7:33:45  | 5:10:52  | 1990-04-06 | 1990-12-02 |
| NHDzbXXyaQc | e      | 0      | 6:12:55  | 7:37:16  | 1986-04-27 | 2010-08-15 |
| NHDzbXXyaQc | e      | 0      | 6:00:56  | 1:29:24  | 1977-01-16 | 1981-08-05 |
| NHDzbXXyaQc | e      | 0      | 5:14:17  | 21:57:25 | 2011-09-28 | 2014-03-01 |
| NHDzbXXyaQc | e      | 0      | 4:39:58  | 7:53:31  | 2014-04-24 | 1976-04-24 |
| NHDzbXXyaQc | e      | 0      | 0:59:59  | 8:34:23  | 2032-03-13 | 2019-11-10 |
+-------------+--------+--------+----------+----------+------------+------------+

I can reproduce this on 4.0.14, 5.0.3 , 5.1.0 and master (11bc9c1)

@guo-shaoge
Copy link
Collaborator

guo-shaoge commented Jul 27, 2021

Simplified case:

drop table if exists t1, t2;
create table t1(c1 varbinary(100));
insert into t1 values('e');
create table t2(c1 date);
insert into t2 values('2019-11-10'); 
select * from t1 inner join t2 on t1.c1 <= t2.c1;

MySQL:

mysql> select * from t1 inner join t2 on t1.c1 <= t2.c1;
   
+------------+------------+
| c1         | c1         |
+------------+------------+
| 0x65       | 2019-11-10 |
+------------+------------+
1 row in set, 1 warning (0.00 sec)

TiDB:

mysql> select * from t1 inner join t2 on t1.c1 <= t2.c1;

Empty set, 1 warning (0.00 sec)

When convert invalid string to datetime(check here), we return isNull flag as true, which is unexpected.

So TiDB's result is empty.

Guess not a critical bug? Only have problems when cast invalid string to date. @aytrack

@guo-shaoge
Copy link
Collaborator

guo-shaoge commented Jul 29, 2021

Also remember to verify following case in #26442 and #26407 .

create table a(b timestamp default '2006-08-31');
replace a values();
SELECT b FROM a WHERE not(b BETWEEN 'o' AND 'a');

create table t(a timestamp);
replace t values('250729');
SELECT a, a NOT IN('','') from t;

@jebter jebter added affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. labels Jan 11, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects-4.0 This bug affects 4.0.x versions. affects-5.0 This bug affects 5.0.x versions. affects-5.1 This bug affects 5.1.x versions. affects-5.2 This bug affects 5.2.x versions. affects-5.3 This bug affects 5.3.x versions. affects-5.4 This bug affects 5.4.x versions. affects-6.0 affects-6.1 affects-6.2 affects-6.3 affects-6.4 affects-6.5 affects-6.6 affects-7.0 affects-7.1 affects-7.5 affects-8.1 affects-8.5 component/executor severity/major sig/execution SIG execution type/bug The issue is confirmed as a bug.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants