planner, executor: implement the null-aware antiSemiJoin and null-aware antiLeftOuterSemiJoin (hash join with inner build) #37512
Merged
Changes from 18 commits
Commits
24 commits
26a2aa5 implement the null aware anti semi join and null aware anti left out… (AilinKid)
dadb27a make fmt (AilinKid)
09c79c8 make fmt (AilinKid)
2938927 make fmt (AilinKid)
92e8bb9 make fmt (AilinKid)
a1241d0 fix old mpp hash join test cases (AilinKid)
216d2fe fix old join reorder test (AilinKid)
d3e74b1 fix TestLeadingJoinHint4OuterJoin test case (AilinKid)
a540255 fix getNullBucket won't clear the old data and fix test TestMultiColI… (AilinKid)
b1b3d6b fix TestOrderedResultModeOnSubQuery (AilinKid)
296451d . (AilinKid)
7e76f5a do not ref null bit map in buildHashCtx directly, because it may be r… (AilinKid)
16708c7 make fmt (AilinKid)
3cf5504 fix comment (AilinKid)
f409fe7 shallow copy hashNullBucket for every probe worker after build (AilinKid)
386b560 change NAAJ to Null-aware (AilinKid)
03ce4a0 avoid generate tiflash naaj join (AilinKid)
406facb add naaj switch (AilinKid)
0ac2d30 Update executor/hash_table.go (AilinKid)
6656184 Merge branch 'master' into NAAJ (AilinKid)
c93e419 move the null bits collection out of loop (AilinKid)
053aeb6 add unsafeSet for concurrentBitmap to speed up (AilinKid)
3f3f9b7 Merge branch 'master' into NAAJ (AilinKid)
b3f486e Merge branch 'master' into NAAJ (ti-chi-bot)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,213 @@ | ||
# naaj.test file is for null-aware anti join | ||
use test; | ||
set @@session.tidb_enable_null_aware_anti_join=1; | ||
# assert the cases for the left side without null. | ||
select "***************************************************** PART 1 *****************************************************************" as name; | ||
drop table if exists naaj_A, naaj_B; | ||
create table naaj_A(a int, b int, c int); | ||
create table naaj_B(a int, b int, c int); | ||
insert into naaj_A values (1,1,1); | ||
insert into naaj_B values (1,2,2); | ||
|
||
# assert 1: neither side has null values. | ||
# AntiLeftOuterSemiJoin | ||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
|
||
# AntiSemiJoin | ||
explain format = 'brief' select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 2: right side has same key bucket. | ||
insert into naaj_B values(1,1,1); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 3: right side has null values. | ||
insert into naaj_B values(1, null, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 4: right side has null values, but they can't pass the inner (join-key related or not) filter. | ||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
select (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
|
||
explain format = 'brief' select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
|
||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B where naaj_A.a != naaj_B.a) from naaj_A; | ||
select (a, b) not in (select a, b from naaj_B where naaj_A.a != naaj_B.a) from naaj_A; | ||
|
||
explain format = 'brief' select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.a != naaj_B.a); | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.a != naaj_B.a); | ||
|
||
# assert 5: right side is empty. | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B where false); | ||
select (a, b) not in (select a, b from naaj_B where false) from naaj_A; | ||
|
||
# assert 6: right side null bucket filter (not-null join key should match with each other). | ||
insert into naaj_B values(2, null, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
delete from naaj_B where a=1 and b=1 and c=1; | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# case 2: assert the cases where the left side has null. | ||
select "***************************************************** PART 2 *****************************************************************" as name; | ||
delete from naaj_A; | ||
delete from naaj_B; | ||
insert into naaj_A values(1,null,1); | ||
|
||
# assert 1: left side has null, while the right is empty. | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 2: left side has null, while the right has an invalid null row (can't pass the nullBit filter). | ||
insert into naaj_B values(2, null, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# left side has null, while the right has a valid null row (passes the nullBit filter). | ||
insert into naaj_B values(null, null, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 3: left side has null, while the right has a valid non-null row. | ||
delete from naaj_B; | ||
insert into naaj_B values(2, 2, 2); | ||
select (a, b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B); | ||
|
||
# assert 4: left side has null, while the right has no valid rows (equivalent to ). | ||
insert into naaj_B values(2, null, 2); | ||
insert into naaj_B values(null, null, 2); | ||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
select (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
explain format = 'brief' select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
|
||
# assert 5: When the inner subquery has a correlated EQ condition, we won't build the NA-EQ connecting condition here. | ||
explain format = 'brief' select (a, b) not in (select a, b from naaj_B where naaj_A.c = naaj_B.c) from naaj_A; | ||
select (a, b) not in (select a, b from naaj_B where naaj_A.c = naaj_B.c) from naaj_A; | ||
explain format = 'brief' select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c = naaj_B.c); | ||
select * from naaj_A where (a, b) not in (select a, b from naaj_B where naaj_A.c = naaj_B.c); | ||
|
||
# case 3: assert the cases for the equivalent semantic predicate of != ALL | ||
select "***************************************************** PART 3 *****************************************************************" as name; | ||
drop table if exists naaj_A, naaj_B; | ||
create table naaj_A(a int, b int, c int); | ||
create table naaj_B(a int, b int, c int); | ||
insert into naaj_A values (1,1,1); | ||
insert into naaj_B values (1,2,2); | ||
|
||
# assert 1: neither side has null values. | ||
# AntiLeftOuterSemiJoin | ||
explain format = 'brief' select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
|
||
# AntiSemiJoin | ||
explain format = 'brief' select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 2: right side has same key bucket. | ||
insert into naaj_B values(1,1,1); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 3: right side has null values. | ||
insert into naaj_B values(1, null, 2); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 4: right side has null values, but they can't pass the inner (join-key related or not) filter. | ||
explain format = 'brief' select (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
select (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
|
||
explain format = 'brief' select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
|
||
explain format = 'brief' select (a, b) != all (select a, b from naaj_B where naaj_A.a != naaj_B.a) from naaj_A; | ||
select (a, b) != all (select a, b from naaj_B where naaj_A.a != naaj_B.a) from naaj_A; | ||
|
||
explain format = 'brief' select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.a != naaj_B.a); | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.a != naaj_B.a); | ||
|
||
# assert 5: right side is empty. | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B where false); | ||
select (a, b) != all (select a, b from naaj_B where false) from naaj_A; | ||
|
||
# assert 6: right side null bucket filter (not-null join key should match with each other). | ||
insert into naaj_B values(2, null, 2); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
delete from naaj_B where a=1 and b=1 and c=1; | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# case 4: assert the cases for the equivalent semantic predicate of != ALL where the left side has null. | ||
select "***************************************************** PART 4 *****************************************************************" as name; | ||
delete from naaj_A; | ||
delete from naaj_B; | ||
insert into naaj_A values(1,null,1); | ||
|
||
# assert 1: left side has null, while the right is empty. | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 2: left side has null, while the right has an invalid null row (can't pass the nullBit filter). | ||
insert into naaj_B values(2, null, 2); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# left side has null, while the right has a valid null row (passes the nullBit filter). | ||
insert into naaj_B values(null, null, 2); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 3: left side has null, while the right has a valid non-null row. | ||
delete from naaj_B; | ||
insert into naaj_B values(2, 2, 2); | ||
select (a, b) != all (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B); | ||
|
||
# assert 4: left side has null, while the right has no valid rows (equivalent to ). | ||
insert into naaj_B values(2, null, 2); | ||
insert into naaj_B values(null, null, 2); | ||
explain format = 'brief' select (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
select (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c) from naaj_A; | ||
explain format = 'brief' select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c > naaj_B.c); | ||
|
||
# assert 5: When the inner subquery has a correlated EQ condition, we won't build the NA-EQ connecting condition here. | ||
explain format = 'brief' select (a, b) != all (select a, b from naaj_B where naaj_A.c = naaj_B.c) from naaj_A; | ||
select (a, b) != all (select a, b from naaj_B where naaj_A.c = naaj_B.c) from naaj_A; | ||
explain format = 'brief' select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c = naaj_B.c); | ||
select * from naaj_A where (a, b) != all (select a, b from naaj_B where naaj_A.c = naaj_B.c); | ||
|
||
# case 5: assert some bugs. | ||
select "***************************************************** PART 5 *****************************************************************" as name; | ||
delete from naaj_A; | ||
delete from naaj_B; | ||
insert into naaj_A values(1,1,1); | ||
insert into naaj_B values(2,null,2); | ||
|
||
# assert 1: although the probe key doesn't have null values, we still need to use buildNullBits to guarantee that the non-null positions have exactly the same values. | ||
select (a,b) not in (select a, b from naaj_B) from naaj_A; | ||
select * from naaj_A where (a,b) not in (select a, b from naaj_B); | ||
|
||
# assert 2: should inject the projection under join. | ||
explain select (a+1,b*2) not in (select a, b from naaj_B) from naaj_A; | ||
select (a+1,b*2) not in (select a, b from naaj_B) from naaj_A; | ||
insert into naaj_B values(2,2,2); | ||
select (a+1,b*2) not in (select a, b from naaj_B) from naaj_A; | ||
|
||
explain select * from naaj_A where (a+1,b*2) not in (select a+1, b-1 from naaj_B); | ||
select * from naaj_A where (a+1,b*2) not in (select a, b from naaj_B); | ||
|
||
# assert 3: NA-EQ and EQ can't co-exist at the same time. | ||
explain select (a+1,b*2) not in (select a, b=1 from naaj_B where naaj_A.a = naaj_B.a) from naaj_A; | ||
explain select * from naaj_A where (a+1,b*2) not in (select a, b=1 from naaj_B where naaj_A.a = naaj_B.a); | ||
set @@session.tidb_enable_null_aware_anti_join=0; |
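
For readers checking the expected results of the cases above, here is a minimal reference model of `(a, b) NOT IN (subquery)` under SQL three-valued logic. It is only a sketch of the semantics the test exercises, not TiDB's hash-join implementation; all function names (`eq3`, `and3`, `row_not_in`, `anti_semi_join`, `anti_left_outer_semi_join`) are hypothetical, and `None` stands in for SQL NULL.

```python
from typing import List, Optional, Sequence, Tuple

Value = Optional[int]        # None models SQL NULL
Row = Tuple[Value, ...]


def eq3(x: Value, y: Value) -> Optional[bool]:
    """SQL '=' under three-valued logic; None means UNKNOWN."""
    if x is None or y is None:
        return None
    return x == y


def and3(values) -> Optional[bool]:
    """SQL AND: FALSE dominates, then UNKNOWN, otherwise TRUE."""
    result: Optional[bool] = True
    for v in values:
        if v is False:
            return False
        if v is None:
            result = None
    return result


def row_not_in(probe: Row, build_rows: Sequence[Row]) -> Optional[bool]:
    """Evaluate `probe NOT IN (build_rows)`; returns True, False, or None (NULL)."""
    saw_unknown = False
    for build in build_rows:
        match = and3(eq3(p, b) for p, b in zip(probe, build))
        if match is True:
            return False          # a definite match makes NOT IN false
        if match is None:
            saw_unknown = True    # a NULL may be hiding a match
    return None if saw_unknown else True


def anti_left_outer_semi_join(probe_rows: Sequence[Row],
                              build_rows: Sequence[Row]) -> List[Optional[bool]]:
    """Scalar form (`select (a, b) not in (...)`): one TRUE/FALSE/NULL per probe row."""
    return [row_not_in(p, build_rows) for p in probe_rows]


def anti_semi_join(probe_rows: Sequence[Row],
                   build_rows: Sequence[Row]) -> List[Row]:
    """WHERE form: keep only probe rows whose NOT IN evaluates to TRUE."""
    return [p for p in probe_rows if row_not_in(p, build_rows) is True]


if __name__ == "__main__":
    # PART 1, assert 1: no NULLs anywhere, no match.
    print(row_not_in((1, 1), [(1, 2)]))
    # PART 1, assert 6 after the delete: (1, NULL) matches on the non-null key.
    print(row_not_in((1, 1), [(1, 2), (1, None), (2, None)]))
    # PART 5, assert 1: (2, NULL) differs on the non-null key, so it cannot match.
    print(row_not_in((1, 1), [(2, None)]))
```

The split between the two join forms mirrors the paired queries in the test: the scalar form surfaces NULL to the caller, while the `WHERE` form drops any row whose predicate is not TRUE.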
Can we have cases where naaj_A has a (null, null, null) tuple?
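
As a rough guide to what such a case would be expected to return, the sketch below works through the standard SQL semantics for an all-NULL probe row. It assumes nothing about TiDB internals; the helper names `not_in_all_null_probe` and `correlated_filter_keeps` are hypothetical, and `None` models SQL NULL.

```python
from typing import Optional, Sequence, Tuple

Row = Tuple[Optional[int], ...]


def not_in_all_null_probe(build_rows: Sequence[Row]) -> Optional[bool]:
    """`(NULL, NULL) NOT IN (build_rows)` under three-valued logic.

    Every column comparison against a NULL probe value is UNKNOWN, so no
    build row can give a definite match or a definite mismatch. The result
    therefore depends only on whether the build side is empty.
    """
    return True if len(build_rows) == 0 else None


def correlated_filter_keeps(probe_c: Optional[int], build_c: Optional[int]) -> bool:
    """Model `naaj_A.c > naaj_B.c`; an UNKNOWN comparison filters the row out."""
    if probe_c is None or build_c is None:
        return False
    return probe_c > build_c


if __name__ == "__main__":
    probe = (None, None, None)              # the (null, null, null) tuple
    build = [(1, 2, 2), (2, None, 2)]       # sample naaj_B rows (a, b, c)

    # Uncorrelated subquery with a non-empty build side: result is NULL,
    # so the WHERE form would drop the row.
    print(not_in_all_null_probe(build))

    # Correlated `naaj_A.c > naaj_B.c` with a NULL probe c keeps no build
    # rows, so the subquery is empty and NOT IN evaluates to TRUE.
    kept = [b for b in build if correlated_filter_keeps(probe[2], b[2])]
    print(not_in_all_null_probe(kept))
```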