executor: support left outer semi join for hash join v2 #57053
Hi @wshwsh12. Thanks for your PR. PRs from untrusted users cannot be marked as trusted with I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@              Coverage Diff              @@
##             master     #57053        +/-   ##
=================================================
- Coverage   72.9983%   58.1168%   -14.8815%
=================================================
  Files          1657       1824       +167
  Lines        457724     659502    +201778
=================================================
+ Hits         334131     383282     +49151
- Misses       103066     251034    +147968
- Partials      20527      25186      +4659
e340196 to 11e7453
11e7453 to ec6ff80
/retest
@wshwsh12: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
/check_dev_2 |

type leftOuterSemiJoinProbe struct {
	baseJoinProbe
	// build/probe side used columns and offset in result chunk

It looks like this comment does not explain the code clearly?

@@ -747,6 +747,8 @@ func NewJoinProbe(ctx *HashJoinCtxV2, workID uint, joinType logicalop.JoinType,
		return newOuterJoinProbe(base, !rightAsBuildSide, rightAsBuildSide)
	case logicalop.RightOuterJoin:
		return newOuterJoinProbe(base, rightAsBuildSide, rightAsBuildSide)
	case logicalop.LeftOuterSemiJoin:

Should we check here that rightAsBuildSide is always true?

func (j *leftOuterSemiJoinProbe) buildResult(chk *chunk.Chunk, startProbeRow int) {
	selected := make([]bool, j.chunkRows)
	for i := startProbeRow; i < j.currentProbeRow; i++ {

If chk is empty and startProbeRow == 0 && j.currentProbeRow == j.chunkRows && j.currentChunk.sel() == nil holds, can we use a shallow copy instead of a deep copy?
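
To make the trade-off in this question concrete, here is a minimal sketch of the difference between a shallow and a deep column copy. The `column` type and both helpers are hypothetical stand-ins and are not TiDB's `chunk.Column` API; the point is only that a shallow copy shares the underlying buffer, so it is safe only when the source chunk is not modified or reused while the result is still live.

```go
package main

import "fmt"

// column is a hypothetical stand-in for a chunk column (not TiDB's chunk.Column).
type column struct {
	data []int64
}

// shallowCopy hands out the same pointer, so source and result share one buffer.
func shallowCopy(src *column) *column { return src }

// deepCopy allocates a new buffer and copies every value into it.
func deepCopy(src *column) *column {
	dst := &column{data: make([]int64, len(src.data))}
	copy(dst.data, src.data)
	return dst
}

func main() {
	probe := &column{data: []int64{1, 2, 3}}
	shallow := shallowCopy(probe)
	deep := deepCopy(probe)

	// Mutating the source is visible through the shallow copy only.
	probe.data[0] = 42
	fmt.Println(shallow.data[0], deep.data[0]) // prints "42 1"
}
```

The shallow variant avoids the allocation and the copy, which is presumably why it is attractive when the whole probe chunk is emitted unchanged.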

}

if j.ctx.hasOtherCondition() {
	err = j.probeForInnerSideBuildWithOtherCondition(joinResult.chk, joinedChk, sqlKiller)

There seems to be no need for ForInnerSideBuild in the name, since this probe always uses the inner side as the build side.

if j.ctx.hasOtherCondition() {
	err = j.probeForInnerSideBuildWithOtherCondition(joinResult.chk, joinedChk, sqlKiller)
} else {
	err = j.probeForInnerSideBuildWithoutOtherCondition(joinResult.chk, joinedChk, remainCap, sqlKiller)

Ditto.

// isMatchedRows marks whether the left side row is matched
isMatchedRows []bool
// isNullRows marks whether the left side row is null

- // isNullRows marks whether the left side row is null
+ // isNullRows marks whether the left side row match result is null
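
The distinction the suggestion draws is between a NULL left row and a NULL match result. A rough sketch of how a per-row match result might be derived, assuming the usual three-valued semantics (a row is matched if the other condition is true for any candidate, NULL if no candidate is true but at least one evaluates to NULL, and unmatched otherwise). The `tri` type and `matchResult` helper are hypothetical and are not part of the PR:

```go
package main

import "fmt"

// tri is a hypothetical three-valued result of evaluating the other condition.
type tri int

const (
	triFalse tri = iota
	triTrue
	triNull
)

// matchResult folds the per-candidate condition results for one left row.
// Assumed semantics: any TRUE wins; otherwise any NULL makes the result NULL.
func matchResult(conds []tri) (matched, isNull bool) {
	for _, c := range conds {
		switch c {
		case triTrue:
			return true, false
		case triNull:
			isNull = true
		}
	}
	return false, isNull
}

func main() {
	fmt.Println(matchResult([]tri{triFalse, triNull, triTrue})) // prints "true false"
	fmt.Println(matchResult([]tri{triFalse, triNull}))          // prints "false true"
	fmt.Println(matchResult(nil))                               // prints "false false"
}
```

Under this reading, the two return values correspond to what `isMatchedRows` and `isNullRows` would record for each left side row.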

		baseJoinProbe:           base,
		processedProbeRowIdxSet: make(map[int]struct{}),
	}
	probe.leftColUsed = base.lUsed

Why copy this field? Can't you use base.lUsed directly?

}

if j.currentProbeRow == j.chunkRows && len(j.processedProbeRowIdxSet) == 0 {
	j.buildResult(chk, 0)

It looks like this would exceed chk's capacity if chk is not empty?

			j.processedProbeRowIdxQueue.Push(i)
		}
	}
	return nil

It looks like L56-L69 is exactly the same as L78-L91; I think we can reuse the code.

}
if j.ctx.hasOtherCondition() {
	j.processedProbeRowIdxQueue.Clear()
	for i := 0; i < j.chunkRows; i++ {

Since matchedRowsHeaders is already set in base.SetChunkForProbe, I think we only need to add the rows with matchedRowsHeaders[rows] != 0 to processedProbeRowIdxQueue.
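
A minimal sketch of the filtering this comment proposes, using a plain slice in place of the PR's queue type. It assumes that a zero header means no build row hashed to the probe row's key; `rowsToProcess` is a hypothetical helper, not a function from the PR:

```go
package main

import "fmt"

// rowsToProcess returns the indexes of probe rows that have at least one
// candidate build row. A zero header is assumed to mean "no candidate".
func rowsToProcess(matchedRowsHeaders []uintptr) []int {
	queue := make([]int, 0, len(matchedRowsHeaders))
	for i, header := range matchedRowsHeaders {
		if header != 0 {
			queue = append(queue, i)
		}
	}
	return queue
}

func main() {
	headers := []uintptr{0, 0x10, 0, 0x20}
	fmt.Println(rowsToProcess(headers)) // prints "[1 3]"
}
```

Rows that never enter the queue are already known to be unmatched, so the other condition does not need to be evaluated for them.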

isNulls []bool

// used in other condition to record which rows need to be processed
processedProbeRowIdxQueue *queue.Queue[int]

Maybe unFinishedProbeRowIdxQueue is a more self-explanatory name?

}
if j.ctx.hasOtherCondition() {
	j.processedProbeRowIdxQueue.Clear()
	for i := 0; i < j.chunkRows; i++ {

Ditto.

}

func (j *leftOuterSemiJoinProbe) concatenateProbeAndBuildRows(joinedChk *chunk.Chunk, sqlKiller *sqlkiller.SQLKiller) error {
	joinedChkRemainCap := joinedChk.Capacity()

- 	joinedChkRemainCap := joinedChk.Capacity()
+ 	joinedChkRemainCap := joinedChk.Capacity() - joinedChk.NumRows()
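
The suggested change matters whenever the chunk already holds rows. A small sketch with a hypothetical `toyChunk` type (not TiDB's `chunk.Chunk`) shows the remaining capacity being computed as capacity minus current row count, so that appends stop at the limit:

```go
package main

import "fmt"

// toyChunk is a hypothetical stand-in for a fixed-capacity result chunk.
type toyChunk struct {
	capacity int
	rows     []int
}

func (c *toyChunk) Capacity() int { return c.capacity }
func (c *toyChunk) NumRows() int  { return len(c.rows) }

// appendUpToCap appends values until the chunk is full and returns how many
// values were consumed. The budget accounts for rows already in the chunk.
func appendUpToCap(c *toyChunk, values []int) int {
	remainCap := c.Capacity() - c.NumRows()
	consumed := 0
	for consumed < len(values) && remainCap > 0 {
		c.rows = append(c.rows, values[consumed])
		consumed++
		remainCap--
	}
	return consumed
}

func main() {
	c := &toyChunk{capacity: 4, rows: []int{7, 8, 9}}
	consumed := appendUpToCap(c, []int{1, 2, 3})
	fmt.Println(consumed, c.NumRows()) // prints "1 4"
}
```

Had the budget been taken from `Capacity()` alone, all three values would have been appended and the chunk would hold six rows against a capacity of four.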

	}
}

func TestLeftOuterSemiJoinSpillBasic(t *testing.T) {

I think we need to add three tests for spill:
- basic tests
- with other condition
- with other condition and sel array

err := failpoint.Enable("github.com/pingcap/tidb/pkg/executor/join/slowWorkers", `return(true)`)
require.NoError(t, err)
defer failpoint.Disable("github.com/pingcap/tidb/pkg/executor/join/slowWorkers")

I think the basic test doesn't need to enable slow workers?

if startProbeRow == 0 && j.currentProbeRow == j.chunkRows && j.currentChunk.Sel() == nil && chk.NumRows() == 0 && len(j.spilledIdx) == 0 {
	// TODO: Can do a shallow copy by directly copying the Column pointers
	for index, colIndex := range j.lUsed {
		chk.SetCol(index, j.currentChunk.Column(colIndex).CopyConstruct(nil))

Why not pass chk.Column(index) to CopyConstruct?
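
The likely motivation for this question is buffer reuse: passing an existing destination lets the copy write into memory that is already allocated, whereas passing nil forces a fresh allocation. A minimal sketch of that pattern with a hypothetical `col` type and `copyConstruct` method (modelled loosely on the idea, not on TiDB's actual implementation):

```go
package main

import "fmt"

// col is a hypothetical stand-in for a chunk column.
type col struct{ data []int64 }

// copyConstruct copies c into dst and returns dst. If dst is nil a new column
// is allocated; otherwise dst's existing buffer is reused when it is large enough.
func (c *col) copyConstruct(dst *col) *col {
	if dst == nil {
		dst = &col{}
	}
	dst.data = append(dst.data[:0], c.data...)
	return dst
}

func main() {
	src := &col{data: []int64{1, 2, 3}}
	dst := &col{data: make([]int64, 0, 8)}

	out := src.copyConstruct(dst)
	fmt.Println(out == dst, cap(out.data), out.data) // prints "true 8 [1 2 3]"
}
```

The result keeps the destination's capacity of 8, which shows that no new buffer was allocated for the copy.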
lgtm
LGTM
[LGTM Timeline notifier] Timeline:
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: AilinKid, windtalker, xzhangxian1008
/test unit-test |
@windtalker: Cannot trigger testing until a trusted user reviews the PR and leaves an In response to this:
ref pingcap#53127 handle key-too-large error from MemBuffer Signed-off-by: you06 <you1474600@gmail.com> test MemBuffer's oversize error to tidb error Signed-off-by: you06 <you1474600@gmail.com> update errdoc Signed-off-by: you06 <you1474600@gmail.com>
What problem does this PR solve?
Issue Number: ref #53127
Problem Summary:
What changed and how does it work?
Support left outer semi join for hash join v2.
Check List
Tests
Side effects
Documentation
Release note