Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exec: reset internal unsafeBatch in orderedAggregator #40668

Merged
merged 1 commit into from
Sep 17, 2019
Merged

exec: reset internal unsafeBatch in orderedAggregator #40668

merged 1 commit into from
Sep 17, 2019

Conversation

asubiotto
Copy link
Contributor

@asubiotto asubiotto commented Sep 11, 2019

Not doing so could lead to correctness results. This was not caught by
tests, so runTests has been extended to check for operators that can be
initialized with a variable output size. This increases
verifySelAndNullResets's test coverage, since it doesn't do anything if
only a single batch is output, which is the case for most unit tests.

Release note: None

Release justification: This commit fixes correctness bugs and increases
test coverage (Category 2).

Fixes #40641

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@asubiotto
Copy link
Contributor Author

Modified the variable size interface so that the merge joiner can also be tested and the tests are failing. Will investigate.

@asubiotto
Copy link
Contributor Author

Not too sure what's up. Pushed up the changes to test the mergejoiner. @yuzefovich would you mind taking a quick look to see if you see anything obvious? make test PKG=./pkg/sql/exec TESTS=TestMergeJoiner/verifySelAndNullResets should fail on this branch.

@yuzefovich
Copy link
Member

Nice find! This was a bug with the merge joiner. Here is the diff that fixes it:

--- a/pkg/sql/exec/mergejoiner_tmpl.go
+++ b/pkg/sql/exec/mergejoiner_tmpl.go
@@ -1238,13 +1238,13 @@ func (o *mergeJoin_JOIN_TYPE_STRING_FILTER_INFO_STRINGOp) calculateOutputCount(
 }
 
 func (o *mergeJoin_JOIN_TYPE_STRING_FILTER_INFO_STRINGOp) Next(ctx context.Context) coldata.Batch {
+       if o.needToResetOutput {
+               o.needToResetOutput = false
+               o.output.ResetInternalBatch()
+       }
        for {
                switch o.state {
                case mjEntry:
-                       if o.needToResetOutput {
-                               o.needToResetOutput = false
-                               o.output.ResetInternalBatch()
-                       }
                        o.initProberState(ctx)
 
                        if o.nonEmptyBufferedGroup() {

The incorrect assumption was that we would always get into mjEntry state before returning an output batch. However, this is not the case when the output cross product spans several batches - in such a case, on the second, third, etc output batches we go straight to mjBuild state, so we would never actually reset the internal output batch.

Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: Thanks!

Reviewed 3 of 3 files at r1.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @asubiotto)


pkg/sql/exec/utils_test.go, line 81 at r1 (raw file):

// initialized with variable output size batches. This allows runTests to
// increase test coverage of these operators.
type variableBatchSizeInitializer interface {

[nit]: also would call this variableOutputBatchSizeInitializer.

I added a similar mjTestInitializer interface in mergejoiner_test.go, I think we should remove that in favor of the one you're introducing.


pkg/sql/exec/utils_test.go, line 82 at r1 (raw file):

// increase test coverage of these operators.
type variableBatchSizeInitializer interface {
	initWithBatchSize(outputSize uint16)

[nit]: I'd call this initWithOutputBatchSize and rename the argument to just size.

@yuzefovich
Copy link
Member

Here is a better fix:

--- a/pkg/sql/exec/mergejoiner.go
+++ b/pkg/sql/exec/mergejoiner.go
@@ -351,9 +351,8 @@ type mergeJoinBase struct {
        right mergeJoinInput
 
        // Output buffer definition.
-       output            coldata.Batch
-       needToResetOutput bool
-       outputBatchSize   uint16
+       output          coldata.Batch
+       outputBatchSize uint16
        // outputReady is a flag to indicate that merge joiner is ready to emit an
        // output batch.
        outputReady bool
diff --git a/pkg/sql/exec/mergejoiner_tmpl.go b/pkg/sql/exec/mergejoiner_tmpl.go
index b40213cb7d..5fc228e389 100644
--- a/pkg/sql/exec/mergejoiner_tmpl.go
+++ b/pkg/sql/exec/mergejoiner_tmpl.go
@@ -1238,13 +1238,10 @@ func (o *mergeJoin_JOIN_TYPE_STRING_FILTER_INFO_STRINGOp) calculateOutputCount(
 }
 
 func (o *mergeJoin_JOIN_TYPE_STRING_FILTER_INFO_STRINGOp) Next(ctx context.Context) coldata.Batch {
+       o.output.ResetInternalBatch()
        for {
                switch o.state {
                case mjEntry:
-                       if o.needToResetOutput {
-                               o.needToResetOutput = false
-                               o.output.ResetInternalBatch()
-                       }
                        o.initProberState(ctx)
 
                        if o.nonEmptyBufferedGroup() {
@@ -1281,7 +1278,6 @@ func (o *mergeJoin_JOIN_TYPE_STRING_FILTER_INFO_STRINGOp) Next(ctx context.Conte
                                o.output.SetLength(o.builderState.outCount)
                                // Reset builder out count.
                                o.builderState.outCount = uint16(0)
-                               o.needToResetOutput = true
                                o.outputReady = false
                                return o.output
                        }

Copy link
Contributor Author

@asubiotto asubiotto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Applied the mergejoiner patch

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @yuzefovich)


pkg/sql/exec/utils_test.go, line 81 at r1 (raw file):

Previously, yuzefovich wrote…

[nit]: also would call this variableOutputBatchSizeInitializer.

I added a similar mjTestInitializer interface in mergejoiner_test.go, I think we should remove that in favor of the one you're introducing.

Done.


pkg/sql/exec/utils_test.go, line 82 at r1 (raw file):

Previously, yuzefovich wrote…

[nit]: I'd call this initWithOutputBatchSize and rename the argument to just size.

Done.

Not doing so could lead to correctness results. This was not caught by
tests, so runTests has been extended to check for operators that can be
initialized with a variable output size. This increases
verifySelAndNullResets's test coverage, since it doesn't do anything if
only a single batch is output, which is the case for most unit tests.

Release note: None

Release justification: This commit fixes correctness bugs and increases
test coverage (Category 2).
@asubiotto
Copy link
Contributor Author

bors r=yuzefovich

@asubiotto
Copy link
Contributor Author

bors r-

@asubiotto
Copy link
Contributor Author

bors r=yuzefovich

@craig
Copy link
Contributor

craig bot commented Sep 17, 2019

Build failed

@asubiotto
Copy link
Contributor Author

Release justification missing from PR description

bors r=yuzefovich

craig bot pushed a commit that referenced this pull request Sep 17, 2019
40668: exec: reset internal unsafeBatch in orderedAggregator r=yuzefovich a=asubiotto

Not doing so could lead to correctness results. This was not caught by
tests, so runTests has been extended to check for operators that can be
initialized with a variable output size. This increases
verifySelAndNullResets's test coverage, since it doesn't do anything if
only a single batch is output, which is the case for most unit tests.

Release note: None

Release justification: This commit fixes correctness bugs and increases
test coverage (Category 2).

Fixes #40641

Co-authored-by: Alfonso Subiotto Marqués <alfonso@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Sep 17, 2019

Build succeeded

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: vec: experimental_on has incorrect results
3 participants