Skip to content

Conversation

@knz
Copy link
Contributor

@knz knz commented Apr 24, 2019

Sub-tests that invoke t.Parallel() get to run concurrently with
their parent test, and may be delayed arbitrarily past beyond the end
of the termination of the parent test (including beyond the execution
of its defer calls).

This means it's unsafe to call t.Parallel() with
e.g. leaktest.AfterTest().

Release note: None

@knz knz requested review from a team, RaduBerinde and tbg April 24, 2019 10:19
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 2 of 2 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

		pName := fmt.Sprintf("client%sserver,server%sclient", connIcon(c.partitionC2S), connIcon(c.partitionS2C))
		t.Run(kaName+"/"+pName, func(t *testing.T) {
			// TODO(knz): t.Parallel disabled until leaktest.AfterTest learns

Hmm, that's a bummer. Test takes 28s (up from <4s with enough parallelism). I think here I'd rip out the subtests and start them as goroutines with a waitgroup, writing into an error channel. That way we get even better parallelism than with t.Parallel and the main test will wait. Or, if we decide that sometimes use of t.Parallel is ok and we don't want to lint it away, we can keep using it and we'll just add the waitgroup to the test.

var wg sync.WaitGroup
wg.Add(len(testCases))
errCh := make(chan error, len(testCases))
for _, c := range testCases {
  go func(c testCase) {
    errCh <- errors.Wrapf(runTestCase(c), "%+v", c)
  }(c)
}
wg.Wait()
close(errCh)

for err := range errCh {
  t.Errorf("%+v", err)
}

pkg/sql/physical_props_test.go, line 363 at r1 (raw file):

				// how to wait for all sub-tests to complete.
				//
				// t.Parallel()

This one doesn't change the duration of the test (I verified), so I'd just nuke this.


pkg/sql/physical_props_test.go, line 572 at r1 (raw file):

			// how to wait for all sub-tests to complete.
			//
			//			t.Parallel()

Verified that this can be nuked as well.


pkg/sql/physical_props_test.go, line 769 at r1 (raw file):

		tc := testCases[i]
		t.Run(tc.name, func(t *testing.T) {
			// TODO(knz): t.Parallel disabled until leaktest.AfterTest learns

Not tested, but pretty sure all the tests in this file don't need t.Parallel.

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting. Seems like there is a missing API in test that we should file an issue for - either a function to register a cleanup to happen after the parent waits for child subtests, or a function to wait for all subtests.

We could use a wait group though, replacing every t.Parallel() call with the block wg.Add(1); defer wg.Done(); t.Parallel()

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @tbg)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Hmm, that's a bummer. Test takes 28s (up from <4s with enough parallelism). I think here I'd rip out the subtests and start them as goroutines with a waitgroup, writing into an error channel. That way we get even better parallelism than with t.Parallel and the main test will wait. Or, if we decide that sometimes use of t.Parallel is ok and we don't want to lint it away, we can keep using it and we'll just add the waitgroup to the test.

var wg sync.WaitGroup
wg.Add(len(testCases))
errCh := make(chan error, len(testCases))
for _, c := range testCases {
  go func(c testCase) {
    errCh <- errors.Wrapf(runTestCase(c), "%+v", c)
  }(c)
}
wg.Wait()
close(errCh)

for err := range errCh {
  t.Errorf("%+v", err)
}
</blockquote></details>

Ah, I made the same suggestion. Why do we need the error channel though? The test infrastructure should still work, we just need the `Wait()`.


<!-- Sent from Reviewable.io -->

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, RaduBerinde wrote…

Ah, I made the same suggestion. Why do we need the error channel though? The test infrastructure should still work, we just need the Wait().

Using testing.T on a goroutine is not supported by the testing framework, and will cause problems. In particular, t.Fatal off the test goroutine will result in an uncaught runtime.Goexit. Note how runTestCase doesn't even get to use the t. That's to avoid this problem in the first place. Using t.Error is fine btw, but that's not a pattern I want to put out there.

@RaduBerinde
Copy link
Member

RaduBerinde commented Apr 24, 2019


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Using testing.T on a goroutine is not supported by the testing framework, and will cause problems. In particular, t.Fatal off the test goroutine will result in an uncaught runtime.Goexit. Note how runTestCase doesn't even get to use the t. That's to avoid this problem in the first place. Using t.Error is fine btw, but that's not a pattern I want to put out there.

But you don't need a goroutine. Just call Run like before and put defer wg.Done() inside that. Run will return when the inner test calls Parallel (it has to, or the parent can't continue to call more Runs).

@RaduBerinde
Copy link
Member

Raphael points out that the subtests block inside Parallel() until the parent test completes, so my suggestion won't work. I filed golang/go#31651.

@knz knz force-pushed the 20190424-parallel branch from ce21a47 to 9ab7189 Compare April 24, 2019 11:34
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RFAL

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @tbg)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, RaduBerinde wrote…

But you don't need a goroutine. Just call Run like before and put wg.Done() inside that. Run will return when the inner test calls Parallel (it has to, or the parent can't continue to call more Runs).

Modified to use Tobias' suggestion.


pkg/sql/physical_props_test.go, line 363 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

This one doesn't change the duration of the test (I verified), so I'd just nuke this.

Done.


pkg/sql/physical_props_test.go, line 572 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Verified that this can be nuked as well.

Done.


pkg/sql/physical_props_test.go, line 769 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Not tested, but pretty sure all the tests in this file don't need t.Parallel.

Done.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

optional suggestion: add a lint against t.Parallel following the example here (ironic that there would be a call to t.Parallel() in that same snippet):

t.Parallel()
cmd, stderr, filter, err := dirCmd(
pkgDir,
"git",
"grep",
"-nE",
`\.Clone\([^)]`,
"--",
"*.go",
":!util/protoutil/clone_test.go",
":!util/protoutil/clone.go",
)

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz and @RaduBerinde)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, knz (kena) wrote…

Modified to use Tobias' suggestion.

Optional nit: move this out of the test so that it doesn't have access to a *testing.T, or do t := (*testing.T)(nil) here to reliably catch any accidental calls in the method.

@knz knz requested a review from a team April 24, 2019 13:07
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

optional suggestion: add a lint against t.Parallel

Good idea. Done. Will merge this PR once #36952 has merged, as this will remove the one remaining call to t.Parallel().

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @tbg)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Optional nit: move this out of the test so that it doesn't have access to a *testing.T, or do t := (*testing.T)(nil) here to reliably catch any accidental calls in the method.

Done. This was a good call -- there was a remaining use of t inside the function.

@knz knz force-pushed the 20190424-parallel branch from bebcaa3 to cb5f1d6 Compare April 24, 2019 13:09
Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewed 1 of 1 files at r3, 4 of 4 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @knz)


pkg/rpc/context_test.go, line 803 at r1 (raw file):

Previously, knz (kena) wrote…

Done. This was a good call -- there was a remaining use of t inside the function.

🐛 🔨


pkg/testutils/lint/lint_test.go, line 775 at r4 (raw file):

			stream.GrepNot(`// SAFE FOR TESTING`),
		), func(s string) {
			t.Errorf("\n%s <- forbidden, use a sync.WaitGroup instead", s)

Refer to golang/go#31651 in the message to satisfy the curious.

@knz knz force-pushed the 20190424-parallel branch from cb5f1d6 to 0f1a488 Compare April 24, 2019 13:29
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @tbg)


pkg/testutils/lint/lint_test.go, line 775 at r4 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Refer to golang/go#31651 in the message to satisfy the curious.

Done.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 1 files at r5.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

knz added 3 commits April 24, 2019 15:53
Sub-tests that invoke `t.Parallel()` get to run concurrently with
their parent test, and may be delayed arbitrarily past beyond the end
of the termination of the parent test (including beyond the execution
of its `defer` calls).

This means it's unsafe to call `t.Parallel()` with
e.g. `leaktest.AfterTest()`.

This patch removes `t.Parallel` from the test in
`sql/physical_props_test.go` and modifies
`pkg/rpc/TestGRPCKeepaliveFailureFailsInflightRPCs` to use a wait
group instead.

Release note: None
…ightRPCs

Since this now runs on its own goroutine, it is invalid for it to
access the surrounding `testing.T`. This patch moves the code so that
`t` is clearly out of scope.

Release note: None
This suggests using `sync.WaitGroup` instead.

Release note: None
@knz knz force-pushed the 20190424-parallel branch from 0f1a488 to 6f30625 Compare April 24, 2019 13:54
@knz
Copy link
Contributor Author

knz commented Apr 24, 2019

bors r=tbg,RaduBerinde

craig bot pushed a commit that referenced this pull request Apr 24, 2019
37072: tests: avoid t.Parallel() when the top test has defers r=tbg,RaduBerinde a=knz

Sub-tests that invoke `t.Parallel()` get to run concurrently with
their parent test, and may be delayed arbitrarily past beyond the end
of the termination of the parent test (including beyond the execution
of its `defer` calls).

This means it's unsafe to call `t.Parallel()` with
e.g. `leaktest.AfterTest()`.

Release note: None

Co-authored-by: Raphael 'kena' Poss <knz@cockroachlabs.com>
@craig
Copy link
Contributor

craig bot commented Apr 24, 2019

Build succeeded

@craig craig bot merged commit 6f30625 into cockroachdb:master Apr 24, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants