Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest/bank: don't start chaos until nodes are serving SQL #30989

Merged
merged 3 commits into from
Oct 15, 2018

Conversation

nvanbenschoten
Copy link
Member

@nvanbenschoten nvanbenschoten commented Oct 4, 2018

Informs #30064.

Release note: None

@nvanbenschoten
Copy link
Member Author

Please hold off on this. I thought it would fix the second class of failures we saw in #30064, but I just had a repro so it doesn't seem like it did.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised why you didn't change the break on l156 into a return.

@nvanbenschoten
Copy link
Member Author

I still have this cleanup on my plate, including the l156 fix, but I'd like to actually get this test working without flakes first, which is turning out to be surprisingly tricky.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@nvanbenschoten nvanbenschoten changed the title roachtest/bank: don't deadlock on teardown roachtest/bank: don't start chaos until nodes are serving SQL Oct 15, 2018
@nvanbenschoten
Copy link
Member Author

Ok, this has been updated based on the discussion in #30064. PTAL.

Things changed a bit in #31011, so the first commit was replaced with a bit of cleanup around that change.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r1, 1 of 1 files at r2.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

	consistentIdx int,
) {
	s.waitGroup.Add(1)

Is this kosher? You're calling Add from a goroutine and then calling Wait from the main routine. I think this would hit an internal assertion in WaitGroup, except that the Wait doesn't happen right away.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

	// Don't begin the chaos monkey until all nodes are serving SQL connections.
	// This ensures that we don't test cluster initialization under chaos.
	for i := 0; i < c.nodes; i++ {

Nit: for i := 0; i <= c.nodes; i++. Then in the body you can use i instead of i+1.

Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Is this kosher? You're calling Add from a goroutine and then calling Wait from the main routine. I think this would hit an internal assertion in WaitGroup, except that the Wait doesn't happen right away.

Yeah, you're absolutely right.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

Previously, petermattis (Peter Mattis) wrote…

Nit: for i := 0; i <= c.nodes; i++. Then in the body you can use i instead of i+1.

Make it for i := 1; i <= c.nodes; i++ and you've got a deal.

Copy link
Member

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: mod Peter's comments which it looks you're fixing up right now.

Reviewed 1 of 1 files at r1, 1 of 1 files at r2, 1 of 1 files at r3.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

Copy link
Member Author

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)


pkg/cmd/roachtest/bank.go, line 210 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Yeah, you're absolutely right.

Done.


pkg/cmd/roachtest/bank.go, line 215 at r3 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Make it for i := 1; i <= c.nodes; i++ and you've got a deal.

Done.

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)

Copy link
Contributor

@a-robinson a-robinson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having just read through the recent comments on #30064 I agree this is the right change to make. We should also reduce the migration lease duration, though. There's no need for it to be so long given that we renew it in the background while running migrations.

@nvanbenschoten
Copy link
Member Author

TFTRs.

We should also reduce the migration lease duration, though.

I was hoping for a quick fix here, but it looks like this is tied to the sql.LeaseManager, which uses the same duration for all leases. This will have to wait on other improvements to the leasing system, like #24041.

bors r+

craig bot pushed a commit that referenced this pull request Oct 15, 2018
30989: roachtest/bank: don't start chaos until nodes are serving SQL r=nvanbenschoten a=nvanbenschoten

Informs #30064.

Release note: None

Co-authored-by: Nathan VanBenschoten <nvanbenschoten@gmail.com>
@craig
Copy link
Contributor

craig bot commented Oct 15, 2018

Build succeeded

@craig craig bot merged commit b1ab7dd into cockroachdb:master Oct 15, 2018
@nvanbenschoten nvanbenschoten deleted the nvabenschoten/backFix branch October 16, 2018 00:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants