kvserver: TestReplicateQueueDownReplicate #49812
Conversation
LGTM and thanks!
weird that it didn't repro. Did you try roachprod-stress?
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
pkg/kv/kvserver/replicate_queue_test.go, line 265 at r1 (raw file):
tc.ToggleReplicateQueues(false)
testKey := tc.ScratchRange(t)
Alternatively, you could throw in a call to tc.ScratchRange(t) before the call to WaitForFullReplication, so that in the future the ScratchRange is just at 3x initially, as probably everyone expects it to be. I assume you're not keen on potentially making other tests flaky, though.
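For illustration, a minimal sketch of that suggested ordering; the test name, cluster size, and cluster arguments here are assumptions, not code from this PR:

package kvserver_test

import (
	"context"
	"testing"

	"github.com/cockroachdb/cockroach/pkg/base"
	"github.com/cockroachdb/cockroach/pkg/testutils/testcluster"
)

// TestScratchRangeFullyReplicated is a hypothetical name; the point is only
// the ordering of ScratchRange and WaitForFullReplication.
func TestScratchRangeFullyReplicated(t *testing.T) {
	tc := testcluster.StartTestCluster(t, 3, base.TestClusterArgs{})
	defer tc.Stopper().Stop(context.Background())

	// Create the scratch range before waiting for full replication, so that
	// (per the suggestion above) the wait also covers it and the range would
	// already be at 3x when the test body starts.
	scratchKey := tc.ScratchRange(t)

	if err := tc.WaitForFullReplication(); err != nil {
		t.Fatal(err)
	}
	_ = scratchKey
}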
pkg/kv/kvserver/replicate_queue_test.go, line 268 at r1 (raw file):
desc := tc.LookupRangeOrFatal(t, testKey)
// At the end of StartTestCluster(), all ranges have 5 replicas since they're
// all "system ranges". When the ScratchRange(), it also starts up with 5
sentence is off.
pkg/testutils/testcluster/testcluster.go, line 756 at r1 (raw file):
// I think it shouldn't need any retries. See #38565.
func (tc *TestCluster) WaitForFullReplication() error {
	log.Infof(context.TODO(), "WaitForFullReplication")
Remove?
701ed98 to 8221950
I didn't roachprod-stress but I did stress it for hours on a single GCE worker. I couldn't stressrace it very long because of the tsan issue.
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @tbg)
pkg/kv/kvserver/replicate_queue_test.go, line 265 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Alternatively, you could throw in a call to tc.ScratchRange(t) before the call to WaitForFullReplication, so that in the future the ScratchRange is just at 3x initially, as probably everyone expects it to be. I assume you're not keen on potentially making other tests flaky, though.
yeah... I dunno where to add more magic. I'll leave it for next time.
pkg/kv/kvserver/replicate_queue_test.go, line 268 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
sentence is off.
done
pkg/testutils/testcluster/testcluster.go, line 756 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Remove?
this was intentional; I've found it useful in the past to tell when this phase of the testcluster startup runs.
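As a hedged sketch of what keeping that line looks like in context in pkg/testutils/testcluster (the wait loop itself is elided; only the logging point and the reason for keeping it are shown):

package testcluster

import (
	"context"

	"github.com/cockroachdb/cockroach/pkg/util/log"
)

// WaitForFullReplication waits for the cluster's ranges to be fully
// replicated (body elided in this sketch).
// I think it shouldn't need any retries. See #38565.
func (tc *TestCluster) WaitForFullReplication() error {
	// Kept intentionally: it makes it easy to tell from the test logs when
	// this phase of testcluster startup runs.
	log.Infof(context.TODO(), "WaitForFullReplication")
	// ... (replication wait elided)
	return nil
}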
Build succeeded
This test was trying to upreplicate a range and then wait for the
replication queue to downreplicate it. The test had rotted back when we
switched the replication factor of system ranges from 3x to 5x; with
5x replication, the range started off as having 5 replicas so the
upreplication step was unnecessary. Worse, the upreplication part was
flaky, presumably because it was constantly racing with the replication
queue trying to downreplicate the range (although I couldn't repro
despite a lot of stress).
Fixes #48284
Release note: None
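For context, a rough sketch of the test shape the commit message describes: the scratch range starts out 5x-replicated, and the test only needs to wait for the replicate queue to bring it down to the default 3x. The replicaCount helper and the descriptor field it reads are assumptions about the harness, not the code from this PR:

package kvserver_test

import (
	"testing"

	"github.com/cockroachdb/cockroach/pkg/roachpb"
	"github.com/cockroachdb/cockroach/pkg/testutils"
	"github.com/cockroachdb/cockroach/pkg/testutils/testcluster"
	"github.com/cockroachdb/errors"
)

// replicaCount is a hypothetical helper; the exact field or accessor for the
// replica set may differ in the real code.
func replicaCount(desc roachpb.RangeDescriptor) int {
	return len(desc.InternalReplicas)
}

// waitForDownReplication retries until the scratch range has been
// downreplicated from its initial 5 replicas to the expected 3.
func waitForDownReplication(t *testing.T, tc *testcluster.TestCluster, testKey roachpb.Key) {
	testutils.SucceedsSoon(t, func() error {
		desc := tc.LookupRangeOrFatal(t, testKey)
		if n := replicaCount(desc); n != 3 {
			return errors.Errorf("expected 3 replicas, found %d", n)
		}
		return nil
	})
}

The scratch-range setup and replicate-queue toggling quoted earlier in the review would precede this wait.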