roachtest: deflake decommission/nodes=4 test #51907

irfansharif
Contributor

Fixes #51713. Previously this test decommissioned and wiped node 1,
which interacted badly with #51329, where node 1 is tasked with
initializing the cluster (and no-ops the attempt on finding a persisted
file). Because the test wiped the store dir in its entirety, it then
attempted to re-initialize the cluster and naturally failed to do so.
As a short-term workaround we now decommission+wipe nodes towards the
end of the cluster list (so node 4) instead of the beginning. While
here, clean the test code up a bit.

Release note: None.

@irfansharif irfansharif requested review from knz and tbg July 27, 2020 04:51
Member

@RaduBerinde RaduBerinde left a comment

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @irfansharif, @knz, and @tbg)


pkg/cmd/roachtest/decommission.go, line 84 at r1 (raw file):

	// but conceivably we'll want to run a test with replication factor five
	// at some point.
	numDecom := (defaultReplicationFactor - 1) / 2

[nit] maybe add a check that nodes > numDecom
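
A minimal sketch of what such a check could look like, combined with the "decommission from the end of the cluster list" approach from the PR description. This is not the code in the PR: `nodesToDecommission` is a hypothetical helper, and the value 3 for `defaultReplicationFactor` is an assumption made for illustration.

```go
package main

import "fmt"

// defaultReplicationFactor mirrors the name in the quoted snippet; the value 3
// is an assumption made for this sketch.
const defaultReplicationFactor = 3

// nodesToDecommission is a hypothetical helper, not part of the PR. It returns
// the 1-indexed IDs of the last numDecom nodes in a cluster of the given size,
// so that node 1 (which initializes the cluster) is never picked.
func nodesToDecommission(nodes int) ([]int, error) {
	numDecom := (defaultReplicationFactor - 1) / 2
	// The check suggested in the review: the cluster must have more nodes
	// than we intend to decommission.
	if nodes <= numDecom {
		return nil, fmt.Errorf("need more than %d nodes, got %d", numDecom, nodes)
	}
	ids := make([]int, 0, numDecom)
	for id := nodes - numDecom + 1; id <= nodes; id++ {
		ids = append(ids, id)
	}
	return ids, nil
}
```

With these assumptions, a four-node cluster yields a single candidate, node 4, which matches the workaround described above.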

@irfansharif irfansharif force-pushed the 200723.rewrite-decommission/nodes-roachtest branch from 34eb0f0 to 30aaf6d Compare July 28, 2020 14:19
@irfansharif
Contributor Author

TFTRs!

bors r+

@craig
Contributor

craig bot commented Jul 28, 2020

Build succeeded:

@craig craig bot merged commit 4c24889 into cockroachdb:master Jul 28, 2020
@irfansharif irfansharif deleted the 200723.rewrite-decommission/nodes-roachtest branch July 28, 2020 15:33
Successfully merging this pull request may close these issues.

roachtest: decommission/nodes=4/duration=1h0m0s failed