Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

backport-2.1: assorted testing-related backports #30503

Conversation

tbg
Copy link
Member

@tbg tbg commented Sep 21, 2018

Backport:

Please see individual PRs for details.

/cc @cockroachdb/release

We saw a consistency failure in cockroachdb#29252 that would've been much more
useful had it occurred close to the time around which the inconsistency
must have been introduced. Instead of leaving it to chance, add a switch
that runs aggressive checks in (roach) tests that want them such as the
clearrange test.

Release note: None
Unfortunately, the method to determine the range count is quite slow
since crdb_internal.ranges internally sends an RPC for each range to
determine the leaseholder.

Anecdotally, I've seen ~25% of the merges completed after less than 15
minutes. I know that it's slowing down over time, but @benesch will fix
that.

Also throws in aggressive consistency checks so that when something goes
out of sync, we find out right there.

Release note: None
Nobody likes scrolling for minutes at a time.

Release note: None
Previously, the message for existing issues didn't contain the output.
This was annoying as you care about the most recent failures first and
foremost, typically.

Release note: None
This tends to create duplicate issues for a similar root cause. I think
it's better to collect failures from different branches in an umbrella
issue, even though it implies that we need to be more careful about
closing them.

Release note: None
all of its subtests are already stable, but in running a test locally I
noticed that the top-level test was marked as passing as unstable. I'm
not sure, but this might mean that the top-level test would actually not
fail? Either way, better to mark it as stable explicitly.

We should also spend some thought on how diverging notions of Stable in
sub vs top level test are treated, not sure that this is well-defined.

Release note: None
setReplicaID refreshes the proposal and was thus synchronously writing
to the commandProposed chan. This channel could have filled up due to
an earlier reproposal already, deadlocking the test.

Fixes cockroachdb#28132.

Release note: None
AddReplicas was verifying that a replica had indeed been added, but
there's no guarantee that the replicate queue wouldn't have removed
it in the meantime. Attempt to work around this somewhat. The real
solution is not to provide that guarantee, but some tests likely
rely on it (and the failure is extremely rare, i.e. the new for
loop basically never runs).

Observed in cockroachdb#28368.

Release note: None
The test could deadlock if the callback fired during shutdown due to a
send on an unbuffered channel. Haven't been able to get this to fail
after this commit.

Fixes cockroachdb#29982.

Release note: None
The teamcity output is so much less pleasant to read. Better to just
open the artifact sometimes.

Release note: None
The kill signal was sometimes a noop (when issued before the process
to be killed started). Prior to this patch, that would leave the test
stuck.

Fixes cockroachdb#30475.

Release note: None
I suspect not doing so previously can cause issues like cockroachdb#30397.
Local roachtests are simply not set up to run concurrently, but once a
test times out we're doing that (though we attempt to teardown the
cluster, tests can apparently still "do stuff" with it, for example
the rapid restarts test was running one-off invocations of CockroachDB
via Exec; also note the comment near the added lines suggesting that
in local clusters we may not actually be destroying them).

We can revisit this when we allow multiple local clusters in parallel,
as we can then use a different one.

Optimistically:

Fixes cockroachdb#30397.

Release note: None
@tbg tbg requested review from a team September 21, 2018 15:11
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tbg tbg merged commit 7048f0e into cockroachdb:release-2.1 Sep 22, 2018
@tbg tbg deleted the backport2.1-29646-30420-30419-30405-30450-30451-30452-30455-30456-30373-30479-30496-30497 branch September 22, 2018 05:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants