TestRaftRemoveRace is flaky with multiple CPUs #2417

Closed
tamird opened this issue Sep 9, 2015 · 2 comments

tamird commented Sep 9, 2015

A couple of different failures:

make test PKG=./storage TESTS=TestRaftRemoveRace CPUS=4,4,4 TESTFLAGS='-count 10'
go test -tags ''  -i ./storage
go test -tags ''  -run TestRaftRemoveRace -cpu 4,4,4 ./storage -timeout 1m10s -count 10
panic: requested entry at index is unavailable

goroutine 4775 [running]:
github.com/coreos/etcd/raft.(*raftLog).term(0xc8204a2380, 0x30, 0x100, 0x0, 0x0)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/log.go:231 +0x1a9
github.com/coreos/etcd/raft.(*raftLog).matchTerm(0xc8204a2380, 0x30, 0x6, 0x42d74cb)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/log.go:265 +0x2b
github.com/coreos/etcd/raft.(*raftLog).maybeAppend(0xc8204a2380, 0x30, 0x6, 0x35, 0xc82029dd40, 0x5, 0x8, 0x35, 0xc8202f7800)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/log.go:78 +0x5d
github.com/coreos/etcd/raft.(*raft).handleAppendEntries(0xc820283930, 0x3, 0x300000003, 0x100000001, 0x6, 0x6, 0x30, 0xc82029dd40, 0x5, 0x8, ...)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/raft.go:684 +0x1f1
github.com/coreos/etcd/raft.stepFollower(0xc820283930, 0x3, 0x300000003, 0x100000001, 0x6, 0x6, 0x30, 0xc82029dd40, 0x5, 0x8, ...)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/raft.go:655 +0x3b9
github.com/coreos/etcd/raft.(*raft).Step(0xc820283930, 0x3, 0x300000003, 0x100000001, 0x6, 0x6, 0x30, 0xc82029dd40, 0x5, 0x8, ...)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/raft.go:508 +0x2a8
github.com/coreos/etcd/raft.(*multiNode).run(0xc8201e9860)
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/multinode.go:238 +0x114c
created by github.com/coreos/etcd/raft.StartMultiNode
    /Users/tamird/src/go/src/github.com/coreos/etcd/raft/multinode.go:56 +0x30d
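
The trace above runs handleAppendEntries -> maybeAppend -> matchTerm -> term, and the panic message comes from the term lookup. As a rough illustration only (this is not etcd's code), here is a toy Go log in which a lookup for an index the log does not hold yields the same error. `toyLog`, its fields, and the error-returning `matchTerm` are hypothetical; the index 0x30 and term 6 are taken from the trace arguments, and the toy offset of 50 is an assumption chosen so that the index falls outside the held range.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrUnavailable reuses the message seen in the panic above.
var ErrUnavailable = errors.New("requested entry at index is unavailable")

// toyLog is a hypothetical, simplified log: terms[k] is the term of the
// entry at index offset+k. Entries outside that window are not held.
type toyLog struct {
	offset uint64
	terms  []uint64
}

// term returns the term of entry i, or ErrUnavailable when i is outside
// the range of entries this log holds.
func (l *toyLog) term(i uint64) (uint64, error) {
	if i < l.offset || i >= l.offset+uint64(len(l.terms)) {
		return 0, ErrUnavailable
	}
	return l.terms[i-l.offset], nil
}

// matchTerm reports whether entry i has the given term. Unlike the code in
// the trace, this sketch returns the lookup error instead of panicking.
func (l *toyLog) matchTerm(i, term uint64) (bool, error) {
	t, err := l.term(i)
	if err != nil {
		return false, err
	}
	return t == term, nil
}

func main() {
	// Assumed state: the log holds entries 50..52 only.
	l := &toyLog{offset: 50, terms: []uint64{6, 6, 7}}
	ok, err := l.matchTerm(0x30, 6) // index 48 is below the held range
	fmt.Println(ok, err)
}
```

The trace does not say why the entry at that index was missing on this replica, so the sketch only models the lookup failing, not its cause.
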
make test PKG=./storage TESTS=TestRaftRemoveRace CPUS=4,4,4 TESTFLAGS='-count 10'
go test -tags ''  -i ./storage
go test -tags ''  -run TestRaftRemoveRace -cpu 4,4,4 ./storage -timeout 1m10s -count 10
I0908 21:22:07.987061 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:09.272085 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:10.418441 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:11.567381 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:12.847565 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:14.064740 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
E0908 21:22:15.175041 45827 multiraft/transport.go:179  sending rpc failed: read tcp 127.0.0.1:56742->127.0.0.1:56739: read: connection reset by peer
I0908 21:22:15.182221 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:16.489833 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:17.816160 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:18.941213 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:20.282636 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:21.411305 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:22.536976 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:23.815758 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:24.996491 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:26.173720 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:27.358105 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:28.567721 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:29.746479 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:30.977441 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
W0908 21:22:32.479388 45827 multiraft/multiraft.go:962  node 200000002 failed to send message to 100000001: multiraft/transport.go:135: transport is closed
W0908 21:22:32.479642 45827 multiraft/multiraft.go:962  node 200000002 failed to send message to 300000003: multiraft/transport.go:135: transport is closed
I0908 21:22:32.508499 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:33.689695 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:35.031316 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:36.279998 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:37.515040 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:38.948663 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
W0908 21:22:39.582743 45827 multiraft/multiraft.go:916  aborting configuration change: key range ""-"" outside of bounds of range ""-""
W0908 21:22:39.584253 45827 multiraft/multiraft.go:916  aborting configuration change: key range ""-"" outside of bounds of range ""-""
W0908 21:22:39.647524 45827 multiraft/multiraft.go:916  aborting configuration change: key range ""-"" outside of bounds of range ""-""
--- FAIL: TestRaftRemoveRace (1.75s)
    testing.go:150: storage/client_test.go:346: condition failed to evaluate within 1s: storage/client_test.go:342: range not found on store 2
I0908 21:22:40.689118 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:42.009783 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:43.480678 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
I0908 21:22:44.729970 45827 storage/replica_command.go:1045  range 1: new leader lease replica 1:1 19:00:00.000 +1.000s
FAIL
FAIL    github.com/cockroachdb/cockroach/storage    38.302s
make: *** [test] Error 1
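
The second failure message, `condition failed to evaluate within 1s: ... range not found on store 2`, has the shape of a check that is retried until a deadline. The helper behind it is not shown in this thread, so the following is only a minimal Go sketch of that pattern; `succeedsWithin` and `errNotFound` are hypothetical names, and the polling interval is an assumption.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errNotFound stands in for the condition's failure in this sketch.
var errNotFound = errors.New("range not found on store 2")

// succeedsWithin is a hypothetical helper: it calls fn repeatedly, sleeping
// interval between attempts, until fn returns nil or timeout has elapsed.
// On timeout it wraps the last error from fn.
func succeedsWithin(timeout, interval time.Duration, fn func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := fn()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("condition failed to evaluate within %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

func main() {
	attempts := 0
	// The condition fails twice and then succeeds.
	err := succeedsWithin(time.Second, 5*time.Millisecond, func() error {
		attempts++
		if attempts < 3 {
			return errNotFound
		}
		return nil
	})
	fmt.Println(err, attempts)
}
```

With a check like this, a failure can mean either that the condition never becomes true or that it needs longer than the timeout on a busy machine; the log alone does not say which.
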
@bdarnell

Looks like a dupe of #1878; the fix is to complete the work in the replica_tombstone RFC.

tamird commented Sep 10, 2015

Updated #1878. Closing.

tamird closed this as completed Sep 10, 2015