You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Pick any node in the cluster; call it slow-leader.
Set up a libfaketime wrapper for that node's PD server which runs it at 0.5x realtime.
For all other nodes, set their wrappers to run at 1x realtime.
Force slow-leader to become a leader by calling any PD node's /leader/transfer API.
Create a network partition which splits the network into randomly selected components: a majority component, with n/2+1 nodes, and a minority component, containing slow-leader, with n/2-1 nodes.
Wait.
You can reproduce this problem with Jepsen 479745e, by running
lein run test -w append --time-limit 60 --nemesis slow-primary
Note that the test will (at least for me) stall indefinitely as the nemesis waits for a new leader to come online.
What did you expect to see?
I think under these conditions that the majority component should elect a new PD leader and go on to service requests; their clocks all run at 1x realtime, they have uninterrupted network connectivity to each other, a majority of the cluster, all are running...
What did you see instead?
Sometimes, the majority-component PD nodes get stuck, and are unable to elect a leader. It looks like different components of PD are failing to restart, or initialize, due to dropped Raft proposals?
TiDB might log:
[2019/05/22 11:18:40.813 -07:00] [ERROR] [kv.go:245] ["fail to load safepoint from pd"] [error="context deadline exceeded"] [errorVerbose="context deadline exceeded\ngithub.com/pingcap/errors.AddStack\n\t/home/jenkins/workspace/tidb_ghpr_build/go/pkg/mod/github.com/pingcap/errors@v0.11.1/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/home/jenkins/workspace/tidb_ghpr_build/go/pkg/mod/github.com/pingcap/errors@v0.11.1/juju_adaptor.go:15\ngithub.com/pingcap/tidb/store/tikv.(*EtcdSafePointKV).Get\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/store/tikv/safepoint.go:108\ngithub.com/pingcap/tidb/store/tikv.loadSafePoint\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/store/tikv/safepoint.go:127\ngithub.com/pingcap/tidb/store/tikv.(*tikvStore).runSafePointChecker\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/store/tikv/kv.go:238\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"] [stack="github.com/pingcap/tidb/store/tikv.(*tikvStore).runSafePointChecker\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/store/tikv/kv.go:245"]
[2019/05/22 11:18:40.861 -07:00] [WARN] [manager.go:170] ["failed to new session to etcd"] [ownerInfo="[ddl] /tidb/ddl/all_schema_versions/cc9181ed-51b0-4b59-b714-44640432856d"] [error="rpc error: code = Unknown desc = raft proposal dropped"]
[2019/05/22 11:18:40.889 -07:00] [ERROR] [domain.go:417] ["server restart failed"] [error="rpc error: code = Unknown desc = raft proposal dropped"] [errorVerbose="rpc error: code = Unknown desc = raft proposal dropped\ngithub.com/pingcap/errors.AddStack\n\t/home/jenkins/workspace/tidb_ghpr_build/go/pkg/mod/github.com/pingcap/errors@v0.11.1/errors.go:174\ngithub.com/pingcap/errors.Trace\n\t/home/jenkins/workspace/tidb_ghpr_build/go/pkg/mod/github.com/pingcap/errors@v0.11.1/juju_adaptor.go:15\ngithub.com/pingcap/tidb/owner.NewSession\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/owner/manager.go:176\ngithub.com/pingcap/tidb/domain.(*InfoSyncer).newSessionAndStoreServerInfo\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/domain/info.go:166\ngithub.com/pingcap/tidb/domain.(*InfoSyncer).Restart\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/domain/info.go:157\ngithub.com/pingcap/tidb/domain.(*Domain).infoSyncerKeeper\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/domain/domain.go:416\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"] [stack="github.com/pingcap/tidb/domain.(*Domain).infoSyncerKeeper\n\t/home/jenkins/workspace/tidb_ghpr_build/go/src/github.com/pingcap/tidb/domain/domain.go:417"]
[2019/05/22 11:18:40.889 -07:00] [INFO] [domain.go:419] ["server info syncer restarted"]
[2019/05/22 11:18:40.889 -07:00] [INFO] [domain.go:415] ["server info syncer need to restart"]
[2019/05/22 11:18:40.889 -07:00] [WARN] [manager.go:170] ["failed to new session to etcd"] [ownerInfo="[Info-syncer] /tidb/server/info/cc9181ed-51b0-4b59-b714-44640432856d"] [error="rpc error: code = Unknown desc = raft proposal dropped"]
While PD seems to get stuck repeatedly trying to hold elections for a single term, because of leases which aren't expired? Lease values fluctuate though...
[logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 4)
2019/05/22 11:21:21.081 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:21.139 raft.go:857: [info] 5aa594b5d9b66c42 is starting a new election at term 3
2019/05/22 11:21:21.139 raft.go:684: [info] 5aa594b5d9b66c42 became pre-candidate at term 3
2019/05/22 11:21:21.139 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 5aa594b5d9b66c42 at term 3
2019/05/22 11:21:21.139 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 7f6143cbd22aca00 at term 3
2019/05/22 11:21:21.139 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to f82e563e5c75137e at term 3
2019/05/22 11:21:21.139 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 9b116f88cab4dc9 at term 3
2019/05/22 11:21:21.139 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 5440ff22fe632778 at term 3
2019/05/22 11:21:21.139 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from f82e563e5c75137e [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
2019/05/22 11:21:21.282 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:21.439 raft.go:857: [info] 5aa594b5d9b66c42 is starting a new election at term 3
2019/05/22 11:21:21.439 raft.go:684: [info] 5aa594b5d9b66c42 became pre-candidate at term 3
2019/05/22 11:21:21.439 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 5aa594b5d9b66c42 at term 3
2019/05/22 11:21:21.439 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 7f6143cbd22aca00 at term 3
2019/05/22 11:21:21.439 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to f82e563e5c75137e at term 3
2019/05/22 11:21:21.439 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 9b116f88cab4dc9 at term 3
2019/05/22 11:21:21.439 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 5440ff22fe632778 at term 3
2019/05/22 11:21:21.439 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from f82e563e5c75137e [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
2019/05/22 11:21:21.441 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from 7f6143cbd22aca00 [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
2019/05/22 11:21:21.482 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:21.683 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:21.739 raft.go:857: [info] 5aa594b5d9b66c42 is starting a new election at term 3
2019/05/22 11:21:21.739 raft.go:684: [info] 5aa594b5d9b66c42 became pre-candidate at term 3
2019/05/22 11:21:21.739 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 5aa594b5d9b66c42 at term 3
2019/05/22 11:21:21.739 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 9b116f88cab4dc9 at term 3
2019/05/22 11:21:21.739 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 5440ff22fe632778 at term 3
2019/05/22 11:21:21.739 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 7f6143cbd22aca00 at term 3
2019/05/22 11:21:21.739 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to f82e563e5c75137e at term 3
2019/05/22 11:21:21.739 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from f82e563e5c75137e [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
2019/05/22 11:21:21.739 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 7f6143cbd22aca00 at term 3
2019/05/22 11:21:21.739 raft.go:1148: [info] 5aa594b5d9b66c42 [quorum:3] has received 2 MsgPreVoteResp votes and 0 vote rejections
2019/05/22 11:21:21.884 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:21.939 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from 7f6143cbd22aca00 [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 2)
2019/05/22 11:21:22.039 raft.go:857: [info] 5aa594b5d9b66c42 is starting a new election at term 3
2019/05/22 11:21:22.039 raft.go:684: [info] 5aa594b5d9b66c42 became pre-candidate at term 3
2019/05/22 11:21:22.039 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 5aa594b5d9b66c42 at term 3
2019/05/22 11:21:22.039 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 9b116f88cab4dc9 at term 3
2019/05/22 11:21:22.039 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 5440ff22fe632778 at term 3
2019/05/22 11:21:22.039 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 7f6143cbd22aca00 at term 3
2019/05/22 11:21:22.039 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to f82e563e5c75137e at term 3
2019/05/22 11:21:22.039 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from f82e563e5c75137e [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
2019/05/22 11:21:22.085 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:22.228 log.go:86: [warning] etcdserver: [timed out waiting for read index response]
2019/05/22 11:21:22.286 raft.go:1135: [info] 5aa594b5d9b66c42 no leader at term 3; dropping proposal
2019/05/22 11:21:22.339 raft.go:857: [info] 5aa594b5d9b66c42 is starting a new election at term 3
2019/05/22 11:21:22.339 raft.go:684: [info] 5aa594b5d9b66c42 became pre-candidate at term 3
2019/05/22 11:21:22.339 raft.go:755: [info] 5aa594b5d9b66c42 received MsgPreVoteResp from 5aa594b5d9b66c42 at term 3
2019/05/22 11:21:22.339 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 9b116f88cab4dc9 at term 3
2019/05/22 11:21:22.339 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 5440ff22fe632778 at term 3
2019/05/22 11:21:22.339 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to 7f6143cbd22aca00 at term 3
2019/05/22 11:21:22.339 raft.go:742: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463] sent MsgPreVote request to f82e563e5c75137e at term 3
2019/05/22 11:21:22.339 raft.go:782: [info] 5aa594b5d9b66c42 [logterm: 3, index: 463, vote: 9b116f88cab4dc9] ignored MsgPreVote from f82e563e5c75137e [logterm: 3, index: 463] at term 3: lease is not expired (remaining ticks: 6)
What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?
This occurred with TiDB 3.0.0-beta1; I'm confirming with TiDB 3.0.0-beta.1-40.
The text was updated successfully, but these errors were encountered:
aphyr
changed the title
PD can't elect a leader during a n/2+1 network partition
PD can get stuck electing a leader during a n/2+1 network partition
May 22, 2019
Thanks, @aphyr, this is already known bug about PreVote in raft, there is an issue in detail: etcd-io/etcd#10355. We fix it in v3.0.0-rc.1 after upgrade etcd, can
you use the new version to try to test again?
Ah, excellent! Yeah, I've been having trouble reproducing this on 3.0.0-beta.1-40 too. I think it's likely fixed in later versions. I am still seeing slow leader elections (2+ minutes?), but the logs look different, so I think that might be a different issue.
Bug Report
slow-leader
.slow-leader
to become a leader by calling any PD node's/leader/transfer
API.slow-leader
, with n/2-1 nodes.You can reproduce this problem with Jepsen 479745e, by running
Note that the test will (at least for me) stall indefinitely as the nemesis waits for a new leader to come online.
I think under these conditions that the majority component should elect a new PD leader and go on to service requests; their clocks all run at 1x realtime, they have uninterrupted network connectivity to each other, a majority of the cluster, all are running...
Sometimes, the majority-component PD nodes get stuck, and are unable to elect a leader. It looks like different components of PD are failing to restart, or initialize, due to dropped Raft proposals?
TiDB might log:
While PD seems to get stuck repeatedly trying to hold elections for a single term, because of leases which aren't expired? Lease values fluctuate though...
tidb-server -V
or runselect tidb_version();
on TiDB)?This occurred with TiDB 3.0.0-beta1; I'm confirming with TiDB 3.0.0-beta.1-40.
20190522T140201.000-0400.zip
The text was updated successfully, but these errors were encountered: