Decrease rate of Raft heartbeat messages. #3708

danielmai · 2019-07-23T01:18:17Z

Setting Raft's HeartbeatTick to 100ms to reduce the number of heartbeats that need to be sent. This changes how often heartbeats are sent but does not affect the time needed for a leader election to occur if the leader is in a network partition (= 100ms * 20 = 2 seconds).

Notes on when leader re-election happens:

The followers of the group will need to wait for the election tick duration if the leader is (1) in a network partition or (2) forcibly killed (SIGKILL).
If the leader stops gracefully (SIGTERM) then a follower will receive a timeout message from the leader and immediately start an election, as shown in these logs from Node 0x2 where Node 0x1 was a leader and was stopped with SIGTERM:

I0723 01:03:15.444784       1 raft_server.go:233] RaftComm: [0x2] Received msg of type: MsgTimeoutNow from 0x1
I0723 01:03:15.444864       1 log.go:34] 2 [term 8] received MsgTimeoutNow from 1 and starts an election to get leadership.
I0723 01:03:15.444890       1 log.go:34] 2 became candidate at term 9
I0723 01:03:15.444898       1 log.go:34] 2 received MsgVoteResp from 2 at term 9
I0723 01:03:15.444974       1 log.go:34] 2 [logterm: 8, index: 3456] sent MsgVote request to 1 at term 9
I0723 01:03:15.445119       1 log.go:34] 2 [logterm: 8, index: 3456] sent MsgVote request to 3 at term 9
I0723 01:03:15.445199       1 log.go:34] raft.node: 2 lost leader 1 at term 9
I0723 01:03:15.447296       1 node.go:249] RaftComm: [0x2] Sending message of type MsgVote to 0x1
I0723 01:03:15.447372       1 node.go:249] RaftComm: [0x2] Sending message of type MsgVote to 0x3
I0723 01:03:15.450316       1 raft_server.go:233] RaftComm: [0x2] Received msg of type: MsgVoteResp from 0x1
I0723 01:03:15.450365       1 log.go:34] 2 received MsgVoteResp from 1 at term 9
I0723 01:03:15.450390       1 log.go:34] 2 [quorum:2] has received 2 MsgVoteResp votes and 0 vote rejections
I0723 01:03:15.450420       1 log.go:34] 2 became leader at term 9
I0723 01:03:15.450521       1 log.go:34] raft.node: 2 elected leader 2 at term 9
I0723 01:03:15.580813       1 groups.go:808] Leader idx=0x2 of group=1 is connecting to Zero for txn updates
I0723 01:03:15.580854       1 groups.go:817] Got Zero leader: zero1:5180
W0723 01:03:16.445513       1 pool.go:237] Connection lost with alpha1:7180. Error: rpc error: code = Unavailable desc = transport is closing
W0723 01:03:16.480873       1 node.go:415] Unable to send message to peer: 0x1. Error: EOF
I0723 01:03:17.453171       1 raft_server.go:233] RaftComm: [0x2] Received msg of type: MsgVoteResp from 0x3

These logs all happen within 1-2ms, but above we see that Node 0x2 starts an election at I0723 01:03:15.444864 even before the log message that acknowledges that it lost leader 0x1 at I0723 01:03:15.445199.

This change is

A new log "Called n.Raft().Tick()" appears whenever a Raft tick is called. Running `dgraph increment --num=10000` shows that proposals are going through just fine. The glog "Called n.Raft().Tick()" is called every 30 seconds. This doesn't change anything in conn/draft.go, so HeartbeatTick is configured for 30s and ElectionTick for 100*30s (50 minutes). When we stop an Alpha Steps to reproduce: shell 1: cd ./compose ./run.sh -a 3 -z 3 shell 2: dgraph increment --alpha localhost:9180 --num=100000 shell 3: curl localhost:6180/state # check who the leader is docker stop alpha1 # stop the leader (could be alpha1, alpha2, or alpha3) See the cluster logs in shell 1 for how long leader election takes. Ticks happen every 30s, as expected from this change: I0720 00:21:21.698651 1 draft.go:760] Called n.Raft().Tick(). I0720 00:21:51.698690 1 draft.go:760] Called n.Raft().Tick(). Leadership change detection, according to my understanding, should happen every 50 minutes. But when I stop the leader in shell 3 the leadership election happens immediately once I stop the leader process with SIGINT.

Setting Raft's HeartbeatTick to 100ms to reduce the number of heartbeats that need to be sent. This changes how often heartbeats are sent, but the election tick is still 2s.

danielmai · 2019-07-23T01:20:57Z

Decreasing how the rate of calling tick should reduce the chance of the warning log showing up that says A tick missed to fire. Node blocks too long!

martinmr

Just make sure you include the reason in the commit message. Otherwise it

Reviewed 2 of 2 files at r1.
Reviewable status: complete! all files reviewed, all discussions resolved (waiting on @manishrjain)

manishrjain

Yes, add a comment. I wrote an explanation.

Reviewed 2 of 2 files at r1.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @danielmai)

worker/draft.go, line 730 at r1 (raw file):

	var leader bool
	// See also our configuration of HeartbeatTick and ElectionTick.
	ticker := time.NewTicker(100 * time.Millisecond)

Add a comment here:

Before we used to have 20ms ticks, but they would overload the Raft tick channel, causing "ticks missed to fire" logs. Etcd uses 100ms and they haven't seen those issues. Additionally, using 100ms for ticks does not cause proposals to slow down, because they get sent out asap and don't rely on ticks. So, setting this to 100ms instead of 20ms is a NOOP.

danielmai · 2019-07-23T01:44:44Z

worker/draft.go, line 730 at r1 (raw file):

Previously, manishrjain (Manish R Jain) wrote…

Add a comment here:

Before we used to have 20ms ticks, but they would overload the Raft tick channel, causing "ticks missed to fire" logs. Etcd uses 100ms and they haven't seen those issues. Additionally, using 100ms for ticks does not cause proposals to slow down, because they get sent out asap and don't rely on ticks. So, setting this to 100ms instead of 20ms is a NOOP.

Done.

Before we used to have 20ms ticks, but they would overload the Raft tick channel, causing "A tick missed to fire. Node blocks too long!" logs. Etcd uses 100ms and they haven't seen those issues. Additionally, using 100ms for ticks does not cause proposals to slow down, because they get sent out asap and don't rely on ticks. So, setting this to 100ms instead of 20ms is a NOOP. (cherry picked from commit 4b41d9c)

In #3708 (4b41d9c) we bumped up the Raft heartbeat rate to 100ms ticks. This change also updates the heartbeat ticker for Zero too. Otherwise, Zero's election ticker would happen every 400ms instead of every 2s since raft.Config.ElectionTick is set to 20.

danielmai added 3 commits July 19, 2019 17:24

Decrease number of Raft heartbeats to every 100ms.

fe442ab

Setting Raft's HeartbeatTick to 100ms to reduce the number of heartbeats that need to be sent. This changes how often heartbeats are sent, but the election tick is still 2s.

Remove Raft tick log.

33f56fb

danielmai requested review from manishrjain and a team as code owners July 23, 2019 01:18

martinmr approved these changes Jul 23, 2019

View reviewed changes

manishrjain approved these changes Jul 23, 2019

View reviewed changes

Add comment explaining the change from 20ms to 100ms ticks.

256aa7f

danielmai merged commit 4b41d9c into master Jul 23, 2019

danielmai deleted the danielmai/raft-ticker branch July 23, 2019 17:21

danielmai mentioned this pull request Jul 23, 2019

release/v1.0: Decrease rate of Raft heartbeat messages. #3714

Merged

danielmai mentioned this pull request Aug 5, 2019

Fix Raft heartbeat rate for Zero. #3753

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Decrease rate of Raft heartbeat messages. #3708

Decrease rate of Raft heartbeat messages. #3708

danielmai commented Jul 23, 2019 •

edited

Loading

danielmai commented Jul 23, 2019

martinmr left a comment

manishrjain left a comment

danielmai commented Jul 23, 2019

Decrease rate of Raft heartbeat messages. #3708

Decrease rate of Raft heartbeat messages. #3708

Conversation

danielmai commented Jul 23, 2019 • edited Loading

danielmai commented Jul 23, 2019

martinmr left a comment

Choose a reason for hiding this comment

manishrjain left a comment

Choose a reason for hiding this comment

danielmai commented Jul 23, 2019

danielmai commented Jul 23, 2019 •

edited

Loading