server: transfer leadership in case of error #12293

lgfa29 · 2022-03-14T20:48:13Z

When a Nomad server becomes the Raft leader, it must perform several
actions defined in the establishLeadership function. If any of these
actions fail, Raft will think the node is the leader, but it will not
actually be able to act as a Nomad leader.

In this scenario, leadership must be revoked and transferred to another
server if possible, or the node should retry the establishLeadership
steps.

Closes #8470

Note to reviewers: this work was basically lifted from hashicorp/consul#5247 and I wasn't able to create a reliable test, so I tested it manually with a forced error in one of the servers. The log output looked like this after forcing an election by shutting down the cluster leader:

    2022-03-14T16:35:05.080-0400 [WARN]  nomad.raft: Election timeout reached, restarting election
    2022-03-14T16:35:05.080-0400 [INFO]  nomad.raft: entering candidate state: node="Node at 192.168.1.173:4647 [Candidate]" term=4
    2022-03-14T16:35:05.368-0400 [INFO]  nomad.raft: election won: tally=2
    2022-03-14T16:35:05.368-0400 [INFO]  nomad.raft: entering leader state: leader="Node at 192.168.1.173:4647 [Leader]"
    2022-03-14T16:35:05.368-0400 [INFO]  nomad.raft: added peer, starting replication: peer=302a41de-8b12-1226-a826-8203c39d34bc
    2022-03-14T16:35:05.368-0400 [INFO]  nomad.raft: added peer, starting replication: peer=7e714096-eeb6-bc65-942f-81e7457984d6
    2022-03-14T16:35:05.368-0400 [INFO]  nomad: cluster leadership acquired
    2022-03-14T16:35:05.368-0400 [INFO]  nomad.raft: pipelining replication: peer="{Voter 7e714096-eeb6-bc65-942f-81e7457984d6 192.168.1.173:7677}"
    2022-03-14T16:35:05.550-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader"
    2022-03-14T16:35:05.553-0400 [INFO]  nomad: successfully transferred leadership: attempt=0 retry_limit=3

I also paraphrased some of the original comments to test my own understanding of what's going on. If any of them seems incorrect it's probably my fault 😅

When a Nomad server becomes the Raft leader, it must perform several actions defined in the establishLeadership function. If any of these actions fail, Raft will think the node is the leader, but it will not actually be able to act as a Nomad leader. In this scenario, leadership must be revoked and transferred to another server if possible, or the node should retry the establishLeadership steps.

tgross

LGTM

lgfa29 · 2022-03-14T21:23:26Z

nomad/leader.go

+func (s *Server) leadershipTransfer() error {
+	retryCount := 3
+	for i := 0; i < retryCount; i++ {
+		future := s.raft.LeadershipTransfer()


LeadershipTransfer will fail if the Raft version is < 3, which would cause the cluster to get stuck in a leaderless infinite loop since we always retry if leadershipTransfer returns an error.

2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader" 2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts" 2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader" 2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts" 2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader" 2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version" 2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts"

Should we keep the previous (erroneous) behaviour if Raft is not v3? Or is there something else we can do?

Should we keep the previous (erroneous) behaviour if Raft is not v3?

That seems fine to me. Let's add a note in the upgrade guide about this behavior change. If nothing else it's a nice concrete feature to encourage upgrading.

Cool, I will work on that, thanks!

@schmichael After staring waaay too long at this, I realized that the previous behaviour is actually for the leader to get stuck, which is also the point of this fix, so there isn't much we can do about it.

In 2dc1ff1 I changed the logic a little to return early and with a more meaningful message if the problem is unsupported Raft version and in 772abb6 I added the upgrade guide notice.

schmichael · 2022-03-14T23:14:37Z

nomad/leader.go

+func (s *Server) leadershipTransfer() error {
+	retryCount := 3
+	for i := 0; i < retryCount; i++ {
+		future := s.raft.LeadershipTransfer()


Should we keep the previous (erroneous) behaviour if Raft is not v3?

That seems fine to me. Let's add a note in the upgrade guide about this behavior change. If nothing else it's a nice concrete feature to encourage upgrading.

website/content/docs/upgrade/upgrade-specific.mdx

github-actions · 2022-10-25T02:45:14Z

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

lgfa29 added 2 commits March 14, 2022 16:16

changelog: add entry for #12293

b15b7be

tgross approved these changes Mar 14, 2022

View reviewed changes

vercel bot deployed to Preview – nomad-storybook-and-ui March 14, 2022 21:02 View deployment

vercel bot temporarily deployed to Preview – nomad March 14, 2022 21:02 Inactive

lgfa29 commented Mar 14, 2022

View reviewed changes

lgfa29 requested review from schmichael and shoenig March 14, 2022 21:28

lgfa29 added this to the 1.3.0 milestone Mar 14, 2022

schmichael approved these changes Mar 14, 2022

View reviewed changes

shoenig approved these changes Mar 15, 2022

View reviewed changes

lgfa29 added 2 commits March 16, 2022 18:20

return early when leadership transfer fail due to protocol version

2dc1ff1

docs: add upgrade note about raft leadership transfer

4ffa808

vercel bot deployed to Preview – nomad-storybook-and-ui March 16, 2022 22:21 View deployment

vercel bot deployed to Preview – nomad March 16, 2022 22:21 View deployment

shoenig reviewed Mar 16, 2022

View reviewed changes

website/content/docs/upgrade/upgrade-specific.mdx Outdated Show resolved Hide resolved

docs: rewrite raft leadership transfer upgrade note

772abb6

vercel bot deployed to Preview – nomad-storybook-and-ui March 16, 2022 22:47 View deployment

vercel bot deployed to Preview – nomad March 16, 2022 22:47 View deployment

lgfa29 merged commit dfe520a into main Mar 17, 2022

lgfa29 deleted the f-transfer-raft-leadership branch March 17, 2022 15:10

github-actions bot locked as resolved and limited conversation to collaborators Oct 25, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

server: transfer leadership in case of error #12293

server: transfer leadership in case of error #12293

lgfa29 commented Mar 14, 2022

tgross left a comment

lgfa29 Mar 14, 2022

schmichael Mar 14, 2022

lgfa29 Mar 14, 2022

lgfa29 Mar 16, 2022 •

edited

Loading

schmichael Mar 14, 2022

github-actions bot commented Oct 25, 2022

server: transfer leadership in case of error #12293

server: transfer leadership in case of error #12293

Conversation

lgfa29 commented Mar 14, 2022

tgross left a comment

Choose a reason for hiding this comment

lgfa29 Mar 14, 2022

Choose a reason for hiding this comment

schmichael Mar 14, 2022

Choose a reason for hiding this comment

lgfa29 Mar 14, 2022

Choose a reason for hiding this comment

lgfa29 Mar 16, 2022 • edited Loading

Choose a reason for hiding this comment

schmichael Mar 14, 2022

Choose a reason for hiding this comment

github-actions bot commented Oct 25, 2022

lgfa29 Mar 16, 2022 •

edited

Loading