-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: transfer leadership in case of error #12293
Conversation
When a Nomad server becomes the Raft leader, it must perform several actions defined in the establishLeadership function. If any of these actions fail, Raft will think the node is the leader, but it will not actually be able to act as a Nomad leader. In this scenario, leadership must be revoked and transferred to another server if possible, or the node should retry the establishLeadership steps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
nomad/leader.go
Outdated
func (s *Server) leadershipTransfer() error { | ||
retryCount := 3 | ||
for i := 0; i < retryCount; i++ { | ||
future := s.raft.LeadershipTransfer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LeadershipTransfer
will fail if the Raft version is < 3
, which would cause the cluster to get stuck in a leaderless infinite loop since we always retry if leadershipTransfer
returns an error.
2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader"
2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:31.725-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts"
2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader"
2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:36.822-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts"
2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to establish leadership: error="i can't be a leader"
2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=0 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=1 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership attempt, will retry: attempt=2 retry_limit=3 error="operation not supported with current protocol version"
2022-03-14T17:13:41.908-0400 [ERROR] nomad: failed to transfer leadership: error="failed to transfer leadership in 3 attempts"
Should we keep the previous (erroneous) behaviour if Raft is not v3? Or is there something else we can do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep the previous (erroneous) behaviour if Raft is not v3?
That seems fine to me. Let's add a note in the upgrade guide about this behavior change. If nothing else it's a nice concrete feature to encourage upgrading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cool, I will work on that, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@schmichael After staring waaay too long at this, I realized that the previous behaviour is actually for the leader to get stuck, which is also the point of this fix, so there isn't much we can do about it.
In 2dc1ff1 I changed the logic a little to return early and with a more meaningful message if the problem is unsupported Raft version and in 772abb6 I added the upgrade guide notice.
nomad/leader.go
Outdated
func (s *Server) leadershipTransfer() error { | ||
retryCount := 3 | ||
for i := 0; i < retryCount; i++ { | ||
future := s.raft.LeadershipTransfer() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we keep the previous (erroneous) behaviour if Raft is not v3?
That seems fine to me. Let's add a note in the upgrade guide about this behavior change. If nothing else it's a nice concrete feature to encourage upgrading.
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
When a Nomad server becomes the Raft leader, it must perform several
actions defined in the establishLeadership function. If any of these
actions fail, Raft will think the node is the leader, but it will not
actually be able to act as a Nomad leader.
In this scenario, leadership must be revoked and transferred to another
server if possible, or the node should retry the establishLeadership
steps.
Closes #8470
Note to reviewers: this work was basically lifted from hashicorp/consul#5247 and I wasn't able to create a reliable test, so I tested it manually with a forced error in one of the servers. The log output looked like this after forcing an election by shutting down the cluster leader:
I also paraphrased some of the original comments to test my own understanding of what's going on. If any of them seems incorrect it's probably my fault 😅