What would you like to be added?
Bring a leader election budget mechanism to etcd's leader election.
Goal: a member whose budget has gone negative is not voted for during a configured period.
h2. overall workflow:
h3. normal flow:
1. All members start with a default budget when they join the cluster (e.g. 10).
2. Every member keeps a record of the other members and their budgets (e.g. a:10, b:10, c:10, ...).
3. Every member reduces a candidate's budget after voting for it. E.g. a starts a vote; afterwards b has a:9, c:10.
4. After the cluster has been stable for a while (e.g. 20 min), every budget is reset to the default.
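To make the normal flow concrete, here is a minimal sketch of the record one member would keep. It is not etcd code: `Ledger`, `NewLedger`, `RecordVote`, `Reset`, and `Budget` are hypothetical names, and the default of 10 is just the example value from above.

```go
package main

import "fmt"

// defaultBudget is the example starting budget from the proposal (assumed value).
const defaultBudget = 10

// Ledger is one member's local record of how much budget each peer has left.
type Ledger struct {
	budgets map[string]int
}

// NewLedger gives every listed peer the default budget.
func NewLedger(peers ...string) *Ledger {
	l := &Ledger{budgets: make(map[string]int)}
	for _, p := range peers {
		l.budgets[p] = defaultBudget
	}
	return l
}

// RecordVote charges one unit to the candidate this member just voted for.
func (l *Ledger) RecordVote(candidate string) {
	l.budgets[candidate]--
}

// Reset restores every peer to the default budget. The proposal does this
// once the cluster has been stable for a while (e.g. 20 min).
func (l *Ledger) Reset() {
	for p := range l.budgets {
		l.budgets[p] = defaultBudget
	}
}

// Budget returns the remaining budget recorded for a peer.
func (l *Ledger) Budget(peer string) int {
	return l.budgets[peer]
}

func main() {
	b := NewLedger("a", "c") // member b's view of its peers
	b.RecordVote("a")        // b voted for a
	fmt.Println(b.Budget("a"), b.Budget("c")) // prints "9 10"
	b.Reset()
	fmt.Println(b.Budget("a"), b.Budget("c")) // prints "10 10"
}
```

How "stable" is detected is left out on purpose; the sketch only shows the bookkeeping.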
h3. cluster with a bad guy
1. All members start with a default budget when they join the cluster (e.g. 10).
2. Every member keeps a record of the other members and their budgets (e.g. a:10, b:10, c:10, ...).
3. Every member reduces a candidate's budget after voting for it. E.g. a starts a vote; afterwards b has a:9, c:10.
4. a is a bad guy: it can't reach b and c, and it loses leadership shortly after winning it.
5. b or c starts an election and becomes leader.
6. a starts another election and becomes leader again in the next term.
7. Repeat #5 and #6 until the budget that b and c hold for a reaches -1.
8. a starts a leader election, but b and c's budget for a is -1, so they won't vote for it anymore. Every vote request from a also postpones the budget reset, so that a cannot get the leadership back.
Why is this needed?
etcd is built on mutual trust. Everything goes well in the common case, where the network is good, disk IO is good, etc., and all the followers follow the new leader in the new term.
But it can't survive the scenario where some member is unstable and makes the cluster thrash on leader election. Bringing in a trust mechanism is a good way to let etcd survive such scenarios.
Have you enabled the --pre-vote flag on etcd (on by default in v3.5)? It should prevent a faulty member from continuously forcing leader re-election. It works by adding a pre-election phase in which healthy members can reject a leader election if the cluster is healthy, so the faulty member's election request will be rejected.
@serathius thank you very much for the info. I wasn't aware of this feature. It fulfills our requirement according to the feature description:
- For instance, a flaky (or rejoining) member may drop in and out, and start a campaign. This member will end up with a higher term, and ignore all incoming messages with a lower term. In this case, a new leader eventually needs to get elected, which is disruptive to cluster availability. Raft implements a Pre-Vote phase to prevent this kind of disruption. If enabled, Raft runs an additional phase of election to check if the pre-candidate can get enough votes to win an election.