Bring a leader election budget mechanism to etcd leader election to stabilize cluster availability and reliability #17326

armstrongli · 2024-01-26T05:52:59Z

What would you like to be added?

Bring a leader election budget mechanism to etcd leader election.

Goal: a member with a negative budget receives no votes for a configured period.

h2. overall workflow:

h3. normal flow:

1. Every member gets a default budget when it joins the cluster (e.g. 10).
2. Every member keeps a record of the other members and their budgets (e.g. a:10, b:10, c:10, ...).
3. Every member reduces a candidate's budget after voting for it. E.g. a starts a vote; afterwards b has a:9, c:10.
4. After the cluster has been stable for a while (e.g. 20 min), budgets are reset to the default budget.

h3. cluster with a bad guy

1. Every member gets a default budget when it joins the cluster (e.g. 10).
2. Every member keeps a record of the other members and their budgets (e.g. a:10, b:10, c:10, ...).
3. Every member reduces a candidate's budget after voting for it. E.g. a starts a vote; afterwards b has a:9, c:10.
4. a is a bad guy: it can't reach b and c, and loses leadership soon after gaining it.
5. b or c starts an election and becomes leader.
6. a starts another election and becomes leader in the next term.
7. Steps 5 and 6 repeat until:
8. a starts a leader election while b's and c's budget for a is -1. They won't vote for it anymore, and every vote request from a postpones the budget reset, so a cannot become leader again.
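
The bad-guy flow can be simulated in a few lines. This is only a sketch under assumptions: `voter`, `onVoteRequest`, `tick` and `campaign` are hypothetical names, time is modeled as abstract ticks (`resetTicks` stands in for the 20-minute period), and the cluster is fixed at three members.

```go
package main

import "fmt"

const (
	defaultBudget = 10 // example value from the proposal
	resetTicks    = 20 // stands in for the 20-minute quiet period
)

// voter is a hypothetical follower-side view: remaining budget per candidate
// plus a counter of quiet ticks since the last vote request.
type voter struct {
	budget map[string]int
	quiet  int
}

func newVoter() *voter { return &voter{budget: map[string]int{}} }

// onVoteRequest reports whether the vote is granted. Any request, granted or
// refused, restarts the quiet period, so a flapping member cannot earn a reset.
func (v *voter) onVoteRequest(candidate string) bool {
	v.quiet = 0
	if _, seen := v.budget[candidate]; !seen {
		v.budget[candidate] = defaultBudget
	}
	if v.budget[candidate] < 0 {
		return false // negative budget: refuse to vote
	}
	v.budget[candidate]--
	return true
}

// tick advances time by one unit and resets all budgets after a quiet period.
func (v *voter) tick() {
	v.quiet++
	if v.quiet >= resetTicks {
		for c := range v.budget {
			v.budget[c] = defaultBudget
		}
		v.quiet = 0
	}
}

// campaign reports whether candidate wins a 3-member election: its own vote
// plus at least one grant from the two other members.
func campaign(candidate string, others ...*voter) bool {
	votes := 1
	for _, o := range others {
		if o.onVoteRequest(candidate) {
			votes++
		}
	}
	return votes >= 2
}

func main() {
	b, c := newVoter(), newVoter()
	wins := 0
	for campaign("a", b, c) {
		wins++
	}
	fmt.Println("elections a won before being refused:", wins) // 11
	fmt.Println("b's budget for a:", b.budget["a"])            // -1
}
```

With a default budget of 10, a is voted for 11 times (budget 10 down to 0) and is refused once the budget reads -1, which matches step 8 read literally.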

Why is this needed?

etcd is built on mutual trust. Everything goes well in the common case, where the network is good, disk I/O is good, etc., and all followers follow the new leader in the new term.

But it can't survive a scenario where some member is unstable and makes the cluster thrash on leader election. Bringing in a trust mechanism is a good way to let etcd survive such scenarios.

serathius · 2024-01-26T09:26:10Z

Have you enabled the --pre-vote flag on etcd (default in v3.5)? It should prevent a faulty member from continuously forcing leader re-election. It works by adding an additional pre-election phase, in which healthy members can reject a leader election if the cluster is healthy. So a faulty member's request for an election will be rejected.

armstrongli · 2024-01-26T09:33:25Z

We don't have the flag enabled. I'll take a look at this option and investigate.

armstrongli · 2024-01-26T09:35:41Z

@serathius thank you very much for the info. I wasn't aware of this feature. According to the feature description, it fulfills our requirement.

- For instance, a flaky(or rejoining) member may drop in and out, and start campaign. This member will end up with a higher term, and ignore all incoming messages with lower term. In this case, a new leader eventually need to get elected, thus disruptive to cluster availability. Raft implements Pre-Vote phase to prevent this kind of disruptions. If enabled, Raft runs an additional phase of election to check if pre-candidate can get enough votes to win an election.
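
As a rough illustration of the Pre-Vote idea quoted above, the sketch below shows why a cut-off member cannot push its term up. It is not the etcd raft implementation: `member`, `grantPreVote`, `preCampaign` and `electionTimeout` are hypothetical, and the log check compares only the last index, where real Raft also compares the last log term.

```go
package main

import "fmt"

const electionTimeout = 10 // ticks; illustrative value

// member is a simplified, hypothetical Raft node.
type member struct {
	term             uint64
	lastLogIndex     uint64
	ticksSinceLeader int // ticks since the last heartbeat from a leader
}

// grantPreVote grants a pre-vote only if this member has itself lost contact
// with the leader and the candidate's log is at least as long as its own.
func (m *member) grantPreVote(proposedTerm, candLastIndex uint64) bool {
	leaderAlive := m.ticksSinceLeader < electionTimeout
	return !leaderAlive && proposedTerm > m.term && candLastIndex >= m.lastLogIndex
}

// preCampaign asks peers for pre-votes at term+1 without changing cand.term.
// Only when a majority agrees does the candidate bump its term, at which
// point a real election would start.
func preCampaign(cand *member, peers []*member) bool {
	votes := 1 // the candidate's own pre-vote
	for _, p := range peers {
		if p.grantPreVote(cand.term+1, cand.lastLogIndex) {
			votes++
		}
	}
	if votes*2 <= len(peers)+1 {
		return false
	}
	cand.term++
	return true
}

func main() {
	a := &member{term: 5, lastLogIndex: 40, ticksSinceLeader: 50} // flaky, cut off
	b := &member{term: 5, lastLogIndex: 42, ticksSinceLeader: 1}  // healthy
	c := &member{term: 5, lastLogIndex: 42, ticksSinceLeader: 1}  // healthy

	for i := 0; i < 3; i++ {
		won := preCampaign(a, []*member{b, c})
		fmt.Println("a pre-vote won:", won, "term:", a.term) // false, term stays 5
	}

	// The leader really goes away: b and c stop hearing heartbeats.
	b.ticksSinceLeader, c.ticksSinceLeader = electionTimeout, electionTimeout
	won := preCampaign(b, []*member{a, c})
	fmt.Println("b pre-vote won:", won, "term:", b.term) // true, term becomes 6
}
```

Because a's term never rises while b and c still hear from the leader, a never returns with a higher term that would force the healthy members into a new election.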

armstrongli added the type/feature label Jan 26, 2024

armstrongli closed this as completed Jan 26, 2024

serathius mentioned this issue Jan 26, 2024

Document pre-vote flag #17328

Closed
