
[Design Proposal] Decommissioning and recommissioning a zone #3917

Open · gbbafna opened this issue Jul 15, 2022 · 2 comments
Labels: discuss, distributed framework, enhancement


gbbafna (Collaborator) commented Jul 15, 2022

Goal

This doc proposes a high-level and low-level design to support decommissioning and recommissioning a zone. For more info on the feature proposal, please refer to #3402.

Requirements

Functional

  1. Decommission abdicates the elected cluster-manager if it belongs to the decommissioned zone.
  2. Decommission removes nodes belonging to a specific attribute.
  3. Recommission allows the removed nodes to rejoin the cluster.

Non-Functional

  1. Minimal impact to ongoing requests - The constraints will ensure that decommissioned nodes have zero weights. We will ensure that no HTTP traffic is being received and all search traffic is drained before a node is decommissioned.
  2. Kill switch to ignore decommission - A system setting that will allow the decommissioned nodes to join back (a minimal sketch of such a setting follows this list).
  3. No stability impact in steady state - The decommissioned nodes should not send join requests aggressively; their retries should plateau so they don't overload the elected cluster manager.
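
As an illustration, the kill switch could be a dynamic cluster setting consulted during join validation. This is a minimal sketch assuming OpenSearch's `Setting` API; the setting key `cluster.decommission.ignore` and the wrapper class are hypothetical, not part of this proposal.

```java
import org.opensearch.common.settings.Setting;
import org.opensearch.common.settings.Setting.Property;

public final class DecommissionSettings {

    // Hypothetical kill switch: when true, join validation ignores the
    // decommission state and lets nodes from a decommissioned zone rejoin.
    public static final Setting<Boolean> DECOMMISSION_IGNORE_SETTING =
        Setting.boolSetting(
            "cluster.decommission.ignore",  // illustrative key, not finalized
            false,                          // decommission is honored by default
            Property.Dynamic,               // can be flipped at runtime
            Property.NodeScope
        );
}
```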

Approach

The high-level approach is to do all the heavy lifting in the decommission API and then not let the nodes join back until the zone is recommissioned.

Breaking down the steps in decommission:

  1. APIs to store decommission info in the cluster state.

  2. It will also abdicate the elected cluster-manager node if it is in the decommissioned zone.

  3. It will remove all the nodes in the decommissioned zones.

    1. It will collect stats from the data/cluster-manager nodes and make sure the REST request count is 0 before kicking a node out of the cluster.
    2. Make sure that removing nodes doesn't lead to quorum loss.
  4. We add one more join validator here to reject joins coming from nodes in a decommissioned zone while the kill switch for decommission is off (see the sketch after this list).

  5. Nodes in a decommissioned zone will not send join requests aggressively on getting a DecommissionedException; they will back off their retries instead of storming the cluster manager.
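
To make step 4 concrete, here is a minimal sketch of the extra join validator, modeled on the `BiConsumer<DiscoveryNode, ClusterState>` shape used for join validators. The `zone` attribute key, the constructor inputs, and the exception type are illustrative assumptions; in the actual design the decommissioned zones would come from the decommission info stored in the cluster state (step 1), and a dedicated `DecommissionedException` would signal the joining node to back off.

```java
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BooleanSupplier;
import org.opensearch.OpenSearchException;
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.node.DiscoveryNode;

// Sketch of the extra join validator from step 4 (names are illustrative).
public class DecommissionJoinValidator implements BiConsumer<DiscoveryNode, ClusterState> {

    private final Set<String> decommissionedZones;   // read from cluster state in practice
    private final BooleanSupplier killSwitchEnabled; // backed by the dynamic setting

    public DecommissionJoinValidator(Set<String> zones, BooleanSupplier killSwitch) {
        this.decommissionedZones = zones;
        this.killSwitchEnabled = killSwitch;
    }

    @Override
    public void accept(DiscoveryNode joiningNode, ClusterState state) {
        if (killSwitchEnabled.getAsBoolean()) {
            return; // kill switch on: let decommissioned nodes join back
        }
        String zone = joiningNode.getAttributes().get("zone");
        if (zone != null && decommissionedZones.contains(zone)) {
            throw new OpenSearchException(
                "node [" + joiningNode.getName() + "] belongs to decommissioned zone [" + zone + "]");
        }
    }
}
```

Throwing a dedicated exception type here (rather than a generic one) is what would let the rejected node distinguish "decommissioned" from transient join failures and plateau its retries, as described in step 5.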

How the cluster manager abdicates itself

Abdication to another node does not guarantee a change of cluster manager: the same node can become cluster manager again right after abdicating. This is due to the distributed nature of the cluster manager election algorithm, which makes the newly elected cluster manager unpredictable.

To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.
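
For illustration, this flow could reuse the existing voting-config exclusion transport actions. The sketch below assumes the node-description based `AddVotingConfigExclusionsRequest` constructor is available as in current OpenSearch; the wrapper class and method names are hypothetical, and error handling is elided.

```java
import org.opensearch.action.ActionListener;
import org.opensearch.action.admin.cluster.configuration.AddVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.configuration.AddVotingConfigExclusionsRequest;
import org.opensearch.action.admin.cluster.configuration.ClearVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.configuration.ClearVotingConfigExclusionsRequest;
import org.opensearch.client.node.NodeClient;

// Hypothetical helper showing the abdication flow described above.
public class ZoneAbdicationHelper {

    private final NodeClient client;

    public ZoneAbdicationHelper(NodeClient client) {
        this.client = client;
    }

    // Exclude the zone's cluster-manager-eligible nodes from voting, which
    // forces an elected cluster manager in that zone to step down.
    public void excludeZoneFromVoting(String[] nodeNamesInZone) {
        // Exact constructor signature is an assumption.
        AddVotingConfigExclusionsRequest request =
            new AddVotingConfigExclusionsRequest(nodeNamesInZone);
        client.execute(AddVotingConfigExclusionsAction.INSTANCE, request,
            ActionListener.wrap(
                response -> { /* zone can no longer win an election */ },
                exception -> { /* retry or surface the failure */ }));
    }

    // Invoked by the newly elected cluster manager to clean up the
    // short-lived exclusions once decommission has completed.
    public void clearVotingExclusions() {
        client.execute(ClearVotingConfigExclusionsAction.INSTANCE,
            new ClearVotingConfigExclusionsRequest(),
            ActionListener.wrap(
                response -> { /* exclusions removed */ },
                exception -> { /* log and retry */ }));
    }
}
```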

API Design

More about this is mentioned here.

shwetathareja (Member) commented:

> To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.

Will the decommissioned node be added to the voting exclusion explicitly from outside, or will it be taken care of internally? If it is called from outside, then atomicity can't be guaranteed and the system is not self-sufficient to take care of decommissioning without external API support.

gbbafna (Collaborator, Author) commented Aug 2, 2022

> > To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.
>
> Will the decommissioned node be added to the voting exclusion explicitly from outside, or will it be taken care of internally? If it is called from outside, then atomicity can't be guaranteed and the system is not self-sufficient to take care of decommissioning without external API support.

Thanks Shweta. Good point. We would invoke TransportAddVotingConfigExclusionsAction internally within our TransportDecommissionAction. Atomicity would be taken care of inside the TransportDecommissionAction, which would wait for the new cluster manager to get elected. A rough sketch of that wait is below.
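
For concreteness, the wait could be built on `ClusterStateObserver`, reacting to the first cluster state whose elected cluster manager is outside the excluded set. This is a sketch under stated assumptions, not the final implementation: TransportDecommissionAction is still the proposed name, `proceedWithNodeRemoval` is a hypothetical next step, and the timeout value is arbitrary.

```java
import java.util.Set;
import java.util.function.Predicate;
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.ClusterStateObserver;
import org.opensearch.common.unit.TimeValue;

// Sketch: inside the (proposed) TransportDecommissionAction, after the
// voting exclusions have been added internally, wait until a node outside
// the decommissioned zone is elected before removing any nodes.
class ClusterManagerAbdicationWaiter {

    void waitForNewClusterManager(ClusterStateObserver observer, Set<String> excludedNodeIds) {
        // Satisfied once an elected cluster manager exists and it is not
        // one of the excluded (decommissioned zone) nodes.
        Predicate<ClusterState> newClusterManagerElected = state ->
            state.nodes().getMasterNodeId() != null
                && excludedNodeIds.contains(state.nodes().getMasterNodeId()) == false;

        observer.waitForNextChange(new ClusterStateObserver.Listener() {
            @Override
            public void onNewClusterState(ClusterState state) {
                proceedWithNodeRemoval(state); // hypothetical next decommission step
            }

            @Override
            public void onClusterServiceClose() {
                // local node is shutting down; abort the decommission request
            }

            @Override
            public void onTimeout(TimeValue timeout) {
                // no new cluster manager in time; fail the API call so the
                // operator can retry
            }
        }, newClusterManagerElected, TimeValue.timeValueSeconds(30));
    }

    void proceedWithNodeRemoval(ClusterState state) {
        // placeholder for steps 3.x of the proposal
    }
}
```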
