
[Design Proposal] Decommissioning and recommissioning a zone #3917

Open · gbbafna opened this issue Jul 15, 2022 · 2 comments
Labels: discuss, distributed framework, enhancement


gbbafna (Collaborator) commented Jul 15, 2022

Goal

This doc proposes a high-level and low-level design to support decommissioning and recommissioning a zone. For more info on the feature proposal, please refer to #3402.

Requirements

Functional

  1. Decommission abdicates the elected cluster-manager if it belongs to the decommissioned zone.
  2. Decommission removes nodes belonging to a specific attribute.
  3. Recommission allows the removed nodes to rejoin the cluster.

Non-Functional

  1. Minimal impact to ongoing requests - The constraints will ensure that decommissioned nodes have zero weights. We will ensure that no HTTP traffic is being received and all search traffic is drained before a node is decommissioned.
  2. Kill switch to ignore decommission - A system setting that will allow the decommissioned nodes to join back (a minimal sketch of such a setting follows this list).
  3. No stability impact in steady state - The decommissioned nodes should not send join requests aggressively; their retries should plateau so they don't overload the elected cluster manager.
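
As an illustration, the kill switch could be a dynamic cluster setting consulted during join validation. This is a minimal sketch assuming OpenSearch's `Setting` API; the setting key `cluster.decommission.ignore` and the wrapper class are hypothetical, not part of this proposal.

```java
import org.opensearch.common.settings.Setting;
import org.opensearch.common.settings.Setting.Property;

public final class DecommissionSettings {

    // Hypothetical kill switch: when true, join validation ignores the
    // decommission state and lets nodes from a decommissioned zone rejoin.
    public static final Setting<Boolean> DECOMMISSION_IGNORE_SETTING =
        Setting.boolSetting(
            "cluster.decommission.ignore",  // illustrative key, not finalized
            false,                          // decommission is honored by default
            Property.Dynamic,               // can be flipped at runtime
            Property.NodeScope
        );
}
```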

Approach

The high-level approach is to do all the heavy lifting in the decommission API and then not let the nodes join back until the zone is recommissioned.

Breaking down the steps in decommission:

  1. APIs to store decommission info in the cluster state.

  2. It will also abdicate the elected cluster-manager node if it is in the decommissioned zone.

  3. It will remove all the nodes in the decommissioned zones.

    1. It will collect stats from the data/cluster-manager nodes and make sure the REST request count is 0 before kicking a node out of the cluster.
    2. Make sure that removing nodes doesn't lead to quorum loss.
  4. We add one more join validator here to reject joins coming from nodes in a decommissioned zone while the kill switch for decommission is off (see the sketch after this list).

  5. Nodes in a decommissioned zone will not send join requests aggressively on getting a DecommissionedException; they will back off their retries instead of storming the cluster manager.
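
To make step 4 concrete, here is a minimal sketch of the extra join validator, modeled on the `BiConsumer<DiscoveryNode, ClusterState>` shape used for join validators. The `zone` attribute key, the constructor inputs, and the exception type are illustrative assumptions; in the actual design the decommissioned zones would come from the decommission info stored in the cluster state (step 1), and a dedicated `DecommissionedException` would signal the joining node to back off.

```java
import java.util.Set;
import java.util.function.BiConsumer;
import java.util.function.BooleanSupplier;
import org.opensearch.OpenSearchException;
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.node.DiscoveryNode;

// Sketch of the extra join validator from step 4 (names are illustrative).
public class DecommissionJoinValidator implements BiConsumer<DiscoveryNode, ClusterState> {

    private final Set<String> decommissionedZones;   // read from cluster state in practice
    private final BooleanSupplier killSwitchEnabled; // backed by the dynamic setting

    public DecommissionJoinValidator(Set<String> zones, BooleanSupplier killSwitch) {
        this.decommissionedZones = zones;
        this.killSwitchEnabled = killSwitch;
    }

    @Override
    public void accept(DiscoveryNode joiningNode, ClusterState state) {
        if (killSwitchEnabled.getAsBoolean()) {
            return; // kill switch on: let decommissioned nodes join back
        }
        String zone = joiningNode.getAttributes().get("zone");
        if (zone != null && decommissionedZones.contains(zone)) {
            throw new OpenSearchException(
                "node [" + joiningNode.getName() + "] belongs to decommissioned zone [" + zone + "]");
        }
    }
}
```

Throwing a dedicated exception type here (rather than a generic one) is what would let the rejected node distinguish "decommissioned" from transient join failures and plateau its retries, as described in step 5.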

How the cluster manager abdicates itself

Abdication to another node does not guarantee a change of cluster manager: the same node can become cluster manager again right after abdicating. This is due to the distributed nature of the cluster manager election algorithm, which makes the newly elected cluster manager unpredictable.

To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.
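
For illustration, this flow could reuse the existing voting-config exclusion transport actions. The sketch below assumes the node-description based `AddVotingConfigExclusionsRequest` constructor is available as in current OpenSearch; the wrapper class and method names are hypothetical, and error handling is elided.

```java
import org.opensearch.action.ActionListener;
import org.opensearch.action.admin.cluster.configuration.AddVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.configuration.AddVotingConfigExclusionsRequest;
import org.opensearch.action.admin.cluster.configuration.ClearVotingConfigExclusionsAction;
import org.opensearch.action.admin.cluster.configuration.ClearVotingConfigExclusionsRequest;
import org.opensearch.client.node.NodeClient;

// Hypothetical helper showing the abdication flow described above.
public class ZoneAbdicationHelper {

    private final NodeClient client;

    public ZoneAbdicationHelper(NodeClient client) {
        this.client = client;
    }

    // Exclude the zone's cluster-manager-eligible nodes from voting, which
    // forces an elected cluster manager in that zone to step down.
    public void excludeZoneFromVoting(String[] nodeNamesInZone) {
        // Exact constructor signature is an assumption.
        AddVotingConfigExclusionsRequest request =
            new AddVotingConfigExclusionsRequest(nodeNamesInZone);
        client.execute(AddVotingConfigExclusionsAction.INSTANCE, request,
            ActionListener.wrap(
                response -> { /* zone can no longer win an election */ },
                exception -> { /* retry or surface the failure */ }));
    }

    // Invoked by the newly elected cluster manager to clean up the
    // short-lived exclusions once decommission has completed.
    public void clearVotingExclusions() {
        client.execute(ClearVotingConfigExclusionsAction.INSTANCE,
            new ClearVotingConfigExclusionsRequest(),
            ActionListener.wrap(
                response -> { /* exclusions removed */ },
                exception -> { /* log and retry */ }));
    }
}
```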

API Design

More about this is mentioned here.

shwetathareja (Member) commented:

> To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.

Will the decommissioned node be added to the voting exclusion explicitly from outside, or will it be taken care of internally? If it is called from outside, then atomicity can't be guaranteed and the system is not self-sufficient to take care of decommissioning without external API support.

gbbafna (Collaborator, Author) commented Aug 2, 2022

> > To get around this, we propose to temporarily put the node in the voting exclusion configuration. After the new cluster manager takes over, it will remove all the nodes in the decommissioned zone from the voting exclusion configuration. This might also remove nodes that were previously excluded for other reasons, but since the voting exclusion configuration is supposed to be short-lived, this behavior should be acceptable.
>
> Will the decommissioned node be added to the voting exclusion explicitly from outside, or will it be taken care of internally? If it is called from outside, then atomicity can't be guaranteed and the system is not self-sufficient to take care of decommissioning without external API support.

Thanks Shweta. Good point. We would invoke TransportAddVotingConfigExclusionsAction internally within our TransportDecommissionAction. Atomicity would be taken care of inside the TransportDecommissionAction, which would wait for the new cluster manager to get elected. A rough sketch of that wait is below.
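
For concreteness, the wait could be built on `ClusterStateObserver`, reacting to the first cluster state whose elected cluster manager is outside the excluded set. This is a sketch under stated assumptions, not the final implementation: TransportDecommissionAction is still the proposed name, `proceedWithNodeRemoval` is a hypothetical next step, and the timeout value is arbitrary.

```java
import java.util.Set;
import java.util.function.Predicate;
import org.opensearch.cluster.ClusterState;
import org.opensearch.cluster.ClusterStateObserver;
import org.opensearch.common.unit.TimeValue;

// Sketch: inside the (proposed) TransportDecommissionAction, after the
// voting exclusions have been added internally, wait until a node outside
// the decommissioned zone is elected before removing any nodes.
class ClusterManagerAbdicationWaiter {

    void waitForNewClusterManager(ClusterStateObserver observer, Set<String> excludedNodeIds) {
        // Satisfied once an elected cluster manager exists and it is not
        // one of the excluded (decommissioned zone) nodes.
        Predicate<ClusterState> newClusterManagerElected = state ->
            state.nodes().getMasterNodeId() != null
                && excludedNodeIds.contains(state.nodes().getMasterNodeId()) == false;

        observer.waitForNextChange(new ClusterStateObserver.Listener() {
            @Override
            public void onNewClusterState(ClusterState state) {
                proceedWithNodeRemoval(state); // hypothetical next decommission step
            }

            @Override
            public void onClusterServiceClose() {
                // local node is shutting down; abort the decommission request
            }

            @Override
            public void onTimeout(TimeValue timeout) {
                // no new cluster manager in time; fail the API call so the
                // operator can retry
            }
        }, newClusterManagerElected, TimeValue.timeValueSeconds(30));
    }

    void proceedWithNodeRemoval(ClusterState state) {
        // placeholder for steps 3.x of the proposal
    }
}
```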
