From 800c45336d0306be212c90ac8f4f548b1802da67 Mon Sep 17 00:00:00 2001
From: Antonio Filipovic <61245998+antoniofilipovic@users.noreply.github.com>
Date: Mon, 1 Jul 2024 14:28:49 +0200
Subject: [PATCH] Add callout for important force reset notice (#877)

* add callout for instance to be alive on force reset

* Update high-availability.mdx
---
 pages/clustering/high-availability.mdx | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/pages/clustering/high-availability.mdx b/pages/clustering/high-availability.mdx
index 2883f72e7..ac2113316 100644
--- a/pages/clustering/high-availability.mdx
+++ b/pages/clustering/high-availability.mdx
@@ -361,10 +361,16 @@ the cluster enters a state of force reset where the cluster is reset to the stat
 The leader coordinator executes a force reset of the cluster if the action isn't fully complete.
 Failure can happen anywhere, i.e. in the case of setting instance to MAIN, the RPC request to a REPLICA instance to promote itself to MAIN can succeed, but writing to the Raft log that the instance was promoted can fail.
 
-Force reset includes demoting every alive instance to REPLICA, and executing the failover procedure once again. Such a procedure is needed as currently cluster doesn't track
+Force reset includes demoting every alive instance to REPLICA, and executing the failover procedure once again. Such a procedure is needed as of this moment cluster doesn't track
 where the action failed exactly, but only whether it fully succeded. Raft log is taken as a source of truth at all times.
 
 In case the leader coordinator dies while executing the force reset, the next coordinator which is elected as the leader, will continue executing the force reset. Action is executed until it succeeds.
+
+
+It is important to note that if an action fails and all instances are down, the leader will attempt to execute a force reset until one instance is promoted to MAIN. Until then, no other actions are allowed on the cluster.
+
+
+
 
 If an instance is down at the point of force reset, the leader coordinator writes in the Raft log that the instance needs to be demoted to REPLICA once it comes back up.
 If all instances are down at the point of force reset, the action won't succeed as a new MAIN instance can't be chosen.
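
Reviewer note: the force reset behaviour described in this hunk can be pictured with a small toy model. The sketch below is only an illustration under stated assumptions, not the coordinator's actual implementation. `Instance`, `Cluster`, `force_reset_once`, `force_reset`, and the log entry names are all hypothetical, and the `max_attempts` bound is added only so the example terminates (the text says the real action is retried until it succeeds).

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    name: str
    alive: bool
    role: str = "REPLICA"


@dataclass
class Cluster:
    instances: list
    # Hypothetical stand-in for the Raft log: a plain list of tuples.
    raft_log: list = field(default_factory=list)


def force_reset_once(cluster):
    """Run one force-reset attempt; return True once a MAIN has been chosen."""
    for inst in cluster.instances:
        if inst.alive:
            # Every alive instance is demoted to REPLICA first.
            inst.role = "REPLICA"
        else:
            # A down instance is recorded so it can be demoted when it returns.
            entry = ("demote_on_recovery", inst.name)
            if entry not in cluster.raft_log:
                cluster.raft_log.append(entry)

    alive = [inst for inst in cluster.instances if inst.alive]
    if not alive:
        # With every instance down, no new MAIN can be chosen.
        return False

    # Placeholder for the failover procedure: pick the first alive instance.
    alive[0].role = "MAIN"
    cluster.raft_log.append(("set_main", alive[0].name))
    return True


def force_reset(cluster, max_attempts=3):
    """Retry the force reset; bounded here only to keep the sketch finite."""
    for _ in range(max_attempts):
        if force_reset_once(cluster):
            return True
    return False
```

In this model, a cluster with at least one alive instance ends up with exactly one MAIN, while a cluster whose instances are all down keeps failing the reset. That second case is the situation the new callout warns about.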