Expose to the operator clearly when retired nodes can be shut down #1713

Closed
achamayou opened this issue Oct 6, 2020 · 6 comments
Labels: 2.0.x (Short-term improvements to 2.0), enhancement, liveness, usability

@achamayou
Member

A node being removed may observe the commit of its removal before its successor network does. In some circumstances, when the number of nodes being removed is high enough to prevent the newly elected primary from establishing consensus over the committable suffix it inherits (as implemented in #1641), terminating those removed nodes too early could leave consensus permanently unable to make progress, requiring a catastrophic recovery.

If, instead, nodes that observe their retirement being committed redirect user requests to the new network from that point onwards, then an operator can simply wait for the status of the removal transaction to become COMMITTED on any node in either configuration, at which point the removed node(s) can be safely terminated.

This can be implemented by extending the removal hook with code that extracts the new configuration, and by having the node redirect all requests to one of the nodes in that configuration once it is in state RETIRED.

@achamayou achamayou added 0.16 and removed 0.15 labels Nov 18, 2020
@achamayou achamayou added 1.0 and removed 0.16 labels Feb 1, 2021
@achamayou achamayou self-assigned this Feb 1, 2021
@achamayou achamayou removed the 1.0 label Mar 10, 2021
@achamayou achamayou added the 2.0.x Short-term improvements to 2.0 label May 23, 2022
@achamayou achamayou changed the title from "Retired nodes should forward to new configuration after they witness their retirement being committed" to "Expose to the operator clearly when retired nodes can be shut down" Jun 17, 2022
@achamayou
Member Author

Following discussion with @heidihoward and @eddyashton, it is clear that this approach will not work.

Instead, it is necessary for a primary having executed a reconfiguration that removes one or more nodes to:

  1. wait for the reconfiguration to commit
  2. wait for the commit index to make progress again, so that any node that could win an election from then on has observed the reconfiguration being committed, and no longer relies on the old configuration to win its election

The consensus logic should keep track of this, and show an additional attribute on retired nodes indicating whether they can safely be removed.
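
As a rough illustration of the two steps above, here is a minimal Python sketch of the bookkeeping a primary could do. It is not CCF code: `RetirementTracker`, its fields and the `on_append`/`on_commit` callbacks are hypothetical names, and the model ignores term changes (it assumes the same primary stays in place throughout).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RetirementTracker:
    """Hypothetical state a primary keeps for one reconfiguration."""

    reconfig_seqno: int                   # seqno of the reconfiguration transaction
    commit_index: int = 0
    reconfig_commit_observed: bool = False
    followup_seqno: Optional[int] = None  # first entry appended after step 1

    def on_append(self, seqno: int) -> None:
        # Only an entry appended after the reconfiguration was seen to
        # commit counts as the follow-up transaction for step 2.
        if self.reconfig_commit_observed and self.followup_seqno is None:
            self.followup_seqno = seqno

    def on_commit(self, commit_index: int) -> None:
        self.commit_index = max(self.commit_index, commit_index)
        if self.commit_index >= self.reconfig_seqno:
            self.reconfig_commit_observed = True  # step 1 reached

    @property
    def retired_nodes_removable(self) -> bool:
        # Step 2: the follow-up entry itself has committed.
        return (
            self.followup_seqno is not None
            and self.commit_index >= self.followup_seqno
        )


tracker = RetirementTracker(reconfig_seqno=10)
tracker.on_append(10)
tracker.on_append(11)
tracker.on_commit(11)   # reconfiguration committed, but no follow-up yet
print(tracker.retired_nodes_removable)
tracker.on_append(12)   # appended after the commit was observed
tracker.on_commit(12)
print(tracker.retired_nodes_removable)
```

Note that entry 11 does not count as the follow-up in this sketch, because it was appended before the primary had seen the reconfiguration commit; the attribute only flips once a later entry commits.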

While typing this, I realise this directly conflicts with the current primary step down logic described under https://microsoft.github.io/CCF/main/architecture/consensus/1tx-reconfig.html#retirement-details, which disables transaction emission after the first transaction following a reconfiguration.

@heidihoward
Member

Just to elaborate further,

It is necessary for these two steps to happen in series. The primary must wait for the reconfiguration to be committed and thus for its local commit index to be greater than the reconfiguration transaction ID. Then the primary must commit a subsequent transaction such that any future leaders in the new configuration will have a commit index greater than the reconfiguration transaction ID. The primary then knows that it is ok to shut down removed nodes and can communicate this knowledge to the operator.

Aside: If I recall correctly (I might not), there was some discussion of whether an operator could look at the ledger and determine whether it was safe to shut down a node by checking for two committed signatures in the same term after the reconfiguration. On reflection, I do not think this approach would work, as it's not clear whether the signatures were committed by the primary of the term in question or by a subsequent primary. I think there needs to be some explicit information from the primary to say that it is OK to shut down the old configuration.

@heidihoward
Member

@achamayou regarding leader retirement, I think this does not conflict with the current scheme. A leader that is removing itself should not add any new transactions after the signature transaction which commits the reconfiguration. It is the responsibility of the next leader (in the new configuration) to add further transactions and finalize the removal of the old leader.

@achamayou
Member Author

@heidihoward ok, I see, I was mistakenly assuming that the retiring primary should also emit the signature, but you are right, there is no obstacle to the next primary executing a two signature sequence separately. The old primary just can't be safely removed until that's complete.

@heidihoward
Member

@achamayou I think either approach would work (and in fact, your approach would be quicker). It makes more sense to me to have the new leader finalize the removal as the 2nd signature need only be committed on the new configuration.

If the new leader learns from the old leader that the reconfiguration is committed then it only needs one signature, which is its first transaction, so it should not take long.

@heidihoward
Member

Having further discussed this issue with @eddyashton and @achamayou, we think a better approach might be to have the primary add an explicit "remove nodes" transaction once the primary observes that the reconfiguration is completed.

Here's what it might look like:

  1. A primary waits until the reconfiguration has been committed (thus its commit index has passed the reconfiguration transaction).
  2. If there are retired nodes waiting to be removed, then it updates the nodes' status to REMOVED with a 2nd reconfiguration transaction.
  3. Once this 2nd reconfiguration transaction has been committed, the nodes can be safely shut down (as all future leaders will have a commit index past the first reconfiguration transaction).
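
The three steps above could be modelled roughly as follows. This is a toy Python sketch under simplifying assumptions, not the CCF implementation: the `Primary` class, its in-memory `log`, and the `retire`/`on_commit` methods are all hypothetical, there is a single term, and replication is reduced to an externally driven commit index.

```python
from enum import Enum


class NodeStatus(Enum):
    TRUSTED = "Trusted"
    RETIRED = "Retired"
    REMOVED = "Removed"


class Primary:
    """Toy primary: the log is a list of {node_id: NodeStatus} updates."""

    def __init__(self, node_ids):
        self.committed_status = {n: NodeStatus.TRUSTED for n in node_ids}
        self.log = []               # entry at index i has seqno i + 1
        self.commit_index = 0
        self.awaiting_removal = {}  # seqno of RETIRED tx -> node ids
        self.removal_txs = {}       # seqno of REMOVED tx -> node ids
        self.safe_to_shut_down = set()

    def _append(self, updates):
        self.log.append(updates)
        return len(self.log)        # seqno of the new entry

    def retire(self, node_ids):
        # First reconfiguration transaction: mark the nodes RETIRED.
        seqno = self._append({n: NodeStatus.RETIRED for n in node_ids})
        self.awaiting_removal[seqno] = set(node_ids)
        return seqno

    def on_commit(self, commit_index):
        # Apply newly committed entries to the committed view.
        for seqno in range(self.commit_index + 1, commit_index + 1):
            self.committed_status.update(self.log[seqno - 1])
        self.commit_index = commit_index
        # Step 3: a committed REMOVED tx makes its nodes safe to shut down.
        for seqno in [s for s in self.removal_txs if s <= commit_index]:
            self.safe_to_shut_down |= self.removal_txs.pop(seqno)
        # Steps 1 and 2: once the retirement has committed, append the
        # 2nd reconfiguration transaction marking the nodes REMOVED.
        for seqno in [s for s in self.awaiting_removal if s <= commit_index]:
            node_ids = self.awaiting_removal.pop(seqno)
            removal_seqno = self._append(
                {n: NodeStatus.REMOVED for n in node_ids}
            )
            self.removal_txs[removal_seqno] = node_ids


primary = Primary(["n0", "n1", "n2"])
primary.retire(["n2"])
primary.on_commit(1)    # retirement committed, REMOVED tx appended
primary.on_commit(2)    # REMOVED tx committed
print(primary.committed_status["n2"], primary.safe_to_shut_down)
```

In this model the operator-visible signal is simply the committed status of the node: RETIRED means "do not shut down yet", REMOVED means the node can be terminated.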
