When a new node joins a network, it should be possible for this node to initialise its store and history from a given snapshot instead of replaying all transactions since genesis.
For now, the responsibility of passing a snapshot to a new joiner is on the operator(s) who will have to copy a snapshot produced by a primary to the new node. Then, the new node can be started with cchost ... join --snapshots-dir <snapshot_dir>, which will automatically fetch the snapshot from disk and resume from there.
TL;DR: A joiner should only be given a snapshot if the evidence for that snapshot has been endorsed by the node that generated that snapshot.
In the current model, snapshots are generated on the primary and given to (late) joiners via consensus messages. It is still not clear whether we will re-use the existing AppendEntries message for this or create a new message type, à la the InstallSnapshot message in the Raft paper. I lean towards the latter for now.
To provide auditability and blame, the evidence of this snapshot is emitted by the primary node. This evidence is a hash of the serialised snapshot, committed in a new ccf.snapshots table. See Snapshots should be generated at regular interval #1301 for more detail.
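As a rough illustration of what "evidence" means here, the sketch below hashes a serialised snapshot and checks a snapshot against previously committed evidence. The choice of SHA-256 and the function names `snapshot_evidence` and `matches_evidence` are assumptions made for the example; the issue does not specify the hash algorithm or any API.

```python
import hashlib


def snapshot_evidence(serialised_snapshot: bytes) -> str:
    """Return a hex digest used as evidence for a serialised snapshot.

    SHA-256 is an assumption for this sketch; the issue only says "a hash".
    """
    return hashlib.sha256(serialised_snapshot).hexdigest()


def matches_evidence(serialised_snapshot: bytes, committed_evidence: str) -> bool:
    """Check a snapshot against the evidence recorded for it."""
    return snapshot_evidence(serialised_snapshot) == committed_evidence
```

A joiner (or an auditor) could recompute the digest of the snapshot it was handed and compare it with the value recorded in the table, which is what makes the snapshot attributable to the node that produced it.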
A snapshot should be given to a late joiner only if the evidence for that snapshot has been endorsed via the primary's signature. In other words, a snapshot whose evidence has been applied at version N should only be served to a new joiner once a signature at version S has been emitted, with S > N.
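The S > N rule can be sketched as a small bookkeeping structure on the node that serves snapshots. `SnapshotRegistry` and its methods are hypothetical names for this example and do not correspond to anything in the codebase.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SnapshotRegistry:
    """Tracks snapshot evidence versions (N) and the latest signature version (S)."""

    evidence_versions: List[int] = field(default_factory=list)
    latest_signature_version: int = 0

    def record_evidence(self, version: int) -> None:
        # Evidence for a snapshot was applied at this version (N).
        self.evidence_versions.append(version)

    def record_signature(self, version: int) -> None:
        # A signature was emitted at this version (S).
        self.latest_signature_version = max(self.latest_signature_version, version)

    def latest_servable(self) -> Optional[int]:
        # Only snapshots whose evidence precedes a signature (S > N) may be served.
        endorsed = [n for n in self.evidence_versions if self.latest_signature_version > n]
        return max(endorsed) if endorsed else None
```

With evidence recorded at version 10, `latest_servable()` returns `None` until a signature at a version strictly greater than 10 has been recorded.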
It is still not clear whether a snapshot should be given to a late joiner only once a majority of backups have ack'ed its evidence (i.e. they have applied version N to their store). This would prevent the late joiner from being able to catch up if an election occurs while it is joining. This is somewhat related to Raft persistence bug #589, but it feels like the first implementation can omit this detail for now.
Upon receiving the snapshot, the late joiner should neither 1) serve read entries nor 2) count as part of the consensus quorum until it has received the signature S that endorses the snapshot evidence at N.
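On the joiner side, this amounts to a gate that stays closed until an endorsing signature arrives. The sketch below assumes a hypothetical `LateJoiner` class that is notified of each signature version it receives; none of these names come from the actual implementation.

```python
class LateJoiner:
    """Joiner that started from a snapshot whose evidence was applied at version N."""

    def __init__(self, evidence_version: int) -> None:
        self.evidence_version = evidence_version  # N
        self.endorsed = False

    def on_signature(self, signature_version: int) -> None:
        # A signature at S endorses the snapshot evidence only if S > N.
        if signature_version > self.evidence_version:
            self.endorsed = True

    def can_serve_reads(self) -> bool:
        return self.endorsed

    def counts_towards_quorum(self) -> bool:
        return self.endorsed
```

Both restrictions are lifted by the same event in this sketch, which matches the description above; a real implementation might track them separately.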
Edit: For now, the snapshot is passed to the new node "manually" by operators, so some of the points in this issue may no longer apply.