Scenario with 3 nodes (A), (B), (C)
Node (A) starts as primary and (B) and (C) are happy followers
We are at term 2
They all compact at index 28, so their commit_idx is 28
They all proceed to index 32
Node (C) is suspended
Node (A) sends a signature at 33
Node (B) receives the signature at index 33 and responds to (A) with success
Node (A) receives the response from (B) and therefore compacts at index 33
Node (A) is suspended before it sends the next index to (B), so (B) never compacts at 33
Node (C) wakes up and holds an election with node (B). Both roll back to their latest commit_idx, which is 28, and proceed to process requests at term 3, increasing their index.
Nodes (B) and (C) proceed to compact at an index > 28 (e.g. 34) in term 3
Node (A) wakes up having compacted index 33 in term 2, whereas nodes (B) and (C) have compacted index 33 in term 3
Node (A) can never catch up
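The steps above can be replayed in a small toy model to make the divergence concrete. This is only a sketch under stated assumptions: the `Node` class, its methods, and the `can_catch_up` check are hypothetical names invented for illustration and do not correspond to the project's actual code. It assumes a compacted prefix can never be rolled back, and that a node can only follow a leader whose log agrees with that prefix.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy replica (hypothetical model, not the real implementation)."""
    name: str
    commit_idx: int = 0                      # highest compacted index
    log: dict = field(default_factory=dict)  # index -> term that wrote it

    def append(self, term, upto):
        # Extend the log with entries written in `term`, up to index `upto`.
        start = max(self.log, default=0) + 1
        for idx in range(start, upto + 1):
            self.log[idx] = term

    def compact(self, idx):
        # Assumption: compaction is irreversible.
        self.commit_idx = idx

    def rollback(self):
        # Drop every entry above the latest compacted index.
        for idx in [i for i in self.log if i > self.commit_idx]:
            del self.log[idx]

    def can_catch_up(self, leader):
        # The compacted prefix must match the leader's log entry by entry.
        return all(leader.log.get(i) == t
                   for i, t in self.log.items() if i <= self.commit_idx)


a, b, c = Node("A"), Node("B"), Node("C")
for n in (a, b, c):
    n.append(term=2, upto=28)
    n.compact(28)
    n.append(term=2, upto=32)

# (C) is suspended; (A) sends the signature at 33 and (B) acknowledges it.
a.append(term=2, upto=33)
b.append(term=2, upto=33)
a.compact(33)  # (A) compacts, then is suspended before telling (B)

# (B) and (C) hold an election, roll back to 28, and continue in term 3.
for n in (b, c):
    n.rollback()
    n.append(term=3, upto=34)
    n.compact(34)

print(a.log[33], b.log[33])  # term recorded at index 33 on (A) and on (B)
print(a.can_catch_up(b))     # whether (A)'s compacted prefix matches (B)
```

In this model (A)'s compacted prefix already disagrees with (B) and (C) from index 29 onward, because the rollback to 28 discarded everything (A) later compacted.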
I think this is equivalent to a late join with an invalid ledger suffix, and our logic should be the same in both cases.
The only alternative I can think of is to add logic to delay compaction until commits are committed according to consensus rules, and that sounds like a can of worms.