NRG (2.11): Completeness/consistency after leader changes #6485

MauriceVanVeen · 2025-02-10T10:15:38Z

Previously #6194 implemented a way to wait for entries that are stored in the new leader's log but not yet applied, before allowing 'expected per subject' operations to go through. This protected against KV/stream desync.

That protection now extends to the whole of (clustered) JetStream rather than to this one operation: if the new leader recognizes it has entries in its log that have not yet been applied, it waits for those to be applied before responding to any read/write operations. In essence this is the 'Leader Completeness Property' as described by the Raft paper. It also brings us closer to 'read-your-writes' when reads are only requested from the leader.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

neilalexander · 2025-02-10T14:04:53Z

server/raft.go

+		if n.pindex > n.applied {
+			n.aflr = n.pindex
+		} else {
+			n.aflr = 0


It's not entirely clear what 0 means here. I know the comment above says signalling is disabled, but does that imply this doesn't always track upwards with applied?

We only need to track this if we have a log with entries that are not applied. If all our entries are applied, we can signal immediately. Resetting n.aflr is just for sanity, so we can't signal leader twice.

Have added this comment:

// We know we have applied all entries in our log and can signal immediately.
// For sanity reset applied floor back down to 0, so we aren't able to signal twice.
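
To make the role of the applied floor concrete, here is a minimal sketch of the behaviour discussed in this thread. The field names `pindex`, `applied` and `aflr` come from the diff; the `node` type, the `signaled` counter and the `becomeLeader`, `onApplied` and `ready` functions are hypothetical names chosen for illustration and are not the code in server/raft.go.

```go
package main

// node is a hypothetical, heavily simplified stand-in for a Raft node.
// Only pindex, applied and aflr are taken from the diff; everything else
// is an assumption made for this sketch.
type node struct {
	pindex   uint64 // index of the last entry stored in our log
	applied  uint64 // index of the last entry applied by the upper layer
	aflr     uint64 // applied floor; 0 means there is nothing to wait for
	signaled int    // number of times leadership was signaled
}

// becomeLeader records the applied floor when this node wins an election.
func (n *node) becomeLeader() {
	if n.pindex > n.applied {
		// Unapplied entries exist: hold the leader signal until they are applied.
		n.aflr = n.pindex
		return
	}
	// Everything in the log is applied, so signal immediately and keep the
	// floor at 0 so a later apply cannot signal a second time.
	n.aflr = 0
	n.signaled++
}

// onApplied is called as the upper layer reports applied entries.
func (n *node) onApplied(index uint64) {
	if index > n.applied {
		n.applied = index
	}
	// Signal exactly once, when the floor is first reached, then reset it.
	if n.aflr > 0 && n.applied >= n.aflr {
		n.aflr = 0
		n.signaled++
	}
}

// ready reports whether the leader may answer read/write requests.
func (n *node) ready() bool {
	return n.signaled > 0
}
```

In this sketch a node elected with `pindex` 5 and `applied` 3 is not `ready` until `onApplied(5)` is called, and later applies do not signal again because the floor has been reset to 0.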

neilalexander

LGTM

Reverts #6415 and #6426.

If deduplication were deferred to a replica, and that replica was down for at least 'dupe window+5s', it would clean up its dedupe map. A message that was meant to be a duplicate could then be passed as a genuine message, resulting in stream desync.

Instead, the dedupe map should be cleared of any staged zero-sequences. That was not possible before, as a new leader would not always be fully up-to-date when it started responding to new write requests, which could also result in duplicate messages. By relying on the 'Leader Completeness Property' implemented in #6485, we can now confidently clear the dedupe map of any staged zero-sequences (knowing they were not proposed). This ensures both that there is no desync and that a failed proposal for a message does not block subsequent messages with the same dedupe ID.

Have left the commits as separate, to ease reviewing.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
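
As a rough illustration of clearing staged zero-sequences, the sketch below assumes a dedupe map keyed by message ID in which a sequence of 0 marks an ID that was staged but never applied. The `dedupeEntry` type and the `clearStaged` and `isDuplicate` functions are hypothetical names for this example and do not reflect the actual server code.

```go
package main

// dedupeEntry is a hypothetical record for one message ID in the dedupe window.
// seq == 0 marks an ID that was staged (in flight) but not yet applied;
// a non-zero seq is the stream sequence of the stored message.
type dedupeEntry struct {
	id  string
	seq uint64
}

// clearStaged removes every staged (zero-sequence) ID from the map and
// returns how many entries were dropped. It is meant to run on a leader
// change, once the new leader is known to have applied its whole log.
func clearStaged(dd map[string]*dedupeEntry) int {
	removed := 0
	for id, e := range dd {
		if e.seq == 0 {
			delete(dd, id) // deleting during range is safe in Go
			removed++
		}
	}
	return removed
}

// isDuplicate reports whether id is already stored or staged.
func isDuplicate(dd map[string]*dedupeEntry, id string) bool {
	_, ok := dd[id]
	return ok
}
```

With entries `a` (seq 7) and `b` (seq 0) in the map, `clearStaged` drops only `b`: a retry of `b` is then accepted as a new message, while `a` is still rejected as a duplicate.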

MauriceVanVeen requested a review from a team as a code owner February 10, 2025 10:15

MauriceVanVeen mentioned this pull request Feb 10, 2025

[FIXED] (2.11) Clear inflight dedupe IDs on leader change #6486

Merged

neilalexander reviewed Feb 10, 2025

NRG: Completeness/consistency after leader changes

43e7a02

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

MauriceVanVeen force-pushed the maurice/nrg-leader-change-consistency branch from 8916405 to 43e7a02 Compare February 10, 2025 14:11

neilalexander approved these changes Feb 10, 2025

derekcollison merged commit 8ed3361 into main Feb 10, 2025
5 checks passed

derekcollison deleted the maurice/nrg-leader-change-consistency branch February 10, 2025 17:01
