[FIXED] Don't `InstallSnapshot` during shutdown, would race with `monitorStream`/`monitorConsumer` #6153

When stopping a stream or consumer, we would attempt to install a snapshot. However, this would race with what's happening in `monitorStream`/`monitorConsumer` at that time.

For example:
1. In `applyStreamEntries` we call into `mset.processJetStreamMsg` to persist one or multiple messages.
2. We call `mset.stop(..)` either before or during the above.
3. In `mset.stop(..)` we'd wait for `mset.processJetStreamMsg` to release the lock so we can enter `mset.stateSnapshotLocked()`. We create a snapshot with new state here!
4. We call `InstallSnapshot` to persist the above snapshot, but `n.applied` does not contain the right value; the value will be lower.
5. `applyStreamEntries` finishes and we end with calling `n.Applied(..)`.

This would be a race condition depending on whether step 4 happened before or after step 5.
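
The misalignment in the steps above can be sketched with a toy model. This is not the server's code: the `node`, `snapshot`, `persist`, `markApplied`, `snap`, and `restore` names are all hypothetical and only stand in for "state was persisted" versus "applied index was advanced". The racy path snapshots between the two, and the aligned path snapshots only after both are done.

```go
package main

import "fmt"

// snapshot pairs a copy of the state with the log index it claims to cover.
// Hypothetical type, for illustration only.
type snapshot struct {
	msgs    int    // messages persisted in the store at snapshot time
	applied uint64 // highest log index the snapshot claims to include
}

// node is a toy replicated state machine (hypothetical).
type node struct {
	msgs    int    // state: messages persisted so far
	applied uint64 // last log index reported as applied
}

// persist models step 1: the message is written to the store,
// but the applied index has not been advanced yet.
func (n *node) persist() { n.msgs++ }

// markApplied models step 5: the applied index catches up with the store.
func (n *node) markApplied(index uint64) { n.applied = index }

// snap captures the state together with whatever applied index is current.
func (n *node) snap() snapshot {
	return snapshot{msgs: n.msgs, applied: n.applied}
}

// restore rebuilds a node from a snapshot and replays every log entry
// with an index above s.applied, up to and including lastIndex.
func restore(s snapshot, lastIndex uint64) *node {
	n := &node{msgs: s.msgs, applied: s.applied}
	for i := s.applied + 1; i <= lastIndex; i++ {
		n.persist()
		n.markApplied(i)
	}
	return n
}

// racySnapshot snapshots after the state changed but before the applied
// index was advanced (steps 3 and 4 happening before step 5).
func racySnapshot() snapshot {
	n := &node{}
	n.persist()
	s := n.snap() // state already holds 1 message, applied is still 0
	n.markApplied(1)
	return s
}

// alignedSnapshot snapshots only once state and applied index agree.
func alignedSnapshot() snapshot {
	n := &node{}
	n.persist()
	n.markApplied(1)
	return n.snap()
}

func main() {
	racy := restore(racySnapshot(), 1)
	aligned := restore(alignedSnapshot(), 1)
	// The racy snapshot causes entry 1 to be replayed on top of state
	// that already contains it.
	fmt.Println("racy msgs:", racy.msgs, "aligned msgs:", aligned.msgs)
}
```

In this toy model the replay simply double counts the message. In the server the effect is different in detail, but the cause is the same: the snapshot's state is ahead of the applied index it is stored with.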

It's essential that the snapshot we make is aligned with the `n.applied` value. If it isn't, we'll replay and need to increase `mset.clfs`, which will snowball into stream desync due to this shift.

The only place where we can guarantee that the snapshot and applied are aligned is in `doSnapshot` of `monitorStream` and `monitorConsumer` (and `monitorCluster`), so we must not attempt installing snapshots outside of those.

Signed-off-by: Maurice van Veen github@mauricevanveen.com