[FIXED] Catchup must not extend past requested sequence range #6038

Merged on Oct 24, 2024 (1 commit)

Conversation

MauriceVanVeen
Member

When we're asked to provide data within a range during catchup, we should not extend that range and provide more data, especially since the range was defined by a snapshot, which also specifies which RAFT entries should be sent after processing that snapshot. Extending the range would only result in duplicated work and a possible desync for the follower, so these lines can safely be removed.
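As a rough illustration of the intent (not the actual nats-server code), the sketch below serves exactly the requested range even when the leader's stream has advanced since the snapshot. The `catchupRequest` type and `messagesToSend` helper are hypothetical names invented for this example, and capping at the stream's current last sequence is an assumption of the sketch.

```go
package main

import "fmt"

// catchupRequest is a hypothetical stand-in for the sequence range a
// follower asks for during catchup.
type catchupRequest struct {
	FirstSeq uint64
	LastSeq  uint64
}

// messagesToSend returns the sequences the leader would send for req.
// streamLast is the leader's current last sequence. It is deliberately NOT
// used to extend req.LastSeq; it only caps the range so we never try to
// send sequences that do not exist yet (an assumption of this sketch).
func messagesToSend(req catchupRequest, streamLast uint64) []uint64 {
	last := req.LastSeq
	if streamLast < last {
		last = streamLast
	}
	var seqs []uint64
	for seq := req.FirstSeq; seq <= last; seq++ {
		seqs = append(seqs, seq)
	}
	return seqs
}

func main() {
	// The follower asks for 41..50 while the leader is already at 60.
	req := catchupRequest{FirstSeq: 41, LastSeq: 50}
	seqs := messagesToSend(req, 60)
	fmt.Println(len(seqs), seqs[0], seqs[len(seqs)-1]) // prints "10 41 50"
}
```

Sequences 51 through 60 are left out on purpose: in this model they are expected to reach the follower as RAFT entries after the snapshot has been processed.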

Timeline of these lines:
Previously when receiving a catchup request the FirstSeq could be moved up to match what the state says:
[IMPROVED] Catchup improvements #3348 (August, 2022)

Afterward this was removed in favor of only extending the LastSeq:
[FIXED] KeyValue not found after server restarts #5054 (February, 2024)

This was done to solve KeyValue not found issues.
However, the following change would also have fixed that case:
[FIXED] Do not bump clfs on seq mismatch when before stream LastSeq #5821 (August, 2024)

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
Member

@neilalexander neilalexander left a comment


LGTM!

This makes sense to me as we don't want the upper stream state to surpass what Raft has told us that it should be.

If a snapshot takes us up to, say, message 50, then the upper layer catch-up must only go up to that point, as we'd expect the Raft machinery to take over again after that. Otherwise we end up with duplicate work and a much higher potential to hit an lseq-clfs mismatch during catch-up.
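To make the message-50 example concrete, here is a toy model of the follower side. It is only a sketch under simplifying assumptions: the `follower` type, its `store` method, and the `replay` helper are hypothetical and do not mirror the real server, which tracks considerably more state than a single last sequence.

```go
package main

import "fmt"

// follower is a hypothetical, heavily simplified replica that only tracks
// the last stream sequence it has stored.
type follower struct {
	lastSeq uint64
}

// store accepts seq only if it is the next expected sequence and reports
// whether it was accepted; anything else counts as a mismatch.
func (f *follower) store(seq uint64) bool {
	if seq != f.lastSeq+1 {
		return false
	}
	f.lastSeq = seq
	return true
}

// replay runs catch-up through catchupLast, then applies Raft entries for
// sequences snapshotLast+1..raftLast, and returns how many entries mismatched.
func replay(snapshotLast, catchupLast, raftLast uint64) int {
	f := &follower{}
	for seq := uint64(1); seq <= catchupLast; seq++ {
		f.store(seq)
	}
	mismatches := 0
	for seq := snapshotLast + 1; seq <= raftLast; seq++ {
		if !f.store(seq) {
			mismatches++
		}
	}
	return mismatches
}

func main() {
	fmt.Println(replay(50, 50, 55)) // catch-up stops at the snapshot: prints 0
	fmt.Println(replay(50, 55, 55)) // catch-up overshoots to 55: prints 5
}
```

In the second call every Raft entry after the snapshot arrives for a sequence the follower already holds, which is the duplicated work described above.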

@derekcollison
Member

This is still marked as draft, lmk if this should be merged.

@MauriceVanVeen MauriceVanVeen marked this pull request as ready for review October 24, 2024 18:10
@MauriceVanVeen MauriceVanVeen requested a review from a team as a code owner October 24, 2024 18:10
@MauriceVanVeen
Member Author

> This is still marked as draft, lmk if this should be merged.

Marked ready for review/merge now 🙂

I've put these PRs up as draft so these could first be discussed with some colleagues before opening up.

@derekcollison derekcollison merged commit 3e8715d into main Oct 24, 2024
5 checks passed
@derekcollison derekcollison deleted the maurice/catchup-must-not-extend-past-request branch October 24, 2024 18:22
neilalexander added a commit that referenced this pull request Nov 25, 2024
Includes the following:

- #5661
- #5666
- #5671
- #5344
- #5684
- #5689
- #5691
- #5714
- #5717
- #5707
- #5792
- #5912
- #5957
- #5700
- #5975
- #5991
- #5987
- #6027
- #6038
- #6053
- #5848
- #6055
- #6056
- #6060
- #6061
- #6072
- #5832
- #6073
- #6107

Signed-off-by: Neil Twigg <neil@nats.io>