Cherry-picks for 2.10.19-RC.2 #5718

Merged
merged 16 commits into from
Jul 30, 2024

kozlovic and others added 15 commits July 29, 2024 16:57
This is an alternate approach to the PR #5484 from @wjordan.

Using the code from that PR with the test added in this PR, I could
still see duplicate routes (up to 125 in one of the matrix runs) and
still hit a data race (which could easily have been fixed). The main
issue is that the increment happens in connectToRoute, which runs
in a goroutine, so duplicates were still possible.

Instead, I took the approach that those duplicates were the result
of far too much gossip. Suppose that servers A and B are already
connected and C connects to A. A gossips to B that it should
connect to C. When that happened, B would gossip server C to A
and C would gossip server B to A, all of which was unnecessary.
The volume grew quite fast with the size of the cluster
(several thousand for a cluster size of 15 or so).

Resolves #5483

Signed-off-by: Ivan Kozlovic <ivan@synadia.com>
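
The redundant gossip described above can be illustrated with a toy model. The sketch below is not the nats-server code: `gossipCount` and its `regossip` flag are hypothetical names, and the model assumes servers join one at a time by dialing a single seed server. It only counts gossip messages, so its totals are not expected to match the figures quoted in the commit message.

```go
package main

import "fmt"

// gossipCount simulates n servers joining a cluster one at a time, each
// dialing server 0 explicitly, and returns the number of gossip messages sent.
// If regossip is true, a connection that was itself created because of gossip
// triggers another round of gossip from both ends (the behavior described as
// unnecessary). Toy model only, with hypothetical names.
func gossipCount(n int, regossip bool) int {
	peers := make([]map[int]bool, n) // peers[i] is the set of servers i is connected to
	for i := range peers {
		peers[i] = make(map[int]bool)
	}
	msgs := 0
	for x := 1; x < n; x++ {
		// x dials the seed (server 0) explicitly.
		peers[0][x], peers[x][0] = true, true
		// The seed tells every other existing server to connect to x.
		for p := 1; p < x; p++ {
			msgs++
			peers[p][x], peers[x][p] = true, true // p connects to x
			if regossip {
				// Both ends announce the new route to all of their other
				// peers, which already know about it.
				msgs += len(peers[p]) - 1
				msgs += len(peers[x]) - 1
			}
		}
	}
	return msgs
}

func main() {
	for _, n := range []int{3, 15} {
		fmt.Printf("n=%d regossip=%d suppressed=%d\n",
			n, gossipCount(n, true), gossipCount(n, false))
	}
}
```

With three servers the model reproduces the A, B, C example: one useful gossip from A to B, plus the two redundant messages back to A when re-gossiping is enabled.
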
Signed-off-by: Neil Twigg <neil@nats.io>
Signed-off-by: Waldemar Quevedo <wally@nats.io>
Signed-off-by: Derek Collison <derek@nats.io>
If we failed to start the Raft group subscriptions then we were calling
`shutdown` with the `shouldDelete` flag set, which would nuke the state
on disk, blowing away the WAL, the term and vote etc.

However, this could happen if a Raft group was being started while the
server was shutting down. When it did, we would see a log entry
saying `Error creating raft group: system account not setup` and then the
Raft state would be deleted, so all state was lost after a restart.

This PR changes `shouldDelete` to false so that we preserve the state on
disk for the next startup.

Signed-off-by: Neil Twigg <neil@nats.io>
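
The effect of the flag change above can be sketched as follows. This is a simplified stand-in, not the server's Raft code: `raftGroup`, `startGroup`, `stateSurvivesFailedStart`, and the `wal.db` file name are all hypothetical, and only the `shutdown(shouldDelete)` idea comes from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"path/filepath"
)

// raftGroup is a hypothetical stand-in for a Raft node with on-disk state.
type raftGroup struct {
	storeDir string
}

// shutdown stops the group. On-disk state is removed only when shouldDelete is set.
func (g *raftGroup) shutdown(shouldDelete bool) error {
	if shouldDelete {
		return os.RemoveAll(g.storeDir)
	}
	return nil
}

var errNoSystemAccount = errors.New("system account not setup")

// startGroup mimics a start that can fail while setting up subscriptions.
func startGroup(dir string, subscribe func() error) error {
	g := &raftGroup{storeDir: dir}
	if err := subscribe(); err != nil {
		// Keep the WAL, term and vote on disk for the next startup.
		if serr := g.shutdown(false); serr != nil {
			return serr
		}
		return fmt.Errorf("error creating raft group: %w", err)
	}
	return nil
}

// stateSurvivesFailedStart reports whether a fake WAL file is still on disk
// after a failed start.
func stateSurvivesFailedStart() (bool, error) {
	dir, err := os.MkdirTemp("", "raft-sketch")
	if err != nil {
		return false, err
	}
	defer os.RemoveAll(dir)
	wal := filepath.Join(dir, "wal.db")
	if err := os.WriteFile(wal, []byte("term=3 vote=A"), 0o600); err != nil {
		return false, err
	}
	if err := startGroup(dir, func() error { return errNoSystemAccount }); err == nil {
		return false, errors.New("expected start to fail")
	}
	_, statErr := os.Stat(wal)
	return statErr == nil, nil
}

func main() {
	ok, err := stateSurvivesFailedStart()
	fmt.Println("state preserved:", ok, "error:", err)
}
```
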
While validating the ideas in ADR-44, the proposed improvements
caught the fact that a snapshot request was being sent to a restore
API call. Tests passed because there was enough overlap in the structs,
but strictly this should have been a failure due to the invalid request.

Signed-off-by: R.I.Pienaar <rip@devco.net>
…tion`

Signed-off-by: Neil Twigg <neil@nats.io>
…atch filters (#5699)

Signed-off-by: Derek Collison <derek@nats.io>

Signed-off-by: Derek Collison <derek@nats.io>
The stream state and replica info are only guaranteed to be accurate
when returned by the leader of a given stream. This new option
returns stream details only for streams for which the server
is the leader. For systems with many streams this
can significantly reduce the amount of data returned when scraping
across all servers, since non-leader details will likely be ignored.

Fix #5698

Signed-off-by: Byron Ruth <byron@nats.io>
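
The leader-only filtering described above could look roughly like this. The sketch does not use the server's monitoring types or the option's real name. `streamDetail`, `filterStreams`, and the `leaderOnly` parameter are hypothetical.

```go
package main

import "fmt"

// streamDetail is a hypothetical, trimmed-down stream report entry.
type streamDetail struct {
	Name   string
	Leader string // name of the server currently leading the stream
}

// filterStreams returns all details, or only those led by serverName when
// leaderOnly is set.
func filterStreams(serverName string, all []streamDetail, leaderOnly bool) []streamDetail {
	if !leaderOnly {
		return all
	}
	var out []streamDetail
	for _, sd := range all {
		if sd.Leader == serverName {
			out = append(out, sd)
		}
	}
	return out
}

func main() {
	streams := []streamDetail{
		{Name: "ORDERS", Leader: "n1"},
		{Name: "EVENTS", Leader: "n2"},
		{Name: "AUDIT", Leader: "n1"},
	}
	// Server n1 reports only the streams it leads.
	for _, sd := range filterStreams("n1", streams, true) {
		fmt.Println(sd.Name)
	}
}
```

When every server is scraped with such a filter enabled, each stream should appear once, in the report of its current leader, rather than once per replica.
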
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
We had a use case with millions of subjects where the last sequence checked was in the next-to-last block.
The consumer had a wildcard that matched lots of entries that were behind where we were.
This would burn a lot of CPU, and when a stream had lots of consumers and they shifted leadership, this would introduce some instability due to all the CPU cycles.

Signed-off-by: Derek Collison <derek@nats.io>
Signed-off-by: Neil Twigg <neil@nats.io>
@bruth bruth requested a review from a team as a code owner July 29, 2024 21:00
@derekcollison derekcollison left a comment

LGTM

…sive compact attempts.

Signed-off-by: Derek Collison <derek@nats.io>
@derekcollison derekcollison left a comment

LGTM

@bruth bruth merged commit b0cbbe8 into release/v2.10.19 Jul 30, 2024
5 checks passed
@bruth bruth deleted the neil/21019rc2 branch July 30, 2024 00:26
8 participants