NATS Deleting Recovered Stream as Orphaned #5382
Orphaned means the server could not find any meta assignment from the meta layer after syncing up.
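To make that concrete, here is a rough conceptual sketch in Go of the decision being described - it is not the actual nats-server code, and the names are made up for illustration: after syncing up, any locally recovered stream with no matching meta-layer assignment is treated as orphaned and removed.

```go
// Conceptual sketch only - NOT the real nats-server implementation.
// metaAssignments and localStreams are hypothetical names used for illustration.
func pruneOrphans(metaAssignments map[string]bool, localStreams []string) []string {
	var orphaned []string
	for _, stream := range localStreams {
		// A stream recovered from local storage that has no assignment in the
		// meta layer (after syncing up) is considered orphaned.
		if !metaAssignments[stream] {
			orphaned = append(orphaned, stream)
		}
	}
	// The server then deletes the state of these streams; this issue is about
	// that happening to a stream that should still have an assignment.
	return orphaned
}
```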
Trying to understand this a bit more - here's the order of what happened:
If the stream doesn't exist in the meta layer after syncing up - why does the stream appear on the same node moments later?
Could you add some more information to
Here's the timeline - at 13:56 UTC a new stream is created using the NATS client, bound to that node (gq-nats-1). We then notice these logs, with what appears to be the node re-detecting every consumer for the stream - this happens several hundred times:
After the "new consumer" logs stop - we see these errors:
Followed by repeated logging of the following:
and
At this point - the node is unavailable, as are all the streams located on it - which prompts the restart of the cluster using
It says it ran out of resources and shut down JetStream. We should address that first.
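For what it's worth, the usage that leads to that out-of-resources shutdown can be watched from a client before it becomes fatal. A minimal sketch using the nats.go client - the 90% threshold is an arbitrary example, not something NATS itself enforces:

```go
package main

import (
	"fmt"
	"log"

	"github.com/nats-io/nats.go"
)

func main() {
	nc, err := nats.Connect(nats.DefaultURL)
	if err != nil {
		log.Fatal(err)
	}
	defer nc.Close()

	js, err := nc.JetStream()
	if err != nil {
		log.Fatal(err)
	}

	// AccountInfo reports current memory/file usage and the configured limits.
	info, err := js.AccountInfo()
	if err != nil {
		log.Fatal(err)
	}
	fmt.Printf("store used: %d bytes (limit %d)\n", info.Store, info.Limits.MaxStore)
	fmt.Printf("memory used: %d bytes (limit %d)\n", info.Memory, info.Limits.MaxMemory)

	// Alerting when usage approaches the limit (or the volume fills up) would
	// catch the condition before JetStream shuts itself down.
	if info.Limits.MaxStore > 0 && info.Store > uint64(info.Limits.MaxStore)*9/10 {
		fmt.Println("WARNING: JetStream file storage above 90% of its limit")
	}
}
```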
@derekcollison I've just run into a similar occurrence on NATS 2.9.20 with a cluster of 3, with my stream of only 1 replica getting wiped. tl;dr: Stream "AAAAA" was recovered but then wiped in the same process by nats-server after a restart (the underlying JetStream volume was 100% full). A sketch of raising the replica count, so the data would survive the loss of one node's storage, follows the warning logs below.
Cluster:
Streams:
Sequence of events:
nats-1 logs on crash/bootloop
nats-1 logs on restart (stream recover+wipe)
Warning logs excerpt:
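Not a fix for the underlying bug, but with a single replica any loss of that one node's stream data is unrecoverable. A minimal sketch, assuming the nats.go client and a JetStreamContext `js` obtained as in the earlier sketch, of scaling the stream to 3 replicas so its data can be restored from the other nodes:

```go
// Sketch: scale the stream from R1 to R3 so that surviving replicas can
// restore its data if one node's JetStream storage is lost or wiped.
// "AAAAA" is the stream name from the report above.
func scaleToThreeReplicas(js nats.JetStreamContext) error {
	si, err := js.StreamInfo("AAAAA")
	if err != nil {
		return err
	}
	cfg := si.Config
	cfg.Replicas = 3
	_, err = js.UpdateStream(&cfg)
	return err
}
```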
Fixed via #5767
Observed behavior
NATS recovered messages from a stream, but then deleted messages afterwards
Expected behavior
NATS should not delete the "OR" stream - as it and its consumers were recovered
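For context, a quick way to check that expectation - that the stream and its consumers actually came back after the restart - is sketched below with the nats.go client (assuming the standard nats.go imports and a JetStreamContext `js` as in the earlier sketch):

```go
// Sketch: confirm the "OR" stream and its consumers survived a restart.
func verifyRecovered(js nats.JetStreamContext) error {
	si, err := js.StreamInfo("OR")
	if err != nil {
		return fmt.Errorf("stream OR not found after restart: %w", err)
	}
	fmt.Printf("OR recovered: %d msgs, %d consumers\n", si.State.Msgs, si.State.Consumers)
	// List the consumers that were recovered along with the stream.
	for name := range js.ConsumerNames("OR") {
		fmt.Println("consumer:", name)
	}
	return nil
}
```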
Server and client version
2.9.15
Host environment
Running as a Kubernetes StatefulSet.
Steps to reproduce
The "OR" stream was unavailable at the time of the restart. It runs on a single node - referred to here as gq-nats-1.
A series of issues with NATS began after we created a new stream that was tag-located to gq-nats-1 (a sketch of this kind of tag placement follows at the end of this section):
This then drove us to restart the service.
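For reference, the tag placement mentioned above looks roughly like the following with the nats.go client; the tag value, subject, and stream name are assumptions for illustration, since our exact configuration isn't shown here:

```go
// Sketch of creating a stream pinned to one server via a placement tag.
// The tag value and subject are assumptions; the target server must be
// started with a matching server_tags entry in its configuration.
func createPinnedStream(js nats.JetStreamContext) error {
	_, err := js.AddStream(&nats.StreamConfig{
		Name:     "NEW_STREAM",
		Subjects: []string{"new.>"},
		Storage:  nats.FileStorage,
		Replicas: 1,
		Placement: &nats.Placement{
			Tags: []string{"node:gq-nats-1"},
		},
	})
	return err
}
```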