[FIXED] Ghost consumers during meta recovery #6092

Merged 3 commits on Nov 9, 2024

@MauriceVanVeen MauriceVanVeen commented Nov 8, 2024

During meta recovery, `ru.updateConsumers` and `ru.removeConsumers` were not properly cleared after the move from `map[string]*consumerAssignment` to `map[string]map[string]*consumerAssignment`. As a result, a consumer that needed to be removed could end up in `ru.removeConsumers` while also remaining in `ru.updateConsumers`, resulting in a ghost consumer.
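To illustrate the bookkeeping described above, here is a minimal sketch, not the server's actual code. It assumes the nested maps are keyed by stream and then by consumer name; the `recoveryUpdates` type, its methods, and the `ghosts` helper are hypothetical names introduced for this example. The point is that with a nested map, the cancelling delete has to target the inner map, otherwise the entry survives in both sets.

```go
package main

import (
	"fmt"
	"sort"
)

// consumerAssignment is a stand-in for the server type; only two fields are modeled.
type consumerAssignment struct {
	Stream string
	Name   string
}

// recoveryUpdates mirrors the map shapes named in the PR description.
// The keying (stream, then consumer name) is an assumption of this sketch.
type recoveryUpdates struct {
	updateConsumers map[string]map[string]*consumerAssignment
	removeConsumers map[string]map[string]*consumerAssignment
}

func newRecoveryUpdates() *recoveryUpdates {
	return &recoveryUpdates{
		updateConsumers: make(map[string]map[string]*consumerAssignment),
		removeConsumers: make(map[string]map[string]*consumerAssignment),
	}
}

// addOrUpdate records an update and cancels any pending removal of the same consumer.
func (ru *recoveryUpdates) addOrUpdate(ca *consumerAssignment) {
	// delete on a nil inner map is a no-op, so no existence check is needed.
	delete(ru.removeConsumers[ca.Stream], ca.Name)
	if ru.updateConsumers[ca.Stream] == nil {
		ru.updateConsumers[ca.Stream] = make(map[string]*consumerAssignment)
	}
	ru.updateConsumers[ca.Stream][ca.Name] = ca
}

// remove records a removal and clears any pending update of the same consumer.
func (ru *recoveryUpdates) remove(ca *consumerAssignment) {
	delete(ru.updateConsumers[ca.Stream], ca.Name)
	if ru.removeConsumers[ca.Stream] == nil {
		ru.removeConsumers[ca.Stream] = make(map[string]*consumerAssignment)
	}
	ru.removeConsumers[ca.Stream][ca.Name] = ca
}

// ghosts lists consumers that appear in both sets, which should never happen.
func (ru *recoveryUpdates) ghosts() []string {
	var out []string
	for stream, removed := range ru.removeConsumers {
		for name := range removed {
			if _, ok := ru.updateConsumers[stream][name]; ok {
				out = append(out, stream+"/"+name)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	ru := newRecoveryUpdates()
	ca := &consumerAssignment{Stream: "ORDERS", Name: "C1"}
	ru.addOrUpdate(ca)
	ru.remove(ca)
	fmt.Println("ghosts:", len(ru.ghosts()))
}
```

One plausible way such a bug arises (an assumption, not confirmed by the PR text) is a `delete` left pointing at the outer map with a consumer key, which compiles but removes nothing.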

Also, don't clear the recovering state while there are still items to process as part of recovery.
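The rule above can be sketched as a guard on the flag. This is only an illustration under assumptions: the `metaRecovery` type, its `backlog` slice, and the `applyNext` and `tryFinish` methods are hypothetical and do not correspond to identifiers in the server.

```go
package main

import "fmt"

// metaRecovery is a hypothetical model: a recovering flag plus entries
// that still have to be applied before recovery is really complete.
type metaRecovery struct {
	recovering bool
	backlog    []string
}

// applyNext consumes one backlog entry and reports whether it did any work.
func (m *metaRecovery) applyNext() bool {
	if len(m.backlog) == 0 {
		return false
	}
	m.backlog = m.backlog[1:]
	return true
}

// tryFinish clears the recovering flag only once the backlog is empty.
func (m *metaRecovery) tryFinish() bool {
	if len(m.backlog) > 0 {
		return false
	}
	m.recovering = false
	return true
}

func main() {
	m := &metaRecovery{recovering: true, backlog: []string{"add C1", "remove C1"}}
	fmt.Println(m.tryFinish(), m.recovering) // backlog not drained yet
	for m.applyNext() {
	}
	fmt.Println(m.tryFinish(), m.recovering)
}
```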

De-flakes `TestJetStreamClusterLostConsumers` and makes `TestJetStreamClusterConsumerLeak` more reliable by re-introducing the `ca.pending` flag. The consumer leader responds to consumer creation, but the meta leader responds to consumer deletion, so the consumer assignment needs to be available for the meta leader to respond successfully.
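A rough sketch of why a pending assignment helps, under stated assumptions: the `metaLeader` type and its methods are hypothetical, and the choice to skip pending entries in `snapshot` is an assumption of this example rather than a description of the server's behavior. The idea shown is that an assignment is tracked as soon as creation is proposed, so a delete arriving before creation completes can still find it.

```go
package main

import (
	"fmt"
	"sort"
)

// consumerAssignment is a stand-in type; pending marks a create that was
// proposed but not yet confirmed.
type consumerAssignment struct {
	Name    string
	pending bool
}

// metaLeader is a hypothetical holder of consumer assignments.
type metaLeader struct {
	consumers map[string]*consumerAssignment
}

func newMetaLeader() *metaLeader {
	return &metaLeader{consumers: make(map[string]*consumerAssignment)}
}

// proposeCreate tracks the assignment immediately, flagged as pending.
func (ml *metaLeader) proposeCreate(name string) {
	ml.consumers[name] = &consumerAssignment{Name: name, pending: true}
}

// confirmCreate clears the pending flag once creation has gone through.
func (ml *metaLeader) confirmCreate(name string) {
	if ca, ok := ml.consumers[name]; ok {
		ca.pending = false
	}
}

// deleteConsumer succeeds for pending assignments too, so the meta leader
// has something to respond about even if creation has not finished.
func (ml *metaLeader) deleteConsumer(name string) bool {
	if _, ok := ml.consumers[name]; !ok {
		return false
	}
	delete(ml.consumers, name)
	return true
}

// snapshot lists confirmed assignments only (assumption of this sketch).
func (ml *metaLeader) snapshot() []string {
	var out []string
	for name, ca := range ml.consumers {
		if !ca.pending {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	ml := newMetaLeader()
	ml.proposeCreate("C1")
	fmt.Println("delete while pending:", ml.deleteConsumer("C1"))
}
```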

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

@MauriceVanVeen MauriceVanVeen requested a review from a team as a code owner November 8, 2024 13:05
@MauriceVanVeen MauriceVanVeen force-pushed the maurice/fix-ghost-consumers branch from 6faedb6 to 99cf6c8 Compare November 8, 2024 13:22
@MauriceVanVeen MauriceVanVeen marked this pull request as draft November 8, 2024 13:58
Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
…ude in snapshot

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
…d don't include in snapshot

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
@MauriceVanVeen MauriceVanVeen force-pushed the maurice/fix-ghost-consumers branch from f0b9900 to 857ada4 Compare November 9, 2024 00:06
@MauriceVanVeen MauriceVanVeen marked this pull request as ready for review November 9, 2024 00:25

@neilalexander neilalexander left a comment

LGTM

@derekcollison derekcollison merged commit 37d4461 into main Nov 9, 2024
5 checks passed
@derekcollison derekcollison deleted the maurice/fix-ghost-consumers branch November 9, 2024 02:39
neilalexander pushed a commit that referenced this pull request Nov 12, 2024
During meta recovery `ru.updateConsumers` and `ru.removeConsumers` would
not be properly cleared since the move from
`map[string]*consumerAssignment` to
`map[string]map[string]*consumerAssignment`. Which meant that consumers
that needed to be removed were both in `ru.removeConsumers` and left in
`ru.updateConsumers`. Resulting in a ghost consumer.

Also don't clear recovering state while we still have items to process
as part of recovery.

De-flakes `TestJetStreamClusterLostConsumers`, and makes
`TestJetStreamClusterConsumerLeak` more reliable by re-introducing the
`ca.pending` flag. Since the consumer leader responds for consumer
creation, but meta leader responds for consumer deletion, so need to
have the consumer assignment available so meta leader can respond
successfully.

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>

---------

Signed-off-by: Maurice van Veen <github@mauricevanveen.com>
neilalexander pushed a commit that referenced this pull request Nov 12, 2024
neilalexander added a commit that referenced this pull request Nov 12, 2024
Includes the following:

* Some tweaks to the NRG test helpers
* #6055
* #6061
* #6065 
* #6041 (but with `math/rand` instead of `math/rand/v2` due to an older Go version in CI for 2.10.x)
* #6066
* #6067
* #6069
* #6075
* #6082
* #6087
* #6086
* #6088
* #6089
* #6092
* #6096
* #6098
* #6097
* #6105
* #6104
* #6106
* #6109
* #6111
* #6112

Signed-off-by: Neil Twigg <neil@nats.io>