This repository has been archived by the owner on Jan 24, 2024. It is now read-only.
Fix many non-durable cursors created on a topic with multiple groups #695
Motivation
KoP's KafkaTopicConsumerManager (aka TCM) maintains some non-durable cursors and the associated offsets for a consumer on a specific topic. If the fetch offset doesn't exist in the TCM, the TCM creates a non-durable cursor whose position corresponds to that offset. Each time a message is consumed, the offset and cursor pair is removed.

Currently there's a global map, consumerTopicManagers, whose key is the topic name and whose value is the future of the TCM. As a result, for a topic with multiple consumer groups (subscriptions), all consumers share the same TCM. It is very likely that different consumers fetch different offsets concurrently from the same TCM, in which case a lot of non-durable cursors could be created.

Modifications
Add a singleton class, KafkaTopicConsumerManagerCache, to manage TCMs. The internal cache has two keys: the first is the topic name, and the second is the remote address, which identifies different consumers on the same topic.

To check that only one non-durable cursor is created for a TCM when no reconnection happens, this PR adds a field, numCreatedCursors, that records the total number of non-durable cursor creations.

testCursorCountForMultiGroups was added to verify this behavior. It creates 5 consumers that consume the same topic in parallel, each with a different group ID. After consumption completes, it checks all TCMs to ensure that each TCM's numCreatedCursors is 1.
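
The idea can be illustrated with a minimal, self-contained sketch. This is not the code from this PR: the class names TcmCacheSketch, TopicConsumerManager and TcmCache, the use of plain strings for cursors and remote addresses, and the step that stores a cursor back under the next offset after a read are all simplifying assumptions made for illustration. The sketch compares how many cursors get created when all consumers share one TCM per topic with how many get created when each consumer (remote address) has its own TCM.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

final class TcmCacheSketch {

    /** Hypothetical stand-in for a TCM: maps offset -> cursor and counts creations. */
    static final class TopicConsumerManager {
        private final Map<Long, String> cursors = new ConcurrentHashMap<>();
        private final AtomicInteger numCreatedCursors = new AtomicInteger();

        /** Removes the cursor stored for the offset, creating a new one if none exists. */
        String removeCursor(long offset) {
            String cursor = cursors.remove(offset);
            if (cursor == null) {
                numCreatedCursors.incrementAndGet();
                cursor = "cursor@" + offset; // a string stands in for a real cursor
            }
            return cursor;
        }

        /** Assumption: after a read, the cursor is stored again under the next offset. */
        void add(long nextOffset, String cursor) {
            cursors.put(nextOffset, cursor);
        }

        int getNumCreatedCursors() {
            return numCreatedCursors.get();
        }
    }

    /** Hypothetical two-level cache: topic name -> remote address -> TCM. */
    static final class TcmCache {
        private final Map<String, Map<String, TopicConsumerManager>> cache =
                new ConcurrentHashMap<>();

        TopicConsumerManager get(String topic, String remoteAddress) {
            return cache.computeIfAbsent(topic, t -> new ConcurrentHashMap<>())
                    .computeIfAbsent(remoteAddress, a -> new TopicConsumerManager());
        }

        List<TopicConsumerManager> all() {
            List<TopicConsumerManager> result = new ArrayList<>();
            cache.values().forEach(perTopic -> result.addAll(perTopic.values()));
            return result;
        }
    }

    /**
     * Lets every consumer fetch offsets 0..fetches-1 in an interleaved order and
     * returns the cache afterwards. With perConsumer == false all consumers use
     * the same second-level key, which models one shared TCM per topic.
     */
    static TcmCache simulate(boolean perConsumer, int consumers, int fetches) {
        TcmCache cache = new TcmCache();
        for (long offset = 0; offset < fetches; offset++) {
            for (int c = 0; c < consumers; c++) {
                String address = perConsumer ? "addr-" + c : "shared";
                TopicConsumerManager tcm = cache.get("my-topic", address);
                String cursor = tcm.removeCursor(offset);
                tcm.add(offset + 1, cursor);
            }
        }
        return cache;
    }

    /** Sums numCreatedCursors over every TCM in the cache. */
    static int totalCreated(TcmCache cache) {
        return cache.all().stream()
                .mapToInt(TopicConsumerManager::getNumCreatedCursors)
                .sum();
    }

    public static void main(String[] args) {
        System.out.println("shared TCM, cursors created: "
                + totalCreated(simulate(false, 5, 3)));
        System.out.println("per-consumer TCMs, cursors created: "
                + totalCreated(simulate(true, 5, 3)));
    }
}
```

In the shared case, a consumer often finds that another consumer has already moved the cursor it needs, so a new cursor is created. With one TCM per remote address, each TCM creates a single cursor on the first fetch and reuses it afterwards, which is the property testCursorCountForMultiGroups checks.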