fix: clear preferredReadReplica if broker shutdown #2108
Merged
After Sarama had been given a preferred replica to consume from, it mistakenly latched onto that value and did not unset it when the preferred replica broker was shut down and dropped out of the cluster metadata.
Fetches continued to work as long as that broker remained shut down, because they were now being sent to the leader, which serviced them itself as it had no better preferred replica to point the client at.
However, consumption would then hang after the broker came back online, because the leader would stop returning records in the FetchResponse and would instead return only the preferred replica ID, expecting the client to send its FetchRequests there. Because the partitionConsumer had latched the value of preferredReplica, it saw no change in that value, so it never dispatched to (re-)connect to the preferred replica. It simply continued to send FetchRequests to the leader and received no records back.
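The latch-and-clear behaviour described above can be sketched in isolation. This is a minimal illustration, not Sarama's actual code: `consumerState`, `fetchTarget`, `onFetchResponse`, and the `liveBrokers` map are hypothetical names invented for the example, and the use of -1 as the "no preferred replica" sentinel is an assumption of the sketch.

```go
package main

import "fmt"

// invalidReplicaID is the sentinel this sketch uses for "no preferred replica".
// Assumption for illustration only.
const invalidReplicaID int32 = -1

// consumerState is a hypothetical, stripped-down stand-in for a partition
// consumer; it is not Sarama's partitionConsumer.
type consumerState struct {
	leaderID         int32
	preferredReplica int32
}

// fetchTarget returns the broker ID to fetch from. If the latched preferred
// replica is no longer present in the cluster metadata, the latch is cleared
// so that a later hint from the leader is seen as a change.
func (c *consumerState) fetchTarget(liveBrokers map[int32]bool) int32 {
	if c.preferredReplica != invalidReplicaID && !liveBrokers[c.preferredReplica] {
		c.preferredReplica = invalidReplicaID
	}
	if c.preferredReplica != invalidReplicaID {
		return c.preferredReplica
	}
	return c.leaderID
}

// onFetchResponse records the preferred replica hinted by the leader and
// reports whether the consumer needs to re-dispatch to a different broker.
func (c *consumerState) onFetchResponse(preferred int32) bool {
	if preferred != invalidReplicaID && preferred != c.preferredReplica {
		c.preferredReplica = preferred
		return true
	}
	return false
}

func main() {
	c := &consumerState{leaderID: 1, preferredReplica: invalidReplicaID}
	live := map[int32]bool{1: true, 2: true}

	c.onFetchResponse(2)             // leader points the client at broker 2
	fmt.Println(c.fetchTarget(live)) // 2

	delete(live, 2)                  // broker 2 shuts down and leaves the metadata
	fmt.Println(c.fetchTarget(live)) // 1 (falls back to the leader, latch cleared)

	live[2] = true                    // broker 2 comes back online
	fmt.Println(c.onFetchResponse(2)) // true (hint is seen as a change again)
	fmt.Println(c.fetchTarget(live))  // 2
}
```

Without the clearing step in `fetchTarget`, the final `onFetchResponse(2)` call would return false, because the stale value would still equal the hint. That corresponds to the hang described above.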
Contributes-to: #2090