[backport] [v24.2.x] miscellaneous idempotency fixes #22687 #22757
Merged
piyushredpanda
merged 10 commits into
redpanda-data:v24.2.x
from
bharathv:v242x-id-fixes
Aug 6, 2024
(cherry picked from commit 26f6150)
Factoring out the code into a utility, to be used in a later commit. (cherry picked from commit 7ac5acf)
Used to reset the producer state with a new epoch for idempotent producers that bump the epoch on the client side. This is fine because idempotency is scoped to a single session, so a client can independently decide to bump the epoch on its side. (cherry picked from commit 2f5b9cb)
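A minimal sketch of the reset described above, in Python rather than the broker's own code. `ProducerState` and `apply_epoch` are hypothetical names used only for illustration; how requests with an older epoch are handled is outside what the commit message states and is left to the caller here.

```python
from dataclasses import dataclass


@dataclass
class ProducerState:
    """Hypothetical per-producer state tracked by the broker."""
    epoch: int
    last_seq: int = -1  # -1 means no batch accepted yet in this epoch


def apply_epoch(state: ProducerState, request_epoch: int) -> ProducerState:
    """Return the state that the incoming request should be validated against."""
    if request_epoch > state.epoch:
        # The client bumped the epoch on its own: discard the old
        # sequence tracking and start a fresh session.
        return ProducerState(epoch=request_epoch)
    # Same (or older) epoch: keep the existing state unchanged.
    return state
```

For example, a producer tracked at epoch 3 that sends a request with epoch 4 is validated against a fresh state with no accepted sequences.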
(cherry picked from commit e2c59b8)
For producers the broker no longer tracks, we now skip sequence checks and allow any non-zero sequence. This can happen if the producer produces after it was evicted from the broker's memory (e.g. the log was prefix truncated, or the producer hit expiration thresholds). While kip-360 suggests that the broker should return an unknown_producer_id error in this case, Apache Kafka no longer does that. To complicate matters, not every client implements the unknown_producer_id logic the same way, which can result in different behaviors on different clients. With this patch, we mimic what Apache Kafka does, to be consistent. Apache Kafka code for future reference: https://github.com/apache/kafka/pull/7115/files#diff-5482b26d93c5d36f272f65e628c1692622b69f8ba4a2df04ba74fad23623828dR239 (cherry picked from commit bc3d761)
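The relaxed check can be sketched as follows. This is an illustrative Python model, not the broker's implementation: `accept_batch` and the plain dictionary are hypothetical, and epochs, duplicate detection and sequence wraparound are deliberately left out.

```python
from typing import Dict


def accept_batch(last_seq_by_producer: Dict[int, int],
                 producer_id: int,
                 first_seq: int,
                 last_seq: int) -> bool:
    """Decide whether a batch is accepted, updating tracked state if so."""
    known_last = last_seq_by_producer.get(producer_id)
    if known_last is None:
        # No state for this producer (e.g. it was evicted): skip the
        # sequence check and accept whatever sequence the client sends.
        last_seq_by_producer[producer_id] = last_seq
        return True
    if first_seq != known_last + 1:
        # Tracked producer: the batch must continue the sequence.
        return False
    last_seq_by_producer[producer_id] = last_seq
    return True
```

With an empty map, a batch covering sequences 120 to 124 is accepted and the producer becomes tracked; after that, only a batch starting at 125 passes the check.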
config definition says it is but it is not, fixed it (cherry picked from commit 0ed2a90)
If expire_old_txes kicks in and there are no tx topics, there are no transactions, so this can be logged at a lower severity. (cherry picked from commit 3c36e7c)
(cherry picked from commit 1af7557)
(cherry picked from commit f8e5c21)
In some racy situations the request may already have been errored out. Consider the following sequence of actions:
1. replicate_f succeeded, but set_value() has not been called yet.
2. Scheduling point.
3. Term change -> sync() -> GC of inflight requests; the request is marked as timed out.
4. set_value() is now called in the original fiber, which triggers an assert.
Relaxing the assert condition to make set_value() idempotent. A subsequent client retry of the request will be marked as a success (once the change is applied in the stm and the request state is populated). Unable to reproduce in a unit test, mainly due to the lack of an idempotent client in the unit test fixture. (cherry picked from commit 092a2b8)
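The relaxed condition can be illustrated with a small Python model. `InflightRequest` is a hypothetical stand-in for the broker's request object: the first completion wins, and a late completion is ignored instead of asserting.

```python
class InflightRequest:
    """Hypothetical in-flight request whose result can be set only once."""

    def __init__(self):
        self.result = None  # becomes ("value", v) or ("error", e)

    def done(self) -> bool:
        return self.result is not None

    def set_error(self, error) -> None:
        # Ignore the call if the request was already completed.
        if not self.done():
            self.result = ("error", error)

    def set_value(self, value) -> None:
        # Previously this path would assert that the request was still
        # pending; here a late call after a timeout is a no-op.
        if not self.done():
            self.result = ("value", value)
```

In the race described above, the GC marks the request as timed out first, so the later `set_value()` from the original fiber leaves the timeout result in place and the client's retry observes the success.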
piyushredpanda
approved these changes
Aug 6, 2024
Two main changes in this patch:
The broker can now handle epoch bumps for idempotent producers. A client can independently bump the producer epoch in certain situations (check kip-360 and related code), as idempotency only pertains to a single session. The broker code had issues handling epoch bumps, which are now fixed.
For evicted producer state on the broker (e.g. log prefix truncation, producer expiration), there are subtle differences among clients in how they handle the producer reset scenario. The Java client, for example, bumps the epoch on an out-of-order sequence number (OOOSN) error if there are no other requests in flight, while librdkafka is stricter and only does so on the UNKNOWN_PRODUCER_ID error code (which explicitly tells the client that the broker has no state for the producer and that it should reset). The code now does what Apache Kafka does: upon encountering an unknown producer id, any sequence number is accepted in order to make forward progress, because the only way a broker does not know about a producer is if it was evicted from memory. This does not seem foolproof, but it is consistent with AK behavior and, more importantly, works with all the client implementations.
Fixes #22753
Backports Required
Release Notes