Idempotent producer bug: "Local: Inconsistent state: Unable to reconstruct MessageSet" #4736
Comments
Hi @StephanDollberg, thanks for the report. Can you reproduce it with AK (Apache Kafka) too? Does it still happen after that fix in 2.3.0? The log shows version 2.2.0.
I haven't tried that yet.
Sorry for not being clear: the issue still persists on latest master. It was introduced in the commit I pointed out in the description, 6dc7c71 (which in git speak is 2.2+something, but the first released version containing it is 2.3.0).
I think it can happen in some particular cases because of two things: this condition (librdkafka/src/rdkafka_msgset_writer.c, line 877 in 2587cac) and the backoff, which is not uniform across all messages of the same idempotent batch (librdkafka/src/rdkafka_partition.c, line 925 in 6dc7c71).
I'm going to look into reproducing and fixing this case.
…tructed identically when retried (#4750)
Issues: #4736
Fix for an idempotent producer error where a message batch was not reconstructed identically when retried. It caused the error message "Local: Inconsistent state: Unable to reconstruct MessageSet" and happened on large batches. Solved by using the same backoff baseline for all messages in the batch. Happens since 2.2.0.
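To make the failure mode described above easier to follow, here is a minimal, self-contained C sketch. It is a simplified model only, not librdkafka code: every type and function name (model_msg_t, model_batch_size, and so on) is hypothetical, and the per-message offset is just an assumed stand-in for whatever makes the backoff non-uniform in the real client. It shows how a "stop at the first message still in backoff" check can yield a shorter batch on retry, and why a shared backoff baseline avoids that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified model of a queued message.
 * These are NOT librdkafka's real structures. */
typedef struct {
    uint64_t msgid;      /* position of the message in the partition queue */
    int64_t  ts_backoff; /* earliest time (microseconds) it may be resent */
} model_msg_t;

/* Count the leading messages that may be sent at time `now`.
 * Accumulation stops at the first message whose backoff is still in the
 * future, which models the kind of condition cited in the comment above. */
size_t model_batch_size(const model_msg_t *msgs, size_t cnt, int64_t now) {
    size_t n = 0;
    assert(msgs != NULL || cnt == 0);
    while (n < cnt && msgs[n].ts_backoff <= now)
        n++;
    return n;
}

/* Non-uniform backoff (assumption for illustration): every message gets
 * its own offset on top of the common backoff. */
void model_backoff_per_message(model_msg_t *msgs, size_t cnt, int64_t now,
                               int64_t backoff_us, const int64_t *offset_us) {
    size_t i;
    assert(msgs != NULL && offset_us != NULL);
    for (i = 0; i < cnt; i++)
        msgs[i].ts_backoff = now + backoff_us + offset_us[i];
}

/* Uniform backoff: one baseline is computed once and applied to the whole
 * batch, so all messages become eligible at the same instant. */
void model_backoff_shared(model_msg_t *msgs, size_t cnt, int64_t now,
                          int64_t backoff_us) {
    size_t i;
    int64_t ts = now + backoff_us;
    assert(msgs != NULL || cnt == 0);
    for (i = 0; i < cnt; i++)
        msgs[i].ts_backoff = ts;
}

/* A retried idempotent batch must hold exactly the original messages:
 * either nothing is ready yet, or the full batch is. Returns 1 if so. */
int model_retry_is_consistent(const model_msg_t *msgs, size_t cnt,
                              int64_t now) {
    size_t n = model_batch_size(msgs, cnt, now);
    return n == 0 || n == cnt;
}
```

In this model, a three-message batch backed off at time 1000 with offsets of 0, 50 and 100 microseconds is only partially eligible at time 1120, so the retried batch would differ from the original. With the shared baseline the batch is eligible either completely or not at all.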
Description
Hi,
I am using rdkafka_performance for some load tests against a Redpanda cluster and running into the idempotent producer bug from the title: "Local: Inconsistent state: Unable to reconstruct MessageSet".
As per the comment here: https://github.com/confluentinc/librdkafka/blob/master/src/rdkafka_msgset_writer.c#L912 this seems to indicate a client bug.
How to reproduce
Unfortunately I don't have an easy reproducer, but effectively I am running 10 parallel rdkafka_performance instances (it seems to require some broker load to trigger).
Note that only enable.idempotence=true is really required. The other options can be omitted in my case as well.
I have bisected the issue back to commit 6dc7c71, introduced between v2.2.0 and v2.3.0.
I can't immediately see what's wrong with it. Possibly it just exposes a bug that already existed. Note that 2.3.0 also introduced another suspicious-looking commit, but that doesn't seem to be the issue.
From the logs (see below) it seems to get triggered by a "Broker: Not leader for partition" request error.
Checklist
IMPORTANT: We will close issues where the checklist has not been completed.
Please provide the following information:
enable.idempotence=true
Logs (with debug=.. as necessary) from librdkafka: debug=topic,broker (msg is too much data): https://gist.githubusercontent.com/StephanDollberg/40363a2567a98c0ad1469994df964412/raw/072343ca4c6fceb4a1c30876a951bad736df8b06/rdkafka_0_0_0.log