Rolling upgrade from 0.10 to 0.11 causes unknown magic byte errors #1021
Comments
Hmm, this is a bit confusing to me since the framing should be different for the two record formats, so I'm not sure how they would get mixed in a response. Maybe this is another function of having multiple batches in a stream (aka #1022, cc @wladh)? Except if that were the case, I would expect that bug to mask this one.
It's not masked by #1022 because the legacy messages don't have that issue. A message set decodes messages in a loop until the buffer is completely processed. So if the first messages are legacy, followed by records, then you'd see this behaviour.
Should be fixed by #1023.
Update to 1.19 and pick up the following bug fixes: 1. IBM/sarama#1021 (for FAB-11977) 2. IBM/sarama#1087 (for FAB-12827) FAB-11977 #done FAB-12827 #done Change-Id: I85f89aeabb619a084902dc9e76491b981848c752 Signed-off-by: Kostas Christidis <kostas@christidis.io>
Update to 1.19 and pick up the following bug fixes: 1. IBM/sarama#1021 (for FAB-11977) 2. IBM/sarama#1087 (for FAB-12827) FAB-11977 #done FAB-12827 #done Change-Id: Ifc73cbc4d205e9ce1e19c403666c4420a5538b0c Signed-off-by: Kostas Christidis <kostas@christidis.io>
Update to 1.19 and pick up the following bug fixes: 1. IBM/sarama#1021 (for FAB-11977) 2. IBM/sarama#1087 (for FAB-12827) FAB-11977 #done FAB-12827 #done Change-Id: I3be588a3f293079971af5c20c72c1b32bf613968 Signed-off-by: Kostas Christidis <kostas@christidis.io>
Versions
Sarama Version: f0c3255
Kafka Version: 0.10.1.0 -> 0.11.0.1
Go Version: 1.9.2
Configuration
(happy to provide more information if this is necessary)
Logs
Relevant Sarama error included below.
Problem Description
We just upgraded one of our Kafka clusters from 0.10.1.0 (Confluent 3.1) to 0.11.0.1 (Confluent 3.3.1). While doing the final rolling restart to change log.message.format.version from 0.10.1-IV2 to 0.11.0, our Go clients started throwing "error decoding packet: unknown magic byte (2)" errors. When this happened, they would disconnect and reconnect, picking up from where they left off, so they would never advance.
This had us stumped for a while, as the clients worked fine on another cluster running 0.11. I dug through the Sarama source and think I know what is happening. Sarama pulls a block of records from Kafka and then calls setTypeFromMagic to determine whether they are legacy (0.10) messages or 0.11 records. It assumes that every message in that block is of the same type, but that is not necessarily the case while a message format upgrade is in progress.
The "fix" for us was to restart the clients so they would start consuming from the end of the partitions. Luckily all of our Go clients could tolerate skipping messages.