
Support synchronous sending for batches of messages #674

Closed
JasonRosenberg opened this issue Jun 9, 2016 · 9 comments
Comments

@JasonRosenberg commented Jun 9, 2016

Currently, the sync producer doesn't support sending batches of messages, even though the underlying protocol allows this (and the Java client library supports it too).

We need to send a batch of messages reliably, with acknowledgment, but don't want to incur the multiple round trips required to send each message individually.
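For illustration, the per-message pattern being avoided looks like this. The `syncProducer` below is a self-contained stub standing in for Sarama's `SyncProducer` (whose `SendMessage` does return the partition and offset); the round-trip counter is purely illustrative:

```go
package main

import "fmt"

// ProducerMessage is a pared-down stand-in for sarama.ProducerMessage.
type ProducerMessage struct {
	Topic string
	Value string
}

// syncProducer is a stub modelling SyncProducer's per-message API.
type syncProducer struct{ roundTrips int }

// SendMessage blocks for one broker round trip per call.
func (p *syncProducer) SendMessage(msg *ProducerMessage) (partition int32, offset int64, err error) {
	p.roundTrips++ // each call is a separate request/response cycle
	return 0, int64(p.roundTrips - 1), nil
}

// sendAll is the loop a caller is forced into today: n messages, n round trips.
func sendAll(p *syncProducer, msgs []*ProducerMessage) error {
	for _, m := range msgs {
		if _, _, err := p.SendMessage(m); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	p := &syncProducer{}
	msgs := []*ProducerMessage{{Topic: "t", Value: "a"}, {Topic: "t", Value: "b"}, {Topic: "t", Value: "c"}}
	if err := sendAll(p, msgs); err != nil {
		panic(err)
	}
	fmt.Println("round trips:", p.roundTrips)
}
```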

@eapache (Contributor) commented Jun 9, 2016

There is a minor ambiguity here: do you need to ensure that the batch succeeds or fails atomically as a single group, or do you just need to know when they've all been delivered?

Atomic batch-level success/failure is complicated, and I'm not likely to get to it in the near future (pull requests welcome, of course). If you don't need that, then the question becomes: what should the hypothetical SendMessages([]*ProducerMessage) return when some of the messages succeed but others fail?

@tadbook commented Jun 9, 2016

I would assume you would return a []*ProducerResponse, where a ProducerResponse includes an error that might be nil. Something like the ProducerError returned by the async producer.
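A sketch of that shape: ProducerError mirrors the failed-message-plus-error pairing of Sarama's async producer, but the sendMessages function and its failure condition here are purely illustrative stand-ins, not Sarama's actual API:

```go
package main

import (
	"errors"
	"fmt"
)

// ProducerMessage is a pared-down stand-in for sarama.ProducerMessage.
type ProducerMessage struct {
	Topic string
	Value string
}

// ProducerError pairs a failed message with the reason it failed,
// mirroring the shape of the async producer's error reporting.
type ProducerError struct {
	Msg *ProducerMessage
	Err error
}

// sendMessages is a hypothetical batch call returning one entry per
// input message; a nil Err means that message was delivered.
func sendMessages(msgs []*ProducerMessage) []*ProducerError {
	results := make([]*ProducerError, len(msgs))
	for i, m := range msgs {
		var err error
		if m.Value == "" {
			err = errors.New("empty value") // stand-in failure mode
		}
		results[i] = &ProducerError{Msg: m, Err: err}
	}
	return results
}

func main() {
	res := sendMessages([]*ProducerMessage{{Topic: "t", Value: "ok"}, {Topic: "t"}})
	for _, r := range res {
		fmt.Println(r.Msg.Value, r.Err)
	}
}
```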

@JasonRosenberg (Author) commented

I think it should behave the same as the Java client, which only indicates whether all messages succeeded or not. If there was a failure, don't bother returning details about which messages succeeded and which failed. Producer clients can be expected to resend the whole batch on any failure, which is fine: Kafka only guarantees at-least-once delivery, so consumers have to tolerate duplicates and process messages idempotently anyway.

@JasonRosenberg (Author) commented

FWIW, this is really how async sending should work too: internally, you buffer up messages and send out batches, then only report success or failure at the batch level.

@eapache (Contributor) commented Jun 9, 2016

> internally, you buffer up messages and send out batches

We already do that.

> then only report success or failure at the batch level

We decided that callers shouldn't have to care about our internal choice of batches (which is good, since the batching algorithm has already changed a few times), so we propagate batch-level errors back to the individual messages. This results in a simpler API.

@JasonRosenberg (Author) commented

Agreed, sending the batch-level result back to individual messages makes sense.

@tadbook commented Jun 10, 2016

So I dug into this a little more. The sync producer is actually built on top of the async producer: it sends a message to the async producer and blocks on a channel that it puts into the message's metadata. When the async producer responds, the sync producer writes to that channel, unblocking the original call. So, since the async producer batches, sending a bunch of individual messages to the sync producer should batch them up, as long as they are sent from separate goroutines so that you aren't waiting on one response before sending the next message. Not an optimal interface, though.
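The workaround described above can be sketched as follows. The send function is a stub standing in for SyncProducer.SendMessage; the point is only the concurrency shape, with one goroutine per in-flight message so the async layer underneath is free to batch:

```go
package main

import (
	"fmt"
	"sync"
)

// ProducerMessage is a pared-down stand-in for sarama.ProducerMessage.
type ProducerMessage struct{ Value string }

// sendConcurrently fires one goroutine per message: no caller waits for
// the previous response before sending, so the async producer underneath
// can group the in-flight messages into batches.
func sendConcurrently(send func(*ProducerMessage) error, msgs []*ProducerMessage) []error {
	errs := make([]error, len(msgs))
	var wg sync.WaitGroup
	for i, m := range msgs {
		wg.Add(1)
		go func(i int, m *ProducerMessage) {
			defer wg.Done()
			errs[i] = send(m) // blocks until this message is acked
		}(i, m)
	}
	wg.Wait() // all messages acknowledged, in no particular order
	return errs
}

func main() {
	send := func(m *ProducerMessage) error { return nil } // stub for SendMessage
	errs := sendConcurrently(send, []*ProducerMessage{{"a"}, {"b"}, {"c"}})
	fmt.Println("results:", len(errs))
}
```

Note the drawback eapache points out below the fold: because the goroutines race to enqueue, this gives no ordering guarantee within the batch.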

@eapache (Contributor) commented Jun 10, 2016

@tadbook, your analysis is correct (and yes, the async producer does use batching); however, that method would not guarantee ordering within a given batch.

#677 implements a proper SendMessages(msgs []*ProducerMessage) method, still built on top of the async producer but in a way that guarantees order and only requires one goroutine instead of n.
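A rough, self-contained sketch of that approach: the calling goroutine enqueues every message in order onto the async producer's input channel, then blocks once for all the acknowledgements. The expect channel here stands in for the response channel the sync producer tucks into each message's metadata; details of the real implementation in #677 may differ:

```go
package main

import "fmt"

// ProducerMessage is a pared-down stand-in for sarama.ProducerMessage;
// expect stands in for the metadata response channel.
type ProducerMessage struct {
	Value  string
	expect chan error
}

// sendMessages enqueues all messages in order from the calling goroutine
// (preserving intra-batch order), then waits for every acknowledgement,
// returning the first error seen. One goroutine total, not one per message.
func sendMessages(input chan<- *ProducerMessage, msgs []*ProducerMessage) error {
	expect := make(chan error, len(msgs))
	for _, m := range msgs {
		m.expect = expect
		input <- m // in-order enqueue keeps the batch ordered
	}
	var firstErr error
	for range msgs {
		if err := <-expect; err != nil && firstErr == nil {
			firstErr = err
		}
	}
	return firstErr
}

func main() {
	input := make(chan *ProducerMessage, 8)
	done := make(chan struct{})
	// Stub async producer: acknowledge every message with success.
	go func() {
		for m := range input {
			m.expect <- nil
		}
		close(done)
	}()
	msgs := []*ProducerMessage{{Value: "a"}, {Value: "b"}, {Value: "c"}}
	err := sendMessages(input, msgs)
	close(input)
	<-done
	fmt.Println("batch error:", err)
}
```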

@tadbook commented Jun 10, 2016

Awesome! Thanks for doing that, and thanks for the quick turnaround.
