Consider batching incoming partition messages before flushing them together to partition segment #18

Open
markpapadakis opened this issue Jul 15, 2016 · 2 comments


markpapadakis commented Jul 15, 2016

Currently, the broker appends to the leader (itself, in standalone mode) immediately, but this may not be optimal. We should support tuning select topics so that the server buffers up to n messages and only flushes them to the segment once it has accumulated at least that many, or some time has elapsed since it began buffering; we may also want to support a size threshold.

This is particularly useful when you have many producers, each producing 1-2 messages at a time.
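A minimal sketch of what such a per-topic flush policy could look like; the names and default thresholds below are illustrative, not Tank's actual configuration:

```cpp
// Hypothetical per-topic flush policy: buffer appends and flush once any
// threshold is crossed. Names and defaults are illustrative, not Tank's API.
#include <chrono>
#include <cstddef>

struct FlushPolicy {
    std::size_t maxMessages{128};            // flush at >= this many buffered messages
    std::size_t maxBytes{64 * 1024};         // ...or >= this many buffered bytes
    std::chrono::milliseconds maxDelay{50};  // ...or when the oldest buffered message is this old
};

struct PartitionBufferState {
    std::size_t bufferedMessages{0};
    std::size_t bufferedBytes{0};
    std::chrono::steady_clock::time_point firstBufferedAt{};

    bool shouldFlush(const FlushPolicy &p) const {
        if (!bufferedMessages)
            return false;
        return bufferedMessages >= p.maxMessages || bufferedBytes >= p.maxBytes ||
               std::chrono::steady_clock::now() - firstBufferedAt >= p.maxDelay;
    }
};
```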

Kafka is likely doing something similar, but I am not sure what the optimal semantics are. We will need to study Kafka's design, but here are a few questions that need to be answered:

  • How do we deal with incoming bundles that contain a compressed message set? Flush the existing buffer and restart buffering, or decompress the set and buffer its messages along with the rest?
  • When should the broker respond (generate an ack) to the clients? Immediately, or only once the messages the client produced, now buffered along with other messages, have all been flushed?

Some of the benefits:

  • Opportunities for compression: when the buffered messages are flushed, the broker can consider compressing them together. That may not be what we want, though, because we wouldn't want compression to block the thread; we could instead hand it off to another thread, and a few microseconds' worth of interop delay shouldn't mean much.
  • Far fewer write()/writev() system calls (see the sketch below).
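As a rough illustration of the second point, a flush of n buffered messages could become a single writev() instead of n write() calls. This is a hypothetical helper, not Tank code; a real implementation would chunk at IOV_MAX and handle partial writes, both of which this sketch omits:

```cpp
// Hypothetical helper: flush all buffered messages for a segment with one
// writev() call instead of one write() per message.
#include <sys/uio.h>

#include <string>
#include <vector>

ssize_t flushBuffered(int segmentFd, const std::vector<std::string> &buffered) {
    std::vector<iovec> iov;
    iov.reserve(buffered.size());
    for (const auto &msg : buffered)
        iov.push_back({const_cast<char *>(msg.data()), msg.size()});
    // One syscall for the whole batch.
    return writev(segmentFd, iov.data(), static_cast<int>(iov.size()));
}
```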
@markpapadakis (Member, Author):

Imagine having 1,000 different clients all consistently producing 1 message at a time (say, some 500 messages/second each) to the same Tank broker. The broker would need to execute over 500k write() calls per second (excluding any possible writes for updating indices) in order to write those messages to the current segment.

If, instead, we collected all writes destined for each distinct segment during the same poll events loop iteration, combined and applied them outside the loop, and then notified the clients, that would perhaps work great, and it shouldn't be that hard to implement.
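A sketch of that idea, assuming a single-threaded event loop and illustrative names throughout: produce payloads are staged per segment while ready sockets are drained, then each segment's batch is flushed and its clients are acked once per iteration.

```cpp
// Per-iteration batching sketch: one write() per segment per loop pass,
// instead of one write() per produced message.
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct PendingAck {
    int clientFd;
    std::uint64_t requestId;
};

struct SegmentBatch {
    std::string payload;              // concatenated messages for this segment
    std::vector<PendingAck> waiters;  // clients to ack once the flush succeeds
};

static std::map<int /*segmentFd*/, SegmentBatch> pendingBySegment;

static void sendAck(const PendingAck &ack) {
    // Stub: a real broker would write an ack response to ack.clientFd.
    (void)ack;
}

// Invoked while processing a produce request inside the event loop iteration.
void stageProduce(int segmentFd, const char *data, std::size_t len, PendingAck ack) {
    auto &batch = pendingBySegment[segmentFd];
    batch.payload.append(data, len);
    batch.waiters.push_back(ack);
}

// Invoked once per event loop iteration, after all ready events are handled.
void flushStaged() {
    for (auto &[segmentFd, batch] : pendingBySegment) {
        // Partial writes and errors are not handled in this sketch.
        const ssize_t n = write(segmentFd, batch.payload.data(), batch.payload.size());
        (void)n;
        for (const auto &w : batch.waiters)
            sendAck(w);
    }
    pendingBySegment.clear();
}
```

Note that acking only after the combined flush is one possible answer to the ack-semantics question raised above.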

@markpapadakis (Member, Author):

We should eventually support buffering published messages on the client, flushing them all as a single bundle when the buffered message count exceeds a threshold or too long has passed since the first message was buffered.

This should be configurable on a per-partition basis. It obviously means that if the application doesn't get to flush the data in time, the data will be lost. It would still be useful if you can tolerate that possibility and/or the thresholds are kept low.
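A minimal sketch of such a client-side buffer, assuming a hypothetical publishBundle() that encodes and ships the whole batch as one produce request; the count and age thresholds are the two knobs described above:

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class BufferingProducer {
public:
    BufferingProducer(std::size_t maxCount, std::chrono::milliseconds maxAge)
        : maxCount_{maxCount}, maxAge_{maxAge} {}

    void produce(std::string msg) {
        if (pending_.empty())
            firstBufferedAt_ = std::chrono::steady_clock::now();
        pending_.push_back(std::move(msg));
        maybeFlush();
    }

    // Flush when the batch is too large or its oldest message is too old.
    void maybeFlush() {
        if (pending_.empty())
            return;
        const bool tooMany = pending_.size() >= maxCount_;
        const bool tooOld =
            std::chrono::steady_clock::now() - firstBufferedAt_ >= maxAge_;
        if (tooMany || tooOld) {
            publishBundle(pending_);  // one bundle for the entire batch
            pending_.clear();
        }
    }

private:
    static void publishBundle(const std::vector<std::string> &msgs) {
        // Stub: a real client would send msgs to the broker as a single bundle.
        (void)msgs;
    }

    std::vector<std::string> pending_;
    std::chrono::steady_clock::time_point firstBufferedAt_{};
    std::size_t maxCount_;
    std::chrono::milliseconds maxAge_;
};
```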

Alternatively, you could, e.g., buffer the messages into a file and periodically flush that file to Tank.
