Consider batching incoming partition messages before flushing them together to partition segment #18

Open
markpapadakis opened this issue Jul 15, 2016 · 2 comments


markpapadakis commented Jul 15, 2016

Currently, the broker appends to the leader (itself, in standalone mode) immediately, but this may not be optimal. We should support tuning select topics so that the server buffers up to n messages and only flushes them to the segment once it has accumulated at least that many, or some time has elapsed since it began buffering; we may also want to support a size threshold.

This is particularly useful when you have many producers, each producing 1-2 messages at a time.
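A minimal sketch of what such a per-topic flush policy could look like; the names and default thresholds below are illustrative, not Tank's actual configuration:

```cpp
// Hypothetical per-topic flush policy: buffer appends and flush once any
// threshold is crossed. Names and defaults are illustrative, not Tank's API.
#include <chrono>
#include <cstddef>

struct FlushPolicy {
    std::size_t maxMessages{128};            // flush at >= this many buffered messages
    std::size_t maxBytes{64 * 1024};         // ...or >= this many buffered bytes
    std::chrono::milliseconds maxDelay{50};  // ...or when the oldest buffered message is this old
};

struct PartitionBufferState {
    std::size_t bufferedMessages{0};
    std::size_t bufferedBytes{0};
    std::chrono::steady_clock::time_point firstBufferedAt{};

    bool shouldFlush(const FlushPolicy &p) const {
        if (!bufferedMessages)
            return false;
        return bufferedMessages >= p.maxMessages || bufferedBytes >= p.maxBytes ||
               std::chrono::steady_clock::now() - firstBufferedAt >= p.maxDelay;
    }
};
```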

Kafka is likely doing something similar, but I am not sure what the optimal semantics are. We will need to study Kafka's design, but here are a few questions that need to be answered:

  • How do we deal with incoming bundles that contain a compressed message set? Flush the existing buffer and restart buffering, or decompress the set and buffer its messages along with the rest?
  • When should the broker respond (generate an ack) to the clients? Immediately, or only once the messages the client produced, now buffered along with other messages, have all been flushed?

Some of the benefits:

  • Opportunities for compression: when the buffered messages are flushed, the broker can consider compressing them together. That may not be what we want, though, because we wouldn't want compression to block the thread; we could instead hand it off to another thread, and a few microseconds' worth of interop delay shouldn't mean much.
  • Far fewer write()/writev() system calls (see the sketch below).
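As a rough illustration of the second point, a flush of n buffered messages could become a single writev() instead of n write() calls. This is a hypothetical helper, not Tank code; a real implementation would chunk at IOV_MAX and handle partial writes, both of which this sketch omits:

```cpp
// Hypothetical helper: flush all buffered messages for a segment with one
// writev() call instead of one write() per message.
#include <sys/uio.h>

#include <string>
#include <vector>

ssize_t flushBuffered(int segmentFd, const std::vector<std::string> &buffered) {
    std::vector<iovec> iov;
    iov.reserve(buffered.size());
    for (const auto &msg : buffered)
        iov.push_back({const_cast<char *>(msg.data()), msg.size()});
    // One syscall for the whole batch.
    return writev(segmentFd, iov.data(), static_cast<int>(iov.size()));
}
```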
@markpapadakis (Member, Author):

Imagine having 1,000 different clients all consistently producing 1 message at a time (say, some 500 messages/second each) to the same Tank broker. The broker would need to execute over 500k write() calls per second (excluding any possible writes for updating indices) in order to write those messages to the current segment.

If, instead, we collected all writes destined for each distinct segment during the same poll events loop iteration, combined and applied them outside the loop, and then notified the clients, that would perhaps work great, and it shouldn't be that hard to implement.
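A sketch of that idea, assuming a single-threaded event loop and illustrative names throughout: produce payloads are staged per segment while ready sockets are drained, then each segment's batch is flushed and its clients are acked once per iteration.

```cpp
// Per-iteration batching sketch: one write() per segment per loop pass,
// instead of one write() per produced message.
#include <unistd.h>

#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

struct PendingAck {
    int clientFd;
    std::uint64_t requestId;
};

struct SegmentBatch {
    std::string payload;              // concatenated messages for this segment
    std::vector<PendingAck> waiters;  // clients to ack once the flush succeeds
};

static std::map<int /*segmentFd*/, SegmentBatch> pendingBySegment;

static void sendAck(const PendingAck &ack) {
    // Stub: a real broker would write an ack response to ack.clientFd.
    (void)ack;
}

// Invoked while processing a produce request inside the event loop iteration.
void stageProduce(int segmentFd, const char *data, std::size_t len, PendingAck ack) {
    auto &batch = pendingBySegment[segmentFd];
    batch.payload.append(data, len);
    batch.waiters.push_back(ack);
}

// Invoked once per event loop iteration, after all ready events are handled.
void flushStaged() {
    for (auto &[segmentFd, batch] : pendingBySegment) {
        // Partial writes and errors are not handled in this sketch.
        const ssize_t n = write(segmentFd, batch.payload.data(), batch.payload.size());
        (void)n;
        for (const auto &w : batch.waiters)
            sendAck(w);
    }
    pendingBySegment.clear();
}
```

Note that acking only after the combined flush is one possible answer to the ack-semantics question raised above.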

@markpapadakis (Member, Author):

We should eventually support buffering published messages on the client, flushing them all as a single bundle when the buffered message count exceeds a threshold or too long has passed since the first message was buffered.

This should be configurable on a per-partition basis. It obviously means that if the application doesn't get to flush the data in time, the data will be lost. It would still be useful if you can tolerate that possibility and/or the thresholds are kept low.
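A minimal sketch of such a client-side buffer, assuming a hypothetical publishBundle() that encodes and ships the whole batch as one produce request; the count and age thresholds are the two knobs described above:

```cpp
#include <chrono>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

class BufferingProducer {
public:
    BufferingProducer(std::size_t maxCount, std::chrono::milliseconds maxAge)
        : maxCount_{maxCount}, maxAge_{maxAge} {}

    void produce(std::string msg) {
        if (pending_.empty())
            firstBufferedAt_ = std::chrono::steady_clock::now();
        pending_.push_back(std::move(msg));
        maybeFlush();
    }

    // Flush when the batch is too large or its oldest message is too old.
    void maybeFlush() {
        if (pending_.empty())
            return;
        const bool tooMany = pending_.size() >= maxCount_;
        const bool tooOld =
            std::chrono::steady_clock::now() - firstBufferedAt_ >= maxAge_;
        if (tooMany || tooOld) {
            publishBundle(pending_);  // one bundle for the entire batch
            pending_.clear();
        }
    }

private:
    static void publishBundle(const std::vector<std::string> &msgs) {
        // Stub: a real client would send msgs to the broker as a single bundle.
        (void)msgs;
    }

    std::vector<std::string> pending_;
    std::chrono::steady_clock::time_point firstBufferedAt_{};
    std::size_t maxCount_;
    std::chrono::milliseconds maxAge_;
};
```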

Alternatively, you could, e.g., buffer the messages into a file and periodically flush that file to Tank.
