Motivation
A client writes log entries by calling the Append RPC. The request message of the Append RPC can contain a batch of log entries, so many logs can be written at once, reducing network RTTs. Batching log entries also makes the log-appending pipeline in a storage node efficient, since the storage layer in the storage node stores a set of log entries as a single batch rather than one by one.
It works well if a client writes many log entries in a single request. For example, an append request can write four logs at once, as shown in the figure above.
However, when an append request contains only a few log entries, for instance a single log entry, the storage node cannot benefit from the batch write of the storage layer. In the current implementation, the storage node feeds only the log entries contained in a single request into the log-appending pipeline.
We introduce the accumulator, which collects log entries in front of the sequencer and forms a batch.
Design
A goroutine executes the accumulator. It has a queue to receive log entries from the Append RPC handlers.
The back-of-the-envelope design of the accumulator looks like this (a sketch follows the parameter list below). It has two tunable parameters:
MinAccumulateInterval: The minimum interval between sends of accumulated log entries to the sequencer. The accumulator keeps log entries from clients until this interval expires, unless the number of retained log entries exceeds MaxAccumulateSize.
MaxAccumulateSize: The maximum number of log entries kept by the accumulator. If the number of retained log entries reaches MaxAccumulateSize, the accumulator sends the log entries to the sequencer and resets the timer for MinAccumulateInterval.
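To make the control flow concrete, here is a minimal sketch of that loop in Go. All names (appendTask, sequencer, the field names) are assumptions made for illustration, not identifiers from the actual codebase.

```go
package accumulator // hypothetical package name, for illustration only

import (
	"context"
	"time"
)

// appendTask is a hypothetical unit handed over by an Append RPC handler:
// one log entry plus a way to report the result back to the waiting RPC.
type appendTask struct {
	data []byte
	done chan error
}

// sequencer is a hypothetical handle to the next stage of the pipeline.
type sequencer interface {
	send(tasks []appendTask)
}

// accumulator collects log entries from Append RPC handlers and forwards
// them to the sequencer when either MaxAccumulateSize entries are retained
// or MinAccumulateInterval expires, whichever comes first.
type accumulator struct {
	queue   chan appendTask // filled by Append RPC handlers
	pending []appendTask    // log entries retained so far

	minAccumulateInterval time.Duration
	maxAccumulateSize     int

	seq sequencer
}

// run is executed by a single goroutine.
func (a *accumulator) run(ctx context.Context) {
	timer := time.NewTimer(a.minAccumulateInterval)
	defer timer.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case task := <-a.queue:
			a.pending = append(a.pending, task)
			// Reaching MaxAccumulateSize flushes immediately.
			if len(a.pending) >= a.maxAccumulateSize {
				a.flush(timer)
			}
		case <-timer.C:
			// MinAccumulateInterval expired: flush whatever is retained.
			if len(a.pending) > 0 {
				a.flush(timer)
			} else {
				timer.Reset(a.minAccumulateInterval)
			}
		}
	}
}

// flush hands the retained log entries to the sequencer and resets the
// timer for MinAccumulateInterval.
func (a *accumulator) flush(timer *time.Timer) {
	a.seq.send(a.pending)
	// Start a fresh buffer; a pooled buffer could be reused here instead.
	a.pending = make([]appendTask, 0, a.maxAccumulateSize)
	if !timer.Stop() {
		select {
		case <-timer.C: // drain if the timer already fired
		default:
		}
	}
	timer.Reset(a.minAccumulateInterval)
}
```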
The values used to retain log entries can be pooled. Since the number of log entries to keep is bounded, they are easy to pool.
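For illustration, a sync.Pool of fixed-capacity buffers could be used, building on the hypothetical appendTask type from the sketch above; the pool name and the default size are also assumptions.

```go
// taskSlicePool reuses the fixed-capacity buffers that retain log entries.
// Because every buffer has the same capacity, the pooled values are uniform
// and reuse is straightforward. (Requires "sync" in the imports above.)
const maxAccumulateSizeDefault = 128 // hypothetical default

var taskSlicePool = sync.Pool{
	New: func() interface{} {
		s := make([]appendTask, 0, maxAccumulateSizeDefault)
		return &s
	},
}

func getTaskSlice() *[]appendTask {
	return taskSlicePool.Get().(*[]appendTask)
}

func putTaskSlice(s *[]appendTask) {
	*s = (*s)[:0] // keep capacity, drop the retained entries
	taskSlicePool.Put(s)
}
```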
Currently, the Append RPC handler builds the write batch to be stored in the storage layer. Since the handlers run concurrently, building the write batch is also done concurrently.
However, the accumulator has to build the write batch instead of the Append RPC handler. If this causes head-of-line blocking, we can use an optimization such as double buffering.
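A rough sketch of what double buffering could look like here, again with the hypothetical appendTask type and invented names: the accumulator keeps filling one buffer while the other is being built into a write batch.

```go
// doubleBuffer keeps two buffers: the accumulator appends incoming log
// entries to fill while build is being encoded into a write batch, so
// batch construction does not block newly arriving appends.
type doubleBuffer struct {
	fill  []appendTask
	build []appendTask
}

// swapForBuild is called at flush time: the filled buffer becomes the one
// to build a write batch from, and the emptied other buffer starts
// receiving new log entries. The caller must ensure the previous batch
// build has finished before its buffer is reused.
func (d *doubleBuffer) swapForBuild() []appendTask {
	d.fill, d.build = d.build[:0], d.fill
	return d.build
}
```

In this sketch, the accumulator loop would keep appending to fill and hand the slice returned by swapForBuild to a worker goroutine that builds the write batch and pushes it into the log-appending pipeline.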
Challenges
Performance testing
Finding good parameters.