This repository has been archived by the owner on Jan 24, 2024. It is now read-only.
Use pooled direct memory allocator when decoding Pulsar entry to Kafka records #673
Motivation

When a Pulsar entry is decoded to Kafka records in `ByteBufUtils#decodePulsarEntryToKafkaRecords`, a NIO buffer with an initial capacity of 1 MB is allocated from heap memory. Each entry read therefore allocates 1 MB of heap memory, so heap usage grows very quickly and GC runs frequently.

Kafka's `MemoryRecordsBuilder` uses its underlying `ByteBufferOutputStream` field as the internal buffer, whose capacity can be increased in the `write` method. Even if a direct buffer is allocated by Netty's pooled direct memory allocator and its underlying `ByteBuffer` is passed to the `ByteBufferOutputStream` constructor, the new buffer can still be allocated from heap memory when a reallocation happens.

Modification
This PR adds a `DirectBufferOutputStream` class that inherits from `ByteBufferOutputStream` and overrides some methods that can be called in `MemoryRecordsBuilder`. This class uses Pulsar's default `ByteBufAllocator` to allocate memory. The other methods behave the same as in `ByteBufferOutputStream`.

A unit test is added to verify that `MemoryRecordsBuilder` builds the same records whether the underlying `ByteBufferOutputStream` is a `ByteBufferOutputStream` or a `DirectBufferOutputStream`. Three cases are tested in this test:

- The `position(int)` method is called to increase the capacity.
- The `write()` method increases the capacity automatically.

Then, a `DirectBufferOutputStream` instance is passed to the `MemoryRecordsBuilder` constructor in `ByteBufUtils#decodePulsarEntryToKafkaRecords`, and the return type is changed to `DecodeResult` because the `ByteBuf` needs to be released later.
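
The reallocation problem and the idea behind the fix can be illustrated with a minimal, standard-library-only sketch. It is not the code from this PR: `GrowableBufferStream` is a hypothetical class, and `ByteBuffer.allocateDirect` stands in for Netty's pooled direct allocator (so nothing is pooled or released here). It only shows that a stream started with a direct buffer ends up on the heap after growing unless the growth path also uses a direct allocator.

```java
import java.io.OutputStream;
import java.nio.ByteBuffer;
import java.util.function.IntFunction;

// Hypothetical growable stream; growth uses a caller-supplied allocator.
class GrowableBufferStream extends OutputStream {
    private final IntFunction<ByteBuffer> growthAllocator;
    private ByteBuffer buffer;

    GrowableBufferStream(ByteBuffer initial, IntFunction<ByteBuffer> growthAllocator) {
        this.buffer = initial;
        this.growthAllocator = growthAllocator;
    }

    @Override
    public void write(int b) {
        ensureRemaining(1);
        buffer.put((byte) b);
    }

    @Override
    public void write(byte[] bytes, int off, int len) {
        ensureRemaining(len);
        buffer.put(bytes, off, len);
    }

    // Moves the write position, growing the buffer first if it is too small.
    void position(int newPosition) {
        ensureRemaining(newPosition - buffer.position());
        buffer.position(newPosition);
    }

    ByteBuffer buffer() {
        return buffer;
    }

    // Reallocates through growthAllocator and copies the bytes written so far.
    private void ensureRemaining(int required) {
        if (required <= buffer.remaining()) {
            return;
        }
        int newCapacity = Math.max(buffer.capacity() * 2, buffer.position() + required);
        ByteBuffer grown = growthAllocator.apply(newCapacity);
        buffer.flip();
        grown.put(buffer);
        buffer = grown;
    }

    public static void main(String[] args) {
        byte[] payload = new byte[64];

        // Direct initial buffer, but growth falls back to the heap.
        GrowableBufferStream heapGrowth =
                new GrowableBufferStream(ByteBuffer.allocateDirect(16), ByteBuffer::allocate);
        heapGrowth.write(payload, 0, payload.length);
        System.out.println("heap growth, direct? " + heapGrowth.buffer().isDirect());

        // Direct initial buffer and direct growth.
        GrowableBufferStream directGrowth =
                new GrowableBufferStream(ByteBuffer.allocateDirect(16), ByteBuffer::allocateDirect);
        directGrowth.write(payload, 0, payload.length);
        System.out.println("direct growth, direct? " + directGrowth.buffer().isDirect());
    }
}
```

With a pooled allocator, the buffer that is replaced during growth and the final buffer both have to be released explicitly, which this sketch does not model.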