fetching: export utilities for decompressing and parsing partition fetch responses #4

dimitarvdimitrov · 2024-09-30T15:11:56Z

This pulls in twmb#803. There were conflicts with #1 in the signature of `ProcessRespPartition`. After chatting with @ortuman, we agreed to keep the field private for now.

…tch responses

### Background

In grafana/mimir we are working towards making fetch requests ourselves. The primary reason is that individual requests to the Kafka backend are slow, so issuing them sequentially per partition becomes the bottleneck in our application. We want to fetch records in parallel to speed up consumption.

One difficulty I ran into when issuing `FetchRequest`s ourselves is that parsing the response is non-trivial. That's why I'm proposing to export these functions for downstream projects to use.

Alternatively, I can try contributing the concurrent fetching logic. However, I believe that is much more nuanced and comes with more tradeoffs around fetched bytes and latency, so I wasn't sure whether it's a good fit for a general-purpose library. I'm open to discussing this further.

### What this PR does

Moves `(*kgo.cursorOffsetNext).processRespPartition` from being a method to being a standalone function, `kgo.processRespPartition`. A few small changes were also necessary to make the interface suitable for public use (like removing the `*broker` parameter).

### Side effects

To minimize the necessary changes and the API surface of the package, I opted to use a single global decompressor for all messages. Previously, there was one decompressor per client, and that decompressor was passed down to `(*cursorOffsetNext).processRespPartition`.

My understanding is that sharing the pooled readers (lz4, zstd, gzip) across clients shouldn't have a negative impact on performance, because usage patterns do not affect the behaviour of a reader (for example, a consistent size of decompressed data doesn't make the reader more or less efficient). I have not thoroughly verified or tested this; let me know if you think that's important.

An alternative is to also export the `decompressor` along with `newDecompressor()` and the auxiliary types for decompression.
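To make the "single global decompressor" idea under Side effects more concrete, here is a minimal sketch of a package-level pool of readers shared by all callers. It is not the code from this PR: it covers only gzip, uses the standard library's `compress/gzip` and `sync.Pool`, and the names `gzipReaders`, `decompressGzip`, and `compressGzip` are hypothetical.

```go
package main

import (
	"bytes"
	"compress/gzip"
	"fmt"
	"io"
	"sync"
)

// gzipReaders is a process-wide pool of gzip readers. Every caller shares it,
// instead of each client owning its own decompressor.
// Hypothetical name; this is an illustration, not the library's internals.
var gzipReaders = sync.Pool{
	New: func() any { return new(gzip.Reader) },
}

// decompressGzip inflates src using a reader borrowed from the shared pool.
func decompressGzip(src []byte) ([]byte, error) {
	zr := gzipReaders.Get().(*gzip.Reader)
	// Return the reader to the pool once the data has been read.
	defer gzipReaders.Put(zr)

	// Reset discards any state left over from the previous user of the reader.
	if err := zr.Reset(bytes.NewReader(src)); err != nil {
		return nil, err
	}
	return io.ReadAll(zr)
}

// compressGzip is a helper used only to build input for the demo below.
func compressGzip(src []byte) ([]byte, error) {
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(src); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return buf.Bytes(), nil
}

func main() {
	payload := []byte("record batch payload")
	compressed, err := compressGzip(payload)
	if err != nil {
		panic(err)
	}

	// Several goroutines, standing in for parallel partition fetches,
	// decompress concurrently through the same global pool.
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			out, err := decompressGzip(compressed)
			if err != nil || !bytes.Equal(out, payload) {
				panic(fmt.Sprintf("worker %d: bad round trip: %v", id, err))
			}
		}(i)
	}
	wg.Wait()
	fmt.Println("all workers decompressed the payload")
}
```

Because `Reset` clears a reader's state before each use, which caller last used a pooled reader should not matter for correctness. Whether it matters for performance is the open question raised above.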


* Restore multiline processV0OuterMessage
* `*kgo.Records` pooling support

  Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>
* Merge pull request #1 from grafana/ortuman/reduce-kgo-record-alloc

  `*kgo.Record` pooling support
* fetching: export utilities for decompressing and parsing partition fetch responses
* Merge pull request #4 from dimitarvdimitrov/dimitar/grafana-master-with-export-partition-parsing-utils

  fetching: export utilities for decompressing and parsing partition fetch responses
* Merge pull request #3 from ortuman/reduce-decompression-buffer-allocations

  Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>

---------

Signed-off-by: Miguel Ángel Ortuño <ortuman@gmail.com>
Co-authored-by: Dimitar Dimitrov <dimitar.dimitrov@grafana.com>

dimitarvdimitrov added 3 commits September 4, 2024 16:55

Restore multiline processV0OuterMessage

ceadc28

fetching: export utilities for decompressing and parsing partition re…

df26a76

…tch responses

dimitarvdimitrov changed the title ~~Dimitar/grafana master with export partition parsing utils~~ fetching: export utilities for decompressing and parsing partition fetch responses Sep 30, 2024

ortuman approved these changes Sep 30, 2024


dimitarvdimitrov mentioned this pull request Sep 30, 2024

kgo: reduce allocations when processing batches #3

Merged

dimitarvdimitrov merged commit ec37e4b into grafana:master Sep 30, 2024
1 check passed

ortuman pushed a commit that referenced this pull request Oct 3, 2024

Merge pull request #4 from dimitarvdimitrov/dimitar/grafana-master-wi…

bf66f21

…th-export-partition-parsing-utils fetching: export utilities for decompressing and parsing partition fetch responses

ortuman mentioned this pull request Oct 3, 2024

use grafana/franz-go fork grafana/mimir#9511

Merged

4 tasks
