feat(derive): New BatchStream Stage for Holocene #566

Merged
merged 2 commits into from
Sep 25, 2024

Collaborator

@refcell refcell commented Sep 24, 2024

Description

Adds a new BatchStream stage for holocene.

Makes progress towards #559

Collaborator Author

refcell commented Sep 24, 2024

@refcell refcell changed the title feat(derive): new batch span stage for holocene feat(derive): New BatchSpan Stage for Holocene Sep 24, 2024
@refcell refcell requested a review from clabby September 24, 2024 18:56
@refcell refcell added H-holocene Hardfork: Holocene related A-derive Area: kona-derive crate K-feature Kind: feature labels Sep 24, 2024 — with Graphite App
@refcell refcell marked this pull request as ready for review September 24, 2024 18:56

codecov bot commented Sep 24, 2024

Codecov Report

Attention: Patch coverage is 54.09836% with 28 lines in your changes missing coverage. Please review.

Project coverage is 78.4%. Comparing base (b2f114a) to head (52962be).
Report is 1 commits behind head on main.

✅ All tests successful. No failed tests found.

Files with missing lines | Patch % | Lines
crates/derive/src/stages/batch_stream.rs | 54.0% | 28 Missing ⚠️

Collaborator

@clabby clabby left a comment

This is a good start. Small preference for BatchStream rather than BatchSpan, as a dumb naming nit.

I think there are a few things we need to think about architecturally here, though. The way I think about this is that we need to be able to stream single batches from the span batch into the BatchQueue, one at a time. This means that, from Holocene forward, every batch sent into the BatchQueue is a SingleBatch. We need to retain next_batch(...) -> PipelineResult<Batch> to keep backwards compatibility, but the new stage should enforce that every batch it yields post-Holocene is a SingleBatch.

What this looks like in my head is something like:
[Image: Untitled-2022-11-03-2344]

The things I'm trying to optimize for here are:

  • The BatchQueue currently performs the check_batch call. But it does it on Batch, which wraps both SingleBatch + SpanBatch. After holocene, basically, the BatchQueue should never receive a SpanBatch. This way, we can do the span batch prefix check in the BatchStream, and just reuse the SingleBatch check in the BQ.
  • The BatchStream stage basically acts as a buffer of SingleBatches, derived from a SpanBatch. So it holds all of the SingleBatches from a span batch in-memory, and gets drained by the BatchQueue as SingleBatches are read. It can eventually land on Eof, signaling that it needs to fetch a new batch from the ChannelReader.
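The buffering behavior described in the second bullet could be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SingleBatch`, `SpanBatch`, `StageError`, and `BatchBuffer` are simplified, hypothetical stand-ins, and the span batch is assumed to already hold its single batches rather than deriving them.

```rust
use std::collections::VecDeque;

/// Hypothetical stand-in for a single batch; the real type carries more fields.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SingleBatch {
    pub timestamp: u64,
}

/// Hypothetical stand-in for a span batch: a run of single batches sent together.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpanBatch {
    pub batches: Vec<SingleBatch>,
}

/// Hypothetical error type; `Eof` signals "buffer drained, fetch a new batch".
#[derive(Debug, PartialEq, Eq)]
pub enum StageError {
    Eof,
}

/// In-memory buffer of single batches derived from one span batch.
#[derive(Debug, Default)]
pub struct BatchBuffer {
    buffer: VecDeque<SingleBatch>,
}

impl BatchBuffer {
    /// Queues every single batch of the span batch, preserving order.
    pub fn load(&mut self, span: SpanBatch) {
        self.buffer.extend(span.batches);
    }

    /// Hands out the next buffered single batch, or `Eof` once drained.
    pub fn next_single(&mut self) -> Result<SingleBatch, StageError> {
        self.buffer.pop_front().ok_or(StageError::Eof)
    }
}
```

In this sketch the consumer (the BatchQueue in the comment above) would call `next_single` repeatedly and treat `Eof` as the cue to pull the next batch from upstream.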

Notably, this means that the stage will always be present - no need for an active flag. Pre-Holocene, it just forwards batches from the ChannelReader directly. Otherwise, it acts as an in-memory buffer of SingleBatches, and also owns the job of validating the span batch's prefix per the spec.
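One way to picture an always-present stage is the synchronous sketch below. It is an illustration under stated assumptions, not the implementation merged here: the `BatchProvider` trait, the `holocene_time` field, and the `origin_time` parameter are hypothetical, the types are simplified stand-ins, and the span batch prefix check is only marked with a comment.

```rust
use std::collections::VecDeque;

// Simplified, hypothetical stand-ins for the real batch types.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SingleBatch {
    pub timestamp: u64,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub struct SpanBatch {
    pub batches: Vec<SingleBatch>,
}

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum Batch {
    Single(SingleBatch),
    Span(SpanBatch),
}

#[derive(Debug, PartialEq, Eq)]
pub enum StageError {
    Eof,
}

pub type PipelineResult<T> = Result<T, StageError>;

/// Hypothetical upstream stage (the ChannelReader in the comment above).
pub trait BatchProvider {
    fn next_batch(&mut self) -> PipelineResult<Batch>;
}

pub struct BatchStream<P: BatchProvider> {
    prev: P,
    /// Assumed activation timestamp; `None` means Holocene is never active.
    holocene_time: Option<u64>,
    buffer: VecDeque<SingleBatch>,
}

impl<P: BatchProvider> BatchStream<P> {
    pub fn new(prev: P, holocene_time: Option<u64>) -> Self {
        Self { prev, holocene_time, buffer: VecDeque::new() }
    }

    fn is_holocene_active(&self, timestamp: u64) -> bool {
        self.holocene_time.map_or(false, |t| timestamp >= t)
    }

    /// Keeps the `next_batch(...) -> PipelineResult<Batch>` shape, but only
    /// ever yields `Batch::Single` once Holocene is active.
    pub fn next_batch(&mut self, origin_time: u64) -> PipelineResult<Batch> {
        if !self.is_holocene_active(origin_time) {
            // Pre-Holocene: forward whatever the previous stage produced.
            return self.prev.next_batch();
        }
        loop {
            if let Some(single) = self.buffer.pop_front() {
                return Ok(Batch::Single(single));
            }
            // Buffer drained: pull the next batch, propagating `Eof`.
            match self.prev.next_batch()? {
                Batch::Single(single) => return Ok(Batch::Single(single)),
                Batch::Span(span) => {
                    // The span batch prefix check would run here (omitted).
                    self.buffer.extend(span.batches);
                }
            }
        }
    }
}
```

The loop lets an empty span batch fall through to the next upstream batch instead of returning nothing, which is a design choice of this sketch rather than something stated in the review.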

@refcell refcell force-pushed the rf/feat/holocene-batch-span-stage branch from 65fbf51 to 5e95560 Compare September 25, 2024 02:44
@refcell refcell changed the title feat(derive): New BatchSpan Stage for Holocene feat(derive): New BatchStream Stage for Holocene Sep 25, 2024
@refcell refcell requested a review from clabby September 25, 2024 02:46
Collaborator

@clabby clabby left a comment

Good start! Let's get this in and keep iterating.

crates/derive/src/stages/batch_stream.rs (outdated, resolved)
@refcell refcell force-pushed the rf/feat/holocene-batch-span-stage branch from 35ee4d2 to 0cec9fe Compare September 25, 2024 22:43
@refcell refcell added this pull request to the merge queue Sep 25, 2024
Merged via the queue into main with commit 2cb8d4a Sep 25, 2024
16 of 17 checks passed
@github-actions github-actions bot mentioned this pull request Sep 25, 2024
refcell added a commit that referenced this pull request Sep 26, 2024
* feat(derive): new batch span stage for holocene

* small fix
This was referenced Sep 27, 2024
@github-actions github-actions bot mentioned this pull request Oct 5, 2024
Labels
A-derive Area: kona-derive crate H-holocene Hardfork: Holocene related K-feature Kind: feature
2 participants