-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(derive): New BatchStream Stage for Holocene #566
Conversation
This stack of pull requests is managed by Graphite. Learn more about stacking. |
Codecov ReportAttention: Patch coverage is
✅ All tests successful. No failed tests found.
Additional details and impacted files☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a good start. Small preference for BatchStream
rather than BatchSpan
, as a dumb naming nit.
I think there's a few things we need to think about architecturally here, though. The way I think about this is that we need to be able to stream single batches from the span batch into the BatchQueue
, one at a time. This means that, from holocene forward, every batch sent into the BatchQueue
is a SingleBatch
. We need to retain next_batch(...) -> PipelineResult<Batch>
, to keep that backwards compatibility, but the new stage should enforce that.
What this looks like in my head is something like:
The things I'm trying to optimize for here is:
- The
BatchQueue
currently performs thecheck_batch
call. But it does it onBatch
, which wraps bothSingleBatch
+SpanBatch
. After holocene, basically, theBatchQueue
should never receive aSpanBatch
. This way, we can do the span batch prefix check in theBatchStream
, and just reuse theSingleBatch
check in the BQ. - The
BatchStream
stage basically acts as a buffer ofSingleBatch
es, derived from aSpanBatch
. So it holds all of theSingleBatch
es from a span batch in-memory, and gets drained by theBatchQueue
asSingleBatch
es are read. It can eventually land onEof
, signaling that it needs to fetch a new batch from theChannelReader
.
Notably this means that the stage will always be present - no need for a active
flag, just forwards batches from the ChannelReader
directly pre-holocene. Otherwise, it acts as an in-memory buffer of SingleBatch
es, and also owns the job of validating the span batch's prefix per the spec.
65fbf51
to
5e95560
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good start! Let's get this in and keep iterating.
35ee4d2
to
0cec9fe
Compare
* feat(derive): new batch span stage for holocene * small fix
Description
Adds a new
BatchStream
stage for holocene.Makes progress towards #559