Create a buffered wrapper around BytesStream #1501
This is driven by behavior we observe when Report Collector sends bytes to individual shards. It writes data as soon as it becomes available, and Hyper does not accumulate it before sending. On the receiving side we see chunks of size 1, which causes thrashing on both the sender and the receiver. This change paves the way for buffering on the RC side.
Codecov Report
Attention: Patch coverage is
Coverage Diff
             main     #1501     +/-
Coverage   93.02%    93.24%   +0.22%
Files         237       238       +1
Lines       43535     43688     +153
Hits        40498     40739     +241
Misses       3037      2949      -88
// verify_success(infallible_stream(12, 5), 12).await;
// verify_success(infallible_stream(12, 12), 12).await;
// verify_success(infallible_stream(24, 12), 12).await;
// verify_success(infallible_stream(24, 12), 1).await;
nit: clean these up
Good catch, I want to bring them back.
/// done. This may need to be used when writing into HTTP streams as Hyper
/// does not provide any buffering functionality and we turn NODELAY on
#[pin_project]
pub struct BufferedBytesStream<S> {
A few nits:
To me, BufferedBytesStream doesn't tell why this stream is special. I think the key thing is that the polled output is "chunked". I would rather call this BufferedChunkedStream.
Why not have the trait bound S: BytesStream here as well? Thinking more about documentation/readability.
Bounds need to be repeated everywhere if they are put on the struct. Generally we try to avoid that when it isn't necessary.
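As an illustration of the point about bounds, here is a minimal sketch. `ByteSource`, `Buffered`, and `Once` are hypothetical stand-ins invented for this example, not the crate's `BytesStream` or the struct under review. The struct itself carries no bound, so only the `impl` block that actually drives the inner source has to name the trait.

```rust
// Hypothetical stand-in for a byte-producing trait (an assumption for this sketch).
trait ByteSource {
    fn next_chunk(&mut self) -> Option<Vec<u8>>;
}

// No bound on the struct: it can be named, stored, and constructed
// without restating `S: ByteSource` at every mention.
struct Buffered<S> {
    inner: S,
    buf: Vec<u8>,
}

// Methods that never touch the inner source need no bound at all.
impl<S> Buffered<S> {
    fn new(inner: S) -> Self {
        Self { inner, buf: Vec::new() }
    }

    fn buffered_len(&self) -> usize {
        self.buf.len()
    }
}

// The bound appears only where the behavior requires it.
impl<S: ByteSource> Buffered<S> {
    // Pulls one item from the inner source into the buffer.
    // Returns false once the source is exhausted.
    fn fill(&mut self) -> bool {
        match self.inner.next_chunk() {
            Some(bytes) => {
                self.buf.extend_from_slice(&bytes);
                true
            }
            None => false,
        }
    }
}

// Hypothetical source that yields a single item and then ends.
struct Once(Option<Vec<u8>>);

impl ByteSource for Once {
    fn next_chunk(&mut self) -> Option<Vec<u8>> {
        self.0.take()
    }
}
```

Had the bound been written as `struct Buffered<S: ByteSource>`, the first `impl<S>` block would no longer compile without repeating it, which is the repetition the reply refers to.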
This does not attempt to chunk the inner stream. Its whole purpose is to accumulate enough bytes (buffer) before sending them down for processing.
But doesn't the output get chunked? This adapter is buffering and chunking... another name would be paginating.
I still think BufferedBytesStream is vague. My 2 cents.
/// Number of bytes released per single poll.
/// All items except the last one are guaranteed to have
/// exactly this number of bytes written to them.
sz: usize,
nit: chunk_size?
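To make the contract in the `sz` doc comment concrete, here is a rough synchronous sketch of the same idea. It is not the PR's implementation: the real type is an async stream built with `pin_project`, while this version uses a plain `Iterator` over `Vec<u8>` so that it runs with the standard library alone. `Rechunk` and its field names are assumptions made for the example.

```rust
// Synchronous stand-in for a buffering adapter (hypothetical, std-only).
struct Rechunk<I> {
    inner: I,
    buf: Vec<u8>,
    chunk_size: usize,
    done: bool,
}

impl<I> Rechunk<I> {
    fn new(inner: I, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk size must be non-zero");
        Self {
            inner,
            buf: Vec::with_capacity(chunk_size),
            chunk_size,
            done: false,
        }
    }
}

impl<I: Iterator<Item = Vec<u8>>> Iterator for Rechunk<I> {
    type Item = Vec<u8>;

    fn next(&mut self) -> Option<Vec<u8>> {
        // Accumulate until a full chunk is buffered or the source ends.
        while self.buf.len() < self.chunk_size && !self.done {
            match self.inner.next() {
                Some(bytes) => self.buf.extend_from_slice(&bytes),
                None => self.done = true,
            }
        }

        if self.buf.len() >= self.chunk_size {
            // Release exactly `chunk_size` bytes and keep the remainder buffered.
            let rest = self.buf.split_off(self.chunk_size);
            Some(std::mem::replace(&mut self.buf, rest))
        } else if self.buf.is_empty() {
            None
        } else {
            // Source is exhausted: the final item may be shorter.
            Some(std::mem::take(&mut self.buf))
        }
    }
}
```

Feeding it 24 one-byte items with a chunk size of 5 yields items of 5, 5, 5, 5 and 4 bytes, which matches the "all items except the last one" wording in the doc comment.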
Continuation of private-attribution#1501. We want to avoid submitting reports one by one from RC to each individual shard. That likely leads to fragmentation, and we have been observing slow execution on the client side. This sets the buffer size to be divisible by the TCP MSS, but I don't have any real evidence that this will work well; we would need to experiment with it.
* Use stream buffering in report collector
* Use 8Kb buffers for streaming inside report collector
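For reference, picking a buffer size that is divisible by the TCP MSS comes down to simple arithmetic. The sketch below assumes an MSS of 1460 bytes, the usual value for IPv4 over a 1500-byte MTU with no TCP options; the actual MSS on a given path may differ, and `ASSUMED_MSS` and `mss_aligned` are names invented for this example, not code from the PR.

```rust
// Assumed MSS: 1500-byte MTU minus 20 bytes of IPv4 header and
// 20 bytes of TCP header. This is an assumption, not a measured value.
const ASSUMED_MSS: usize = 1460;

// Largest multiple of `mss` that fits in `budget` bytes,
// but never less than one full segment.
fn mss_aligned(budget: usize, mss: usize) -> usize {
    assert!(mss > 0, "MSS must be non-zero");
    std::cmp::max(budget / mss, 1) * mss
}

// Bytes left over in the last segment when `size` is not a multiple of `mss`.
fn tail_bytes(size: usize, mss: usize) -> usize {
    size % mss
}
```

Under that assumption an 8192-byte budget rounds down to 7300 bytes (five full segments), while a flat 8192-byte buffer leaves an 892-byte tail segment. Whether that difference matters in practice is the part that would need experimenting.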