Use stream buffering in report collector #1504

akoshelev · 2024-12-17T07:53:58Z

Continuation of #1501. We want to avoid submitting reports one by one from the report collector (RC) to each individual shard. That likely leads to fragmentation, and we've been observing slow execution on the client side.

This sets the buffer size to a multiple of the TCP MSS, but I don't have any real evidence that this is going to work well. We would need to experiment with it.
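
To make the idea concrete, here is a minimal sketch of regrouping many small report payloads into fixed-size chunks before they are sent. The `coalesce` function and its signature are hypothetical; this is not the `BufferedBytesStream` from this PR, which presumably operates on an async stream. Plain iterators are used here only to keep the sketch dependency-free.

```rust
use std::num::NonZeroUsize;

/// Hypothetical helper (an assumption, not code from this PR): regroups many
/// small byte chunks into chunks of exactly `buf_size` bytes. Only the last
/// chunk may be shorter.
fn coalesce<I>(reports: I, buf_size: NonZeroUsize) -> Vec<Vec<u8>>
where
    I: IntoIterator<Item = Vec<u8>>,
{
    let cap = buf_size.get();
    let mut out = Vec::new();
    let mut buf: Vec<u8> = Vec::with_capacity(cap);

    for report in reports {
        let mut rest = report.as_slice();
        while !rest.is_empty() {
            // Copy as much of this report as still fits into the buffer.
            let take = (cap - buf.len()).min(rest.len());
            buf.extend_from_slice(&rest[..take]);
            rest = &rest[take..];

            if buf.len() == cap {
                // The buffer is full: hand it off and start a fresh one.
                out.push(std::mem::replace(&mut buf, Vec::with_capacity(cap)));
            }
        }
    }

    if !buf.is_empty() {
        // Flush the partially filled tail.
        out.push(buf);
    }
    out
}
```

With this shape, the sender issues one write per full buffer instead of one write per report, at the cost of copying each report once.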

akoshelev · 2024-12-17T07:54:35Z

ipa-core/src/bin/report_collector.rs

@@ -430,6 +431,9 @@ async fn hybrid(
    count: usize,
    set_fixed_polling_ms: Option<u64>,
 ) -> Result<(), Box<dyn Error>> {
+    // twice the size of TCP MSS. This may get messed up if TCP options are used which is not
+    // in our control, but hopefully fragmentation is not too bad
+    const BUF_SIZE: NonZeroUsize = NonZeroUsize::new(1072).unwrap();


@andyleiserson do you have any suggestions for picking the buffer size?

andyleiserson · 2024-12-17T19:11:28Z

The tokio and std buffered I/O helpers use 8 kB, so maybe use that? I don't think it's necessary to try to match this to the TCP MSS. I might not even expose the buffer size from BufferedBytesStream until we identify a need to tune it for individual uses.

I think what's important is that the kernel has at least a full packet's worth of data, so that it can send full packets.
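
As a rough illustration of the point about buffer size, the sketch below pushes small writes through a `std::io::BufWriter` with an 8 KiB capacity and records what actually reaches the underlying writer. `CountingSink` and `buffered_write_sizes` are hypothetical names introduced for this example; the sink only stands in for a socket and says nothing about how the kernel segments data.

```rust
use std::io::{self, BufWriter, Write};

/// Hypothetical sink standing in for a socket (an assumption for this
/// sketch): it records the size of every write it receives.
#[derive(Default)]
struct CountingSink {
    write_sizes: Vec<usize>,
}

impl Write for CountingSink {
    fn write(&mut self, data: &[u8]) -> io::Result<usize> {
        self.write_sizes.push(data.len());
        Ok(data.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes `count` reports of `report_len` bytes each through an 8 KiB
/// `BufWriter` and returns the sizes of the writes that reached the sink.
fn buffered_write_sizes(count: usize, report_len: usize) -> io::Result<Vec<usize>> {
    let mut writer = BufWriter::with_capacity(8 * 1024, CountingSink::default());
    let report = vec![0u8; report_len];
    for _ in 0..count {
        writer.write_all(&report)?;
    }
    // Push out whatever is still sitting in the buffer.
    writer.flush()?;
    Ok(writer.get_ref().write_sizes.clone())
}
```

Calling `buffered_write_sizes(100, 100)` should show a handful of large writes rather than 100 small ones. Each large write exceeds a typical Ethernet MSS of roughly 1460 bytes, which is the property the comment above asks for.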

codecov · 2024-12-17T08:24:53Z

Codecov Report

Attention: Patch coverage is 96.15385% with 4 lines in your changes missing coverage. Please review.

Project coverage is 93.10%. Comparing base (c45ef9b) to head (8b953ab).
Report is 1 commits behind head on main.

Files with missing lines	Patch %	Lines
ipa-core/src/cli/playbook/streaming.rs	97.02%	3 Missing ⚠️
ipa-core/src/bin/report_collector.rs	0.00%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1504      +/-   ##
==========================================
+ Coverage   92.87%   93.10%   +0.22%     
==========================================
  Files         242      242              
  Lines       44177    44269      +92     
==========================================
+ Hits        41031    41217     +186     
+ Misses       3146     3052      -94

akoshelev · 2024-12-18T18:03:18Z

I don't see any difference in performance on 1B records. I think the bottleneck is on the upload, so I'll hold off on merging this.

akoshelev · 2025-01-09T00:07:24Z

I intend to merge it. There is nothing bad about this change, and buffered writes from RC can still be useful for submitting medium-sized inputs.

akoshelev requested a review from andyleiserson December 17, 2024 07:54

andyleiserson approved these changes Dec 17, 2024

Use 8Kb buffers for streaming inside report collector

a8b344b

Merge from main

8b953ab

akoshelev merged commit 837cdb4 into private-attribution:main Jan 9, 2025
12 checks passed

akoshelev deleted the buffered-rc branch January 9, 2025 00:19
