-
Notifications
You must be signed in to change notification settings - Fork 123
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use a stack allocation for header protection #1978
Conversation
The use of `Vec` here is unnecessary. We can use a fixed sized array instead. The performance gain here is negligible, but the code becomes cleaner, so that's a win.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@KershawChang we should make this part of 0.8.0.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1978 +/- ##
==========================================
- Coverage 94.97% 94.97% -0.01%
==========================================
Files 112 112
Lines 36509 36504 -5
==========================================
- Hits 34673 34668 -5
Misses 1836 1836 ☔ View full report in Codecov by Sentry. |
|
OK, 0.8.1 then :-) And let's not vendor in 0.8.0. |
Is this urgent? Can we wait until issue #1975 is fixed? |
Benchmark resultsPerformance differences relative to 9f0a86d. coalesce_acked_from_zero 1+1 entries: No change in performance detected.time: [194.21 ns 194.67 ns 195.17 ns] change: [-0.5951% +0.1099% +0.7132%] (p = 0.76 > 0.05) coalesce_acked_from_zero 3+1 entries: No change in performance detected.time: [234.09 ns 234.65 ns 235.25 ns] change: [-0.3057% +0.1774% +0.7291%] (p = 0.51 > 0.05) coalesce_acked_from_zero 10+1 entries: No change in performance detected.time: [234.25 ns 235.09 ns 236.07 ns] change: [-0.2659% +0.0849% +0.4352%] (p = 0.64 > 0.05) coalesce_acked_from_zero 1000+1 entries: No change in performance detected.time: [216.05 ns 216.30 ns 216.57 ns] change: [-0.1414% +0.5029% +1.2039%] (p = 0.16 > 0.05) RxStreamOrderer::inbound_frame(): No change in performance detected.time: [118.92 ms 119.02 ms 119.12 ms] change: [-0.2446% -0.0465% +0.1063%] (p = 0.65 > 0.05) transfer/Run multiple transfers with varying seeds: No change in performance detected.time: [54.460 ms 57.543 ms 60.660 ms] thrpt: [65.941 MiB/s 69.513 MiB/s 73.449 MiB/s] change: time: [-2.9551% +5.2097% +14.293%] (p = 0.22 > 0.05) thrpt: [-12.506% -4.9517% +3.0451%] transfer/Run multiple transfers with the same seed: No change in performance detected.time: [66.910 ms 73.185 ms 79.524 ms] thrpt: [50.299 MiB/s 54.656 MiB/s 59.782 MiB/s] change: time: [-14.630% -3.4066% +9.4710%] (p = 0.58 > 0.05) thrpt: [-8.6516% +3.5267% +17.137%] 1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.time: [151.10 ms 157.47 ms 166.79 ms] thrpt: [599.54 MiB/s 635.03 MiB/s 661.80 MiB/s] change: time: [-13.743% -7.1554% -0.3541%] (p = 0.08 > 0.05) thrpt: [+0.3553% +7.7068% +15.933%] 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.time: [433.02 ms 436.44 ms 439.97 ms] thrpt: [22.729 Kelem/s 22.912 Kelem/s 23.094 Kelem/s] change: time: [-1.5817% -0.3788% +0.7786%] (p = 0.52 > 0.05) thrpt: [-0.7726% +0.3803% +1.6071%] 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.time: [43.252 ms 43.813 ms 44.376 ms] thrpt: [22.535 elem/s 22.824 elem/s 23.120 elem/s] change: time: [-2.4728% -0.6989% +1.0864%] (p = 0.45 > 0.05) thrpt: [-1.0747% +0.7038% +2.5355%] Client/server transfer resultsTransfer of 33554432 bytes over loopback.
|
The use of
Vec
here is unnecessary. We can use a fixed sized array instead. It means that we'll not be generating 64 bytes of data for ChaCha20 header protection, but that's also a net win.The performance gain here is negligible, but the code becomes cleaner, so I consider that a win.