refactor(bin): use 32k stream IO buffer #2008
Conversation
Firefox by default uses a 32k IO buffer for streams. https://searchfox.org/mozilla-central/rev/f6e3b81aac49e602f06c204f9278da30993cdc8a/modules/libpref/init/all.js#3212 This commit makes `neqo-bin` use the same buffer size across http09/3 and client/server. Along the way it consolidates various buffer logic and reuses buffers wherever feasible.
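A minimal sketch of the idea: one shared buffer-size constant, and a single reusable 32 KiB buffer driving the read loop instead of per-call allocations. The constant and function names below are illustrative, not `neqo-bin`'s actual identifiers.

```rust
use std::io::{self, Read};

// 32 KiB, matching Firefox's default stream IO buffer size.
// Hypothetical name; neqo-bin's actual constant may differ.
const STREAM_IO_BUFFER_SIZE: usize = 32 * 1024;

/// Drain `reader`, reusing one stack-allocated 32 KiB buffer across
/// all reads rather than allocating a fresh buffer per call.
fn drain<R: Read>(mut reader: R) -> io::Result<u64> {
    let mut buf = [0u8; STREAM_IO_BUFFER_SIZE];
    let mut total = 0;
    loop {
        match reader.read(&mut buf)? {
            0 => return Ok(total), // EOF
            n => total += n as u64,
        }
    }
}

fn main() -> io::Result<()> {
    // 100 KiB source is consumed in 32 KiB chunks (3 full + 1 partial).
    let data = vec![0u8; 100 * 1024];
    let n = drain(&data[..])?;
    assert_eq!(n, 100 * 1024);
    println!("read {n} bytes with a {STREAM_IO_BUFFER_SIZE}-byte buffer");
    Ok(())
}
```

Sharing one constant across the http09/http3 client and server paths is what makes the consolidation possible: every read loop sizes its buffer from the same place.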
Failed / Succeeded / Unsupported Interop Tests (QUIC Interop Runner, client vs. server; neqo-latest as client and as server): see all results.
Codecov Report: All modified and coverable lines are covered by tests ✅

```diff
@@            Coverage Diff             @@
##             main    #2008      +/-   ##
==========================================
- Coverage   95.36%   95.36%   -0.01%
==========================================
  Files         112      112
  Lines       36475    36475
==========================================
- Hits        34784    34783       -1
- Misses       1691     1692       +1
```

View full report in Codecov by Sentry.
Benchmark results

Performance differences relative to f801c29.

- coalesce_acked_from_zero 1+1 entries: 💔 Performance has regressed.
  time: [196.61 ns 197.05 ns 197.53 ns]
  change: [+1.4739% +1.8071% +2.1494%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 3+1 entries: Change within noise threshold.
  time: [237.21 ns 237.92 ns 238.71 ns]
  change: [+0.0346% +0.3689% +0.7299%] (p = 0.04 < 0.05)
- coalesce_acked_from_zero 10+1 entries: No change in performance detected.
  time: [236.98 ns 238.05 ns 239.36 ns]
  change: [-0.1969% +0.3499% +0.9639%] (p = 0.23 > 0.05)
- coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
  time: [215.66 ns 215.88 ns 216.12 ns]
  change: [-0.0640% +0.7259% +1.4315%] (p = 0.06 > 0.05)
- RxStreamOrderer::inbound_frame(): Change within noise threshold.
  time: [119.60 ms 119.68 ms 119.76 ms]
  change: [+0.8078% +0.9063% +1.0011%] (p = 0.00 < 0.05)
- transfer/pacing-false/varying-seeds: No change in performance detected.
  time: [40.423 ms 42.503 ms 44.609 ms]
  change: [-5.6161% +0.7236% +7.8640%] (p = 0.83 > 0.05)
- transfer/pacing-true/varying-seeds: Change within noise threshold.
  time: [57.763 ms 60.818 ms 63.932 ms]
  change: [+2.2889% +10.117% +18.409%] (p = 0.01 < 0.05)
- transfer/pacing-false/same-seed: No change in performance detected.
  time: [47.965 ms 49.768 ms 51.492 ms]
  change: [-6.1545% -1.5034% +3.4935%] (p = 0.55 > 0.05)
- transfer/pacing-true/same-seed: No change in performance detected.
  time: [60.923 ms 67.805 ms 74.674 ms]
  change: [-18.847% -6.3929% +7.1145%] (p = 0.35 > 0.05)
- 1-conn/1-100mb-resp (aka. Download)/client: 💚 Performance has improved.
  time: [126.55 ms 127.76 ms 129.52 ms]
  thrpt: [772.10 MiB/s 782.71 MiB/s 790.18 MiB/s]
  change: time: [-27.403% -26.268% -24.915%] (p = 0.00 < 0.05)
  thrpt: [+33.182% +35.626% +37.747%]
- 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
  time: [408.49 ms 411.58 ms 414.65 ms]
  thrpt: [24.117 Kelem/s 24.296 Kelem/s 24.481 Kelem/s]
  change: time: [-0.9293% +0.2571% +1.4515%] (p = 0.66 > 0.05)
  thrpt: [-1.4308% -0.2564% +0.9380%]
- 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
  time: [46.648 ms 47.456 ms 48.297 ms]
  thrpt: [20.705 elem/s 21.072 elem/s 21.437 elem/s]
  change: time: [-1.6077% +0.7764% +3.0788%] (p = 0.52 > 0.05)
  thrpt: [-2.9869% -0.7704% +1.6340%]

Client/server transfer results: Transfer of 33554432 bytes over loopback.
Strange. "No change in performance detected", but a 97% increase in runtime. Will look into it now.
Still debugging the performance regression. I expected a performance improvement. There is an interesting difference between the two branches.
Without pacing, this pull request outperforms
This is ready for a review. #2035 resolved the performance regressions. In addition, as expected, this now improves the Download benchmark:
Note: The benefit of this pull request is not the performance improvement above, but the fact that the benchmark now executes closer to how Firefox uses it.
When POSTing a large request to a server, don't allocate the entire request upfront; instead, as is done in `neqo-bin/src/server/mod.rs`, iterate over a static buffer. This reuses the same logic from `neqo-bin/src/server/mod.rs`, i.e. `SendData`. See the previous similar change on the server side, mozilla#2008.
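The pattern can be sketched as follows: rather than building the whole upload payload in memory, keep only a byte counter and repeatedly hand the stream slices of one static chunk. This is a hypothetical sketch of the approach; `neqo-bin`'s `SendData` works along these lines, but the type and method names here are illustrative.

```rust
// One static 32 KiB chunk, reused for every write; no per-upload allocation.
const CHUNK: &[u8] = &[0u8; 32 * 1024];

/// Tracks how many bytes of a large upload remain to be written.
struct Upload {
    remaining: usize,
}

impl Upload {
    fn new(total: usize) -> Self {
        Self { remaining: total }
    }

    /// Next slice to write: at most one chunk, borrowed from static data.
    fn next_chunk(&self) -> &'static [u8] {
        &CHUNK[..self.remaining.min(CHUNK.len())]
    }

    /// Record that the stream accepted `sent` bytes.
    fn record_sent(&mut self, sent: usize) {
        self.remaining -= sent;
    }

    fn done(&self) -> bool {
        self.remaining == 0
    }
}

fn main() {
    // Simulate a 100 KiB POST body without ever allocating the full payload.
    let mut up = Upload::new(100 * 1024);
    let mut total = 0;
    while !up.done() {
        let chunk = up.next_chunk();
        // A real QUIC stream may accept fewer bytes per call when its send
        // buffer is full; for the sketch, assume it takes the whole slice.
        up.record_sent(chunk.len());
        total += chunk.len();
    }
    assert_eq!(total, 100 * 1024);
}
```

The payload content doesn't matter for a benchmark, so serving slices of one static chunk keeps memory use constant regardless of upload size.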
* bench(bin/client): don't allocate upload payload upfront
  When POSTing a large request to a server, don't allocate the entire request upfront, but instead, as is done in `neqo-bin/src/server/mod.rs`, iterate over a static buffer. Reuses the same logic from `neqo-bin/src/server/mod.rs`, i.e. `SendData`. See previous similar change on server side #2008.
* Inline done()