fix(transport): don't pace below timer granularity #2035
Conversation
Neqo assumes a timer granularity of 1ms:

https://github.com/mozilla/neqo/blob/0eb9174d7a67b250607f0c3ea056fe056bcf91f5/neqo-transport/src/rtt.rs#L25-L27

but `neqo_transport::Pacer::next()` might return values `< GRANULARITY`:

https://github.com/mozilla/neqo/blob/0eb9174d7a67b250607f0c3ea056fe056bcf91f5/neqo-transport/src/pace.rs#L71-L90

Under the assumption that a timer implementation rounds small values up to its granularity (e.g. 1ms), packets can be delayed significantly longer than `Pacer` intends. With this commit, `Pacer` does not delay packets that would previously have been delayed by less than `GRANULARITY`. The downside is a loss in pacing granularity.

See also:

- google/quiche
  - https://github.com/google/quiche/blob/60aec87316d24392b2ea37c391ecf406ef183074/quiche/quic/core/congestion_control/pacing_sender.cc#L167
  - https://github.com/google/quiche/blob/60aec87316d24392b2ea37c391ecf406ef183074/quiche/quic/core/quic_constants.h#L304
- quic-go
  - https://github.com/quic-go/quic-go/blob/d1f9af4cc6b13c96dc302ac9ec5f061ed294d36b/internal/protocol/params.go#L137
- `kGranularity` in RFC 9002: https://datatracker.ietf.org/doc/html/rfc9002#name-constants-of-interest
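The idea behind the change can be sketched as follows. This is a minimal, hypothetical pacer model (the `Pacer` struct and fields here are illustrative, not neqo's actual implementation): when the remaining pacing delay is below the assumed 1ms timer granularity, the packet is released immediately rather than handing the timer a sub-granularity wake-up that it would round up.

```rust
use std::time::{Duration, Instant};

// Assumed 1 ms timer granularity, mirroring neqo's GRANULARITY constant.
const GRANULARITY: Duration = Duration::from_millis(1);

/// Minimal token-bucket-style pacer model (hypothetical, for illustration).
struct Pacer {
    /// Time the last packet was released.
    last: Instant,
    /// Inter-packet spacing, derived from cwnd/RTT in a real pacer
    /// (fixed per instance here for simplicity).
    spacing: Duration,
}

impl Pacer {
    /// Return the time at which the next packet may be sent.
    ///
    /// If the remaining delay is below the timer granularity, release the
    /// packet now: a timer that rounds small sleeps up to its granularity
    /// would otherwise delay the packet far longer than intended.
    fn next(&self, now: Instant) -> Instant {
        let target = self.last + self.spacing;
        if target.saturating_duration_since(now) < GRANULARITY {
            now
        } else {
            target
        }
    }
}

fn main() {
    let now = Instant::now();

    // 300 µs of remaining delay is below the 1 ms granularity:
    // the packet is released immediately.
    let fast = Pacer { last: now, spacing: Duration::from_micros(300) };
    assert_eq!(fast.next(now), now);

    // 5 ms is at or above the granularity: keep the pacing delay.
    let slow = Pacer { last: now, spacing: Duration::from_millis(5) };
    assert_eq!(slow.next(now), now + Duration::from_millis(5));
}
```

The trade-off stated in the description is visible here: any delay shorter than `GRANULARITY` collapses to zero, so pacing becomes coarser in exchange for never over-delaying.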
Failed Interop Tests (QUIC Interop Runner, client vs. server):
- neqo-latest as client
- neqo-latest as server

All results

Succeeded Interop Tests (QUIC Interop Runner, client vs. server):
- neqo-latest as client
- neqo-latest as server

Unsupported Interop Tests (QUIC Interop Runner, client vs. server):
- neqo-latest as client
- neqo-latest as server
Would you mind writing a test for this?
I also see a lot of CI failures here. I don't immediately see why this is the case; likely the test depended on some amount of pacing happening (with a wait that might be suppressed by this change).
Pacing on the new path is now below granularity, and thus the packet on the new path is sent immediately. Calling `skip_pacing` will instead fast-forward until the PTO of the old path expires, leading to an unexpected probe packet on the old path.

```
thread 'connection::tests::migration::path_forwarding_attack' panicked at test-fixture/src/assertions.rs:153:5:
assertion `left == right` failed
  left: [fe80::1]:443
 right: 192.0.2.1:443
```

This commit simply removes the no-longer-needed `skip_pacing` step, thus reverting to the previous behavior.
Of course. Thanks for continuously pushing for more tests @martinthomson. Sorry for not providing one early on. f183df8 adds a basic test. Let me know if you would like me to cover more scenarios. Also, this might be a good opportunity for property-based testing.
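The property such a test would check can be sketched like this. This is not the test added in f183df8; it is a hedged illustration using the same hypothetical simplified pacer model as above, and it sweeps inputs deterministically where a property-based testing crate would generate them:

```rust
use std::time::{Duration, Instant};

// Assumed 1 ms timer granularity, mirroring neqo's GRANULARITY constant.
const GRANULARITY: Duration = Duration::from_millis(1);

/// Hypothetical simplified pacer logic (not neqo's actual implementation):
/// release now if the remaining delay is below the timer granularity.
fn next_send_time(last: Instant, spacing: Duration, now: Instant) -> Instant {
    let target = last + spacing;
    if target.saturating_duration_since(now) < GRANULARITY {
        now
    } else {
        target
    }
}

fn main() {
    let now = Instant::now();
    // Sweep spacings from 0 to 10 ms in 50 µs steps; a property-based test
    // would generate these inputs randomly instead.
    for us in (0u64..=10_000).step_by(50) {
        let spacing = Duration::from_micros(us);
        let t = next_send_time(now, spacing, now);
        // Property: the pacer never schedules a wake-up shorter than the
        // timer granularity. It either sends now or waits >= GRANULARITY.
        assert!(t == now || t.saturating_duration_since(now) >= GRANULARITY);
    }
}
```

The invariant is exactly the one the fix establishes: no returned deadline ever falls in the sub-granularity window that a coarse timer would round up.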
Codecov Report

Attention: Patch coverage is

Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main    #2035      +/-   ##
==========================================
+ Coverage   94.98%   95.03%   +0.04%
==========================================
  Files         112      112
  Lines       36415    36436      +21
==========================================
+ Hits        34590    34627      +37
+ Misses       1825     1809      -16
```

☔ View full report in Codecov by Sentry.
I am surprised that Codecov reports a line in a unit test as missing test coverage.
Benchmark results

Performance differences relative to 3d0efa2.

- coalesce_acked_from_zero 1+1 entries: 💔 Performance has regressed.
  time: [193.20 ns 193.64 ns 194.14 ns]
  change: [+1.0162% +1.4427% +1.8401%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 3+1 entries: 💔 Performance has regressed.
  time: [236.26 ns 236.78 ns 237.36 ns]
  change: [+1.4241% +1.7616% +2.0811%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 10+1 entries: 💔 Performance has regressed.
  time: [237.69 ns 238.52 ns 239.51 ns]
  change: [+2.1932% +2.6364% +3.2358%] (p = 0.00 < 0.05)
- coalesce_acked_from_zero 1000+1 entries: 💔 Performance has regressed.
  time: [216.65 ns 216.85 ns 217.07 ns]
  change: [+1.0459% +1.7432% +2.4558%] (p = 0.00 < 0.05)
- RxStreamOrderer::inbound_frame(): Change within noise threshold.
  time: [120.70 ms 120.76 ms 120.82 ms]
  change: [-0.3077% -0.2440% -0.1796%] (p = 0.00 < 0.05)
- transfer/pacing-false/varying-seeds: No change in performance detected.
  time: [39.450 ms 41.321 ms 43.149 ms]
  change: [-4.3356% +2.0268% +8.7589%] (p = 0.55 > 0.05)
- transfer/pacing-true/varying-seeds: No change in performance detected.
  time: [50.991 ms 54.095 ms 57.334 ms]
  change: [-12.892% -5.6697% +1.8739%] (p = 0.16 > 0.05)
- transfer/pacing-false/same-seed: No change in performance detected.
  time: [49.538 ms 51.010 ms 52.439 ms]
  change: [-1.7119% +2.4038% +6.8421%] (p = 0.27 > 0.05)
- transfer/pacing-true/same-seed: No change in performance detected.
  time: [63.225 ms 70.051 ms 76.853 ms]
  change: [-14.783% -3.4556% +8.7636%] (p = 0.58 > 0.05)
- 1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.
  time: [171.32 ms 173.05 ms 174.85 ms]
  thrpt: [571.93 MiB/s 577.88 MiB/s 583.69 MiB/s]
  change: time: [-2.3669% +1.0422% +4.1275%] (p = 0.54 > 0.05)
  thrpt: [-3.9639% -1.0315% +2.4243%]
- 1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
  time: [409.80 ms 413.50 ms 417.10 ms]
  thrpt: [23.975 Kelem/s 24.184 Kelem/s 24.402 Kelem/s]
  change: time: [-1.1717% +0.0447% +1.2326%] (p = 0.94 > 0.05)
  thrpt: [-1.2176% -0.0447% +1.1856%]
- 1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
  time: [45.998 ms 46.715 ms 47.432 ms]
  thrpt: [21.083 elem/s 21.406 elem/s 21.740 elem/s]
  change: time: [-0.6630% +1.5256% +3.7464%] (p = 0.18 > 0.05)
  thrpt: [-3.6111% -1.5027% +0.6675%]

Client/server transfer results: transfer of 33554432 bytes over loopback.
Initial idea from @larseggert.

Would fix performance regression in #2008.