
fix(transport): don't pace below timer granularity #2035

Merged Aug 6, 2024 (6 commits)

Conversation

mxinden (Collaborator) commented Aug 4, 2024

Neqo assumes a timer granularity of 1 ms:

```rust
/// The smallest time that the system timer (via `sleep()`, `nanosleep()`,
/// `select()`, or similar) can reliably deliver; see `neqo_common::hrtime`.
pub const GRANULARITY: Duration = Duration::from_millis(1);
```

but `neqo_transport::Pacer::next()` might return values smaller than `GRANULARITY`:

```rust
/// Determine when the next packet will be available based on the provided RTT
/// and congestion window. This doesn't update state.
/// This returns a time, which could be in the past (this object doesn't know what
/// the current time is).
pub fn next(&self, rtt: Duration, cwnd: usize) -> Instant {
    if self.c >= self.p {
        qtrace!([self], "next {}/{:?} no wait = {:?}", cwnd, rtt, self.t);
        self.t
    } else {
        // This is the inverse of the function in `spend`:
        // self.t + rtt * (self.p - self.c) / (PACER_SPEEDUP * cwnd)
        let r = rtt.as_nanos();
        let d = r.saturating_mul(u128::try_from(self.p - self.c).unwrap());
        let add = d / u128::try_from(cwnd * PACER_SPEEDUP).unwrap();
        let w = u64::try_from(add).map(Duration::from_nanos).unwrap_or(rtt);
        let nxt = self.t + w;
        qtrace!([self], "next {}/{:?} wait {:?} = {:?}", cwnd, rtt, w, nxt);
        nxt
    }
}
```
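For illustration, the `else` branch computes a wait of `rtt * (p - c) / (PACER_SPEEDUP * cwnd)`. Assuming `PACER_SPEEDUP` is 2, an RTT of 50 ms, a congestion window of 65536 bytes, and a 1500-byte deficit give a wait of 50 ms × 1500 / (2 × 65536) ≈ 0.57 ms, well below `GRANULARITY`.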

Under the assumption that a timer implementation rounds small values up to its granularity (e.g. 1 ms), packets can be delayed significantly more than intended by `Pacer`.

With this commit, `Pacer` no longer delays packets that would previously have been delayed by less than `GRANULARITY`. The downside is a loss in pacing granularity.
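A minimal sketch of the idea (not the exact patch; the helper name and signature here are illustrative): once the wait has been computed, anything below `GRANULARITY` is clamped to zero, i.e. the packet is released immediately.

```rust
use std::time::{Duration, Instant};

/// Mirrors neqo's 1 ms timer granularity.
const GRANULARITY: Duration = Duration::from_millis(1);

/// Illustrative helper: `t` is the pacer's base time and `w` the
/// inter-packet wait computed in `next()`.
fn next_send_time(t: Instant, w: Duration) -> Instant {
    if w < GRANULARITY {
        // A system timer would round this wait up to its granularity
        // anyway, delaying the packet more than intended; send now.
        t
    } else {
        t + w
    }
}
```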

Initial idea from @larseggert.

Would fix performance regression in #2008:

```
➜  neqo git:(pace-granularity) ✗ critcmp main main-no-pacing 32k 32k-no-pacing 32k-min-1ms-pacing -f "Download" --list
1-conn/1-100mb-resp (aka. Download)/client
------------------------------------------
32k-no-pacing          1.00    114.8±49.78ms   871.3 MB/sec
32k-min-1ms-pacing     1.04    119.9±63.28ms   834.2 MB/sec
main-no-pacing         1.05    121.0±83.97ms   826.4 MB/sec
main                   1.45    165.9±56.69ms   602.8 MB/sec
32k                   11.39  1307.8±1038.97ms   76.5 MB/sec
```

github-actions bot commented Aug 4, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client · neqo-latest as server · All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client · neqo-latest as server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server: neqo-latest as client · neqo-latest as server


github-actions bot commented Aug 4, 2024

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

martinthomson (Member) left a comment

Would you mind writing a test for this?

I also see a lot of CI failures here. I don't immediately see why this is the case; likely the tests were dependent on some amount of pacing happening (with a wait that might be suppressed by this change).

neqo-transport/src/pace.rs: two review threads (outdated, resolved)
Pacing on the new path is now below granularity, and thus the packet on the new path is sent immediately. Calling `skip_pacing` would instead fast-forward until the PTO of the old path expires, leading to an unexpected probe packet on the old path.

```
thread 'connection::tests::migration::path_forwarding_attack' panicked at test-fixture/src/assertions.rs:153:5:
assertion `left == right` failed
  left: [fe80::1]:443
 right: 192.0.2.1:443
```

This commit simply removes the no-longer-needed `skip_pacing` step, reverting to the previous behavior.
mxinden (Collaborator, Author) commented Aug 5, 2024

> Would you mind writing a test for this?

Of course. Thanks for continuously pushing for more tests @martinthomson. Sorry for not providing one early on.

f183df8 adds a basic test. Let me know if you would like me to cover more scenarios.

Also, this might be a good opportunity for property-based testing, e.g. via quickcheck; see the sketch below. Let me know if that sounds worth the additional dependency. Happy to add it, preferably in a separate pull request.
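For concreteness, such a property test might look roughly like this. This is a sketch only: `clamp_wait` is an illustrative stand-in for the new clamping logic, not neqo's actual `Pacer` API.

```rust
use quickcheck::quickcheck;
use std::time::Duration;

/// Mirrors neqo's 1 ms timer granularity.
const GRANULARITY: Duration = Duration::from_millis(1);

/// Illustrative stand-in: waits below the timer granularity drop to zero.
fn clamp_wait(w: Duration) -> Duration {
    if w < GRANULARITY {
        Duration::ZERO
    } else {
        w
    }
}

quickcheck! {
    // Property: the pacer never asks for a non-zero wait that a 1 ms
    // timer could not reliably deliver.
    fn wait_is_zero_or_at_least_granularity(nanos: u64) -> bool {
        let w = clamp_wait(Duration::from_nanos(nanos));
        w == Duration::ZERO || w >= GRANULARITY
    }
}
```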


codecov bot commented Aug 5, 2024

Codecov Report

Attention: Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.

Project coverage is 95.03%. Comparing base (f63b22c) to head (bb0fefc).

| Files | Patch % | Lines |
| --- | --- | --- |
| neqo-transport/src/pace.rs | 96.55% | 1 Missing ⚠️ |
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #2035      +/-   ##
==========================================
+ Coverage   94.98%   95.03%   +0.04%
==========================================
  Files         112      112
  Lines       36415    36436      +21
==========================================
+ Hits        34590    34627      +37
+ Misses       1825     1809      -16
```

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

mxinden (Collaborator, Author) commented Aug 5, 2024

> Codecov Report
>
> Attention: Patch coverage is 96.55172% with 1 line in your changes missing coverage. Please review.
>
> Project coverage is 95.05%. Comparing base (0eb9174) to head (c66154b).
> Report is 2 commits behind head on main.
>
> | Files | Patch % | Lines |
> | --- | --- | --- |
> | neqo-transport/src/pace.rs | 96.55% | 1 Missing ⚠️ |

I am surprised that Codecov reports a line in a unit test as missing test coverage:

[screenshot: Codecov marking a line inside the unit test as uncovered]


github-actions bot commented Aug 5, 2024

Benchmark results

Performance differences relative to 3d0efa2.

coalesce_acked_from_zero 1+1 entries: 💔 Performance has regressed.
       time:   [193.20 ns 193.64 ns 194.14 ns]
       change: [+1.0162% +1.4427% +1.8401%] (p = 0.00 < 0.05)

Found 5 outliers among 100 measurements (5.00%)
3 (3.00%) high mild
2 (2.00%) high severe

coalesce_acked_from_zero 3+1 entries: 💔 Performance has regressed.
       time:   [236.26 ns 236.78 ns 237.36 ns]
       change: [+1.4241% +1.7616% +2.0811%] (p = 0.00 < 0.05)

Found 13 outliers among 100 measurements (13.00%)
3 (3.00%) low mild
3 (3.00%) high mild
7 (7.00%) high severe

coalesce_acked_from_zero 10+1 entries: 💔 Performance has regressed.
       time:   [237.69 ns 238.52 ns 239.51 ns]
       change: [+2.1932% +2.6364% +3.2358%] (p = 0.00 < 0.05)

Found 9 outliers among 100 measurements (9.00%)
1 (1.00%) low mild
8 (8.00%) high severe

coalesce_acked_from_zero 1000+1 entries: 💔 Performance has regressed.
       time:   [216.65 ns 216.85 ns 217.07 ns]
       change: [+1.0459% +1.7432% +2.4558%] (p = 0.00 < 0.05)

Found 9 outliers among 100 measurements (9.00%)
3 (3.00%) high mild
6 (6.00%) high severe

RxStreamOrderer::inbound_frame(): Change within noise threshold.
       time:   [120.70 ms 120.76 ms 120.82 ms]
       change: [-0.3077% -0.2440% -0.1796%] (p = 0.00 < 0.05)

Found 3 outliers among 100 measurements (3.00%)
2 (2.00%) high mild
1 (1.00%) high severe

transfer/pacing-false/varying-seeds: No change in performance detected.
       time:   [39.450 ms 41.321 ms 43.149 ms]
       change: [-4.3356% +2.0268% +8.7589%] (p = 0.55 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

transfer/pacing-true/varying-seeds: No change in performance detected.
       time:   [50.991 ms 54.095 ms 57.334 ms]
       change: [-12.892% -5.6697% +1.8739%] (p = 0.16 > 0.05)

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) high mild

transfer/pacing-false/same-seed: No change in performance detected.
       time:   [49.538 ms 51.010 ms 52.439 ms]
       change: [-1.7119% +2.4038% +6.8421%] (p = 0.27 > 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) low mild

transfer/pacing-true/same-seed: No change in performance detected.
       time:   [63.225 ms 70.051 ms 76.853 ms]
       change: [-14.783% -3.4556% +8.7636%] (p = 0.58 > 0.05)

1-conn/1-100mb-resp (aka. Download)/client: No change in performance detected.
       time:   [171.32 ms 173.05 ms 174.85 ms]
       thrpt:  [571.93 MiB/s 577.88 MiB/s 583.69 MiB/s]
change:
       time:   [-2.3669% +1.0422% +4.1275%] (p = 0.54 > 0.05)
       thrpt:  [-3.9639% -1.0315% +2.4243%]

Found 4 outliers among 100 measurements (4.00%)
4 (4.00%) high mild

1-conn/10_000-parallel-1b-resp (aka. RPS)/client: No change in performance detected.
       time:   [409.80 ms 413.50 ms 417.10 ms]
       thrpt:  [23.975 Kelem/s 24.184 Kelem/s 24.402 Kelem/s]
change:
       time:   [-1.1717% +0.0447% +1.2326%] (p = 0.94 > 0.05)
       thrpt:  [-1.2176% -0.0447% +1.1856%]

Found 1 outliers among 100 measurements (1.00%)
1 (1.00%) low mild

1-conn/1-1b-resp (aka. HPS)/client: No change in performance detected.
       time:   [45.998 ms 46.715 ms 47.432 ms]
       thrpt:  [21.083  elem/s 21.406  elem/s 21.740  elem/s]
change:
       time:   [-0.6630% +1.5256% +3.7464%] (p = 0.18 > 0.05)
       thrpt:  [-3.6111% -1.5027% +0.6675%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

| Client | Server | CC | Pacing | Mean [ms] | Min [ms] | Max [ms] | Relative |
| --- | --- | --- | --- | --- | --- | --- | --- |
| msquic | msquic | | | 116.4 ± 21.0 | 97.2 | 185.7 | 1.00 |
| neqo | msquic | reno | on | 307.7 ± 66.5 | 256.2 | 477.0 | 1.00 |
| neqo | msquic | reno | | 265.3 ± 8.8 | 256.0 | 281.5 | 1.00 |
| neqo | msquic | cubic | on | 268.4 ± 10.6 | 256.4 | 288.0 | 1.00 |
| neqo | msquic | cubic | | 272.3 ± 10.7 | 253.6 | 285.5 | 1.00 |
| msquic | neqo | reno | on | 190.5 ± 77.5 | 112.2 | 387.6 | 1.00 |
| msquic | neqo | reno | | 142.4 ± 17.0 | 114.2 | 181.2 | 1.00 |
| msquic | neqo | cubic | on | 183.0 ± 46.0 | 110.3 | 263.8 | 1.00 |
| msquic | neqo | cubic | | 181.3 ± 25.1 | 108.4 | 209.7 | 1.00 |
| neqo | neqo | reno | on | 208.6 ± 58.5 | 162.5 | 349.3 | 1.00 |
| neqo | neqo | reno | | 228.8 ± 115.9 | 161.8 | 538.5 | 1.00 |
| neqo | neqo | cubic | on | 171.6 ± 14.9 | 155.0 | 203.4 | 1.00 |
| neqo | neqo | cubic | | 175.7 ± 20.9 | 147.5 | 224.6 | 1.00 |

⬇️ Download logs

@larseggert larseggert added this pull request to the merge queue Aug 6, 2024
Merged via the queue into mozilla:main with commit 9fa21ee Aug 6, 2024
56 of 57 checks passed