Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: DPLPMTUD #1903

Merged
merged 105 commits into from
Jul 10, 2024
Merged

feat: DPLPMTUD #1903

merged 105 commits into from
Jul 10, 2024

Conversation

larseggert
Copy link
Collaborator

@larseggert larseggert commented May 14, 2024

This implements a simplified variant of PLDPMTUD (aka RFC8899), which by default probes for increasingly larger PMTUs using an MTU table.

There is currently no attempt to repeat the PMTUD at intervals. There is also no attempt to detect PMTUs that are in between values in the table. There is no attempt to handle the case where the PMTU shrinks.

A lot of the existing tests (~50%) break when PMTUD is enabled, so this PR disables it by default. New tests that cover PMTUD were added to this PR.

Fixes #243

@larseggert larseggert changed the title feat: Groudwork for DPLPMTUD feat: Groundwork for DPLPMTUD May 14, 2024
Copy link

github-actions bot commented May 14, 2024

Failed Interop Tests

QUIC Interop Runner, client vs. server

All results

Succeeded Interop Tests

QUIC Interop Runner, client vs. server

Unsupported Interop Tests

QUIC Interop Runner, client vs. server

@mxinden
Copy link
Collaborator

mxinden commented May 14, 2024

(There are also a bunch of warning about unused code that is actually used. I don't understand why that is, since those functions mirror existing ones such as cwnd_avail.)

As far as I can tell the trait function CongestionControl::cwnd_min and its implementation <ClassicCongestionControl<T> as CongestionControl>::cwnd_min are only called in PacketSender::cwnd_min. PacketSender::cwnd_min is only called in testing code. Thus, cargo complains about the 3 not being used.

Does that make sense @larseggert?

Copy link

github-actions bot commented May 14, 2024

Firefox builds for this PR

The following builds are available for testing. Crossed-out builds did not succeed.

neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/path.rs Outdated Show resolved Hide resolved
Copy link

github-actions bot commented May 14, 2024

Benchmark results

Performance differences relative to 7d610ed.

coalesce_acked_from_zero 1+1 entries: Change within noise threshold.
       time:   [193.61 ns 194.06 ns 194.53 ns]
       change: [+0.0719% +0.4048% +0.7829%] (p = 0.02 < 0.05)

Found 13 outliers among 100 measurements (13.00%)
1 (1.00%) low mild
8 (8.00%) high mild
4 (4.00%) high severe

coalesce_acked_from_zero 3+1 entries: No change in performance detected.
       time:   [234.67 ns 235.73 ns 237.08 ns]
       change: [+0.6006% +2.3382% +6.2565%] (p = 0.06 > 0.05)

Found 17 outliers among 100 measurements (17.00%)
11 (11.00%) high mild
6 (6.00%) high severe

coalesce_acked_from_zero 10+1 entries: No change in performance detected.
       time:   [233.27 ns 234.03 ns 234.96 ns]
       change: [-0.1847% +0.3385% +0.8310%] (p = 0.21 > 0.05)

Found 8 outliers among 100 measurements (8.00%)
1 (1.00%) high mild
7 (7.00%) high severe

coalesce_acked_from_zero 1000+1 entries: No change in performance detected.
       time:   [215.28 ns 217.58 ns 222.98 ns]
       change: [-0.5398% +6.4762% +20.010%] (p = 0.44 > 0.05)

Found 10 outliers among 100 measurements (10.00%)
3 (3.00%) high mild
7 (7.00%) high severe

RxStreamOrderer::inbound_frame(): 💔 Performance has regressed.
       time:   [120.23 ms 120.29 ms 120.37 ms]
       change: [+1.1136% +1.1966% +1.2758%] (p = 0.00 < 0.05)

Found 2 outliers among 100 measurements (2.00%)
2 (2.00%) high mild

transfer/Run multiple transfers with varying seeds: 💚 Performance has improved.
       time:   [55.737 ms 59.125 ms 62.506 ms]
       thrpt:  [63.994 MiB/s 67.653 MiB/s 71.766 MiB/s]
change:
       time:   [-53.858% -51.136% -48.358%] (p = 0.00 < 0.05)
       thrpt:  [+93.642% +104.65% +116.72%]
transfer/Run multiple transfers with the same seed: 💚 Performance has improved.
       time:   [65.695 ms 70.817 ms 75.893 ms]
       thrpt:  [52.706 MiB/s 56.484 MiB/s 60.887 MiB/s]
change:
       time:   [-45.571% -41.490% -37.459%] (p = 0.00 < 0.05)
       thrpt:  [+59.896% +70.912% +83.726%]
1-conn/1-100mb-resp (aka. Download)/client: 💚 Performance has improved.
       time:   [154.81 ms 160.68 ms 166.89 ms]
       thrpt:  [599.19 MiB/s 622.35 MiB/s 645.97 MiB/s]
change:
       time:   [-86.471% -85.914% -85.283%] (p = 0.00 < 0.05)
       thrpt:  [+579.47% +609.92% +639.15%]
1-conn/10_000-parallel-1b-resp (aka. RPS)/client: Change within noise threshold.
       time:   [430.16 ms 433.60 ms 437.05 ms]
       thrpt:  [22.880 Kelem/s 23.063 Kelem/s 23.247 Kelem/s]
change:
       time:   [-2.5302% -1.4809% -0.4847%] (p = 0.00 < 0.05)
       thrpt:  [+0.4870% +1.5031% +2.5959%]
1-conn/1-1b-resp (aka. HPS)/client: 💚 Performance has improved.
       time:   [43.569 ms 44.108 ms 44.652 ms]
       thrpt:  [22.395  elem/s 22.671  elem/s 22.952  elem/s]
change:
       time:   [-3.7644% -2.5274% -1.2515%] (p = 0.00 < 0.05)
       thrpt:  [+1.2674% +2.5930% +3.9116%]

Client/server transfer results

Transfer of 33554432 bytes over loopback.

Client Server CC Pacing Mean [ms] Min [ms] Max [ms] Relative
msquic msquic 111.9 ± 13.4 88.9 147.8 1.00
neqo msquic reno on 267.4 ± 7.3 250.7 279.4 1.00
neqo msquic reno 271.8 ± 6.0 264.6 283.5 1.00
neqo msquic cubic on 275.0 ± 15.4 245.0 299.2 1.00
neqo msquic cubic 267.1 ± 7.8 256.6 281.1 1.00
msquic neqo reno on 235.3 ± 166.2 89.0 648.3 1.00
msquic neqo reno 140.0 ± 20.3 111.6 172.6 1.00
msquic neqo cubic on 164.5 ± 59.6 113.1 331.7 1.00
msquic neqo cubic 206.5 ± 70.2 126.4 361.7 1.00
neqo neqo reno on 192.5 ± 24.7 151.4 237.1 1.00
neqo neqo reno 173.1 ± 19.6 144.6 215.4 1.00
neqo neqo cubic on 186.7 ± 47.4 146.1 358.3 1.00
neqo neqo cubic 171.5 ± 9.3 156.5 187.0 1.00

⬇️ Download logs

@larseggert larseggert marked this pull request as ready for review May 15, 2024 16:29
@larseggert larseggert changed the title feat: Groundwork for DPLPMTUD feat: DPLPMTUD May 21, 2024
Copy link
Member

@martinthomson martinthomson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not seeing PMTUD tests, which would be necessary for this.

The big question I have is the one that Christian makes about PTMUD generally: how do you know that the bytes you use on PMTUD pay you back?

There is probably a case for sending probes when you have spare sending capacity and nothing better to send. Indeed, successfully probing will let us push congestion windows up more and could even improve performance.

What I'm seeing here displaces other data. I'd like to see something that doesn't do that. There's a fundamental problem that needs analysis though. You can't predict that a connection will be used for uploads, so you don't know when probes will really help. I see a few cases:

  1. The connection is short-lived or low volume. Probes are strictly wasteful.
  2. The connection is long-lived and high volume, with ample idle time for probing. Probes can use gaps. This might be a video stream, where probing can fit into a warmup period. Probes are therefore easy and super-helpful.
  3. The connection exists only to support a smaller upload. The upload is small enough that probes are wasteful.
  4. The connection exists only to support a larger upload. The upload is large enough that spending bytes on probing early on is a good investment.

Case 1 and 2 are easy to deal with. We could probe on an idle connection and tolerate a small amount of waste for case 1 if it makes case 2 appreciably better.

The split between 3 and 4 is rough. There is an uncertain zone between the two as well where some probing is justified, but successive rounds of probing might be wasteful as the throughput gain over the remaining time diminishes relative to the wasted effort of extra probes.

Right now, you don't send real data in probes. You are effectively betting on the probes being lost. But you could send data, which would reduce the harm in case 3. It might even make the code slightly simpler.

neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/pace.rs Outdated Show resolved Hide resolved
neqo-transport/src/path.rs Outdated Show resolved Hide resolved
neqo-transport/src/path.rs Outdated Show resolved Hide resolved
neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/pmtud.rs Outdated Show resolved Hide resolved
neqo-transport/src/path.rs Outdated Show resolved Hide resolved
Co-authored-by: Martin Thomson <mt@lowentropy.net>
Signed-off-by: Lars Eggert <lars@eggert.org>
larseggert and others added 6 commits July 10, 2024 07:56
Co-authored-by: Max Inden <mail@max-inden.de>
Signed-off-by: Lars Eggert <lars@eggert.org>
Co-authored-by: Martin Thomson <mt@lowentropy.net>
Signed-off-by: Lars Eggert <lars@eggert.org>
mxinden added a commit to mxinden/neqo that referenced this pull request Jul 10, 2024
mxinden added a commit to mxinden/neqo that referenced this pull request Jul 10, 2024
mxinden added a commit to mxinden/neqo that referenced this pull request Jul 10, 2024
@mxinden mxinden mentioned this pull request Jul 10, 2024
larseggert and others added 6 commits July 10, 2024 13:03
`Probe` is a small simple enum on the stack, thus convention is to implement
`Copy` instead of only `Clone` with a call to `clone()`.

The following helped me in the past:

> When should my type be Copy?
>
> Generally speaking, if your type can implement Copy, it should. Keep in mind,
> though, that implementing Copy is part of the public API of your type. If the
> type might become non-Copy in the future, it could be prudent to omit the Copy
> implementation now, to avoid a breaking API change.

https://doc.rust-lang.org/std/marker/trait.Copy.html#when-should-my-type-be-copy
@larseggert larseggert enabled auto-merge July 10, 2024 14:05
@larseggert larseggert added this pull request to the merge queue Jul 10, 2024
Merged via the queue into mozilla:main with commit 4852dc6 Jul 10, 2024
54 of 56 checks passed
@larseggert larseggert deleted the feat-dplpmtud branch July 10, 2024 15:19
moz-v2v-gh pushed a commit to mozilla/gecko-dev that referenced this pull request Sep 16, 2024
…iewers,kershaw,janerik

This commit adds four Glean probes:
- http3_udp_datagram_segment_size_sent
- http3_udp_datagram_segment_size_received
- http3_udp_datagram_size_received
- http3_udp_datagram_num_segments_received

Given the performance impact tracking Glean metrics in the UDP hot path, see
https://phabricator.services.mozilla.com/D216034#7453056, this commit introduces
a sample buffer per metric.

This will enable us to measure the impact of:

- Implementation of Packetization Layer Path MTU Discovery for Datagram Transports (RFC 8899) [in Neqo](mozilla/neqo#1903)
- [Fast UDP for Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1901292)

Differential Revision: https://phabricator.services.mozilla.com/D216034
i3roly pushed a commit to i3roly/firefox-dynasty that referenced this pull request Sep 17, 2024
…iewers,kershaw,janerik

This commit adds four Glean probes:
- http3_udp_datagram_segment_size_sent
- http3_udp_datagram_segment_size_received
- http3_udp_datagram_size_received
- http3_udp_datagram_num_segments_received

Given the performance impact tracking Glean metrics in the UDP hot path, see
https://phabricator.services.mozilla.com/D216034#7453056, this commit introduces
a sample buffer per metric.

This will enable us to measure the impact of:

- Implementation of Packetization Layer Path MTU Discovery for Datagram Transports (RFC 8899) [in Neqo](mozilla/neqo#1903)
- [Fast UDP for Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1901292)

Differential Revision: https://phabricator.services.mozilla.com/D216034
jamienicol pushed a commit to jamienicol/gecko that referenced this pull request Sep 17, 2024
…iewers,kershaw,janerik

This commit adds four Glean probes:
- http3_udp_datagram_segment_size_sent
- http3_udp_datagram_segment_size_received
- http3_udp_datagram_size_received
- http3_udp_datagram_num_segments_received

Given the performance impact tracking Glean metrics in the UDP hot path, see
https://phabricator.services.mozilla.com/D216034#7453056, this commit introduces
a sample buffer per metric.

This will enable us to measure the impact of:

- Implementation of Packetization Layer Path MTU Discovery for Datagram Transports (RFC 8899) [in Neqo](mozilla/neqo#1903)
- [Fast UDP for Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1901292)

Differential Revision: https://phabricator.services.mozilla.com/D216034
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified-and-comments-removed that referenced this pull request Sep 20, 2024
…iewers,kershaw,janerik

This commit adds four Glean probes:
- http3_udp_datagram_segment_size_sent
- http3_udp_datagram_segment_size_received
- http3_udp_datagram_size_received
- http3_udp_datagram_num_segments_received

Given the performance impact tracking Glean metrics in the UDP hot path, see
https://phabricator.services.mozilla.com/D216034#7453056, this commit introduces
a sample buffer per metric.

This will enable us to measure the impact of:

- Implementation of Packetization Layer Path MTU Discovery for Datagram Transports (RFC 8899) [in Neqo](mozilla/neqo#1903)
- [Fast UDP for Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1901292)

Differential Revision: https://phabricator.services.mozilla.com/D216034

UltraBlame original commit: 16d3f312970444dd8bc8af65629cbef7d7dbcd62
gecko-dev-updater pushed a commit to marco-c/gecko-dev-wordified that referenced this pull request Sep 20, 2024
…iewers,kershaw,janerik

This commit adds four Glean probes:
- http3_udp_datagram_segment_size_sent
- http3_udp_datagram_segment_size_received
- http3_udp_datagram_size_received
- http3_udp_datagram_num_segments_received

Given the performance impact tracking Glean metrics in the UDP hot path, see
https://phabricator.services.mozilla.com/D216034#7453056, this commit introduces
a sample buffer per metric.

This will enable us to measure the impact of:

- Implementation of Packetization Layer Path MTU Discovery for Datagram Transports (RFC 8899) [in Neqo](mozilla/neqo#1903)
- [Fast UDP for Firefox](https://bugzilla.mozilla.org/show_bug.cgi?id=1901292)

Differential Revision: https://phabricator.services.mozilla.com/D216034

UltraBlame original commit: 16d3f312970444dd8bc8af65629cbef7d7dbcd62
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PMTU discovery
3 participants