Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generic Send Offload (GSO) Support #953

Closed
wants to merge 12 commits into from
Closed

Conversation

Matthias247
Copy link
Contributor

@Matthias247 Matthias247 commented Dec 29, 2020

Generic Send Offload (GSO) Support

This change implements an initial version of GSO support [1][2][3]
for Linux, which improves the effiency of sending data.

The approach taken in this change is to create a buffer which contains
multiple datagrams in Connection::poll_transmit. This was picked
over trying to merge packets in the endpoint task, since packets
seem to need to be padded to a common segment size in order to
make them sendable via GSO.

In order to to this in an efficient fashion, the poll_transmit
method was restructred. Instead of selecting spaces upfront, it
will now loop through all possible packet spaces, check if there is
pending data to send and create packets and datagrams out of this.

The last packet which was written to a datagram buffer is not
finalized until it is clear whether follow-up packets need to be
written, since we need to know whether this packet should get padded
to MTU length or not.

This change doesn't enable GSO yet, since it will only produce a
single datagram. I will create a separate change for this.

Performance measurements:

No GSO:

Sent 1073741824 bytes on 1 streams in 4.31s (237.66 MiB/s)

With GSO (up to 8 packets):

Sent 1073741824 bytes on 1 streams in 3.02s (339.18 MiB/s)

[1] http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-paper-DRAFT-1.pdf
[2] http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-presentation-20181104.pdf
[3] https://lwn.net/Articles/752956/

Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work so far! Made an initial high-level pass.

One missing consideration is pacing. In particular, we limit bursts to the initial congestion window. The current pacer tries to always send packets in such bursts to avoid thrashing, so this code is probably safe by coincidence, but to be robust against changes to the pacer to operate more smoothly and/or introduction of dynamic MTU detection, we should take care to ensure that a GSO batch does not exceed the pacing window.

GSO is currently only used for data spaces, due to a less understood impact on other packet spaces and packet types.

What concerns do you have about other spaces? I discussed anti-amplification concerns below; due to 0-RTT I actually don't think restricting to Data space is sufficient regardless.

quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn/src/platform/unix.rs Outdated Show resolved Hide resolved
@Matthias247
Copy link
Contributor Author

One missing consideration is pacing. In particular, we limit bursts to the initial congestion window. The current pacer tries to always send packets in such bursts to avoid thrashing, so this code is probably safe by coincidence, but to be robust against changes to the pacer to operate more smoothly and/or introduction of dynamic MTU detection, we should take care to ensure that a GSO batch does not exceed the pacing window.

Yes, I've seen the code for it - but I can't claim I fully understand the interactions with everything else.
I think however that this should be relatively safe, unless one sets an extreme GSO size. It would send a few packets more at once in a burst, but since more tokens are subtracted the next pause will also happen at the same time. I don't think trying to pact at a granularity of < 10 packets makes sense in userspace. At 250MB/s we are sending about 250k packets/s, which means the there is a latency of 4us for 1 packet (or 40us for 10 packets). The precision in user-space - especially under load - will however likely be more in the 1ms region. Figuring out how to integrate kernel pacing (with SO_TXTIME) and how this interacts with a more granular application timer might be interesting here.

@Matthias247
Copy link
Contributor Author

What concerns do you have about other spaces?

Another thing besides amplification was that other spaces might use different packet types, and the padding rules for all of them are not in my head at the moment. A brief glancy at the spec says that all packets besides short packets have explicit length information, and we can probably just pad all of them with zeros without any side effect. But I felt more comfortable with an initially reduced scope. Still captures the main source of multi-datagram transmissions - which is after the connection was established.

@Ralith
Copy link
Collaborator

Ralith commented Dec 30, 2020

On review, I think the current code actually handles pacing just fine: we invoke Pacer::delay for each packet (rather than the whole batch), and its state is updated by Connection::on_packet_sent, which we also call per-packet.

@Matthias247
Copy link
Contributor Author

On review, I think the current code actually handles pacing just fine: we invoke Pacer::delay for each packet (rather than the whole batch), and its state is updated by Connection::on_packet_sent, which we also call per-packet.

Yes. It should also place nice with the congestion controller due to this. All the interactions are within a loop.

I however wondered if it also raises the 0-length datagram problem: I don't think so -> A new datagram is started, but we will not write any byte to it. When returning it the total packet size will be an exact multiple of the segment size, so no additional empty datagram will be sent. However the num_datagrams calculation seems to be off by 1. Maybe this can all be made nicer.

@Ralith
Copy link
Collaborator

Ralith commented Dec 30, 2020

Would it be sufficient to only increment num_datagrams immediately after caling begin_packet? At that point we're committed to sending a datagram.

@Matthias247
Copy link
Contributor Author

Would it be sufficient to only increment num_datagrams immediately after caling begin_packet? At that point we're committed to sending a datagram.

Yes, this should work. Would however still be good to filter the empty packets.

@Ralith
Copy link
Collaborator

Ralith commented Dec 30, 2020

Right, but that's an existing issue, and not a pressing one given that false positives from the can-send detection stuff should be very rare under normal conditions.

@Matthias247
Copy link
Contributor Author

This is still WIP, and misses some more changes regarding concatenation and respecting pacing and congestion-control before creating new packets.
However the following this are now changed based on feedback, and I wanted to get an idea whether that looks right:

  • A common space_can_send method is used to determine whether there is more data to send
  • The method do determine whether a packet needs to be padded was moved inside the packet creation loop
  • tag_len is now stored in PacketBuilder, and reused on populate_packet.

Matthias247 added a commit to Matthias247/quinn that referenced this pull request Dec 31, 2020
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Dec 31, 2020
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
@Matthias247 Matthias247 mentioned this pull request Dec 31, 2020
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Dec 31, 2020
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 1, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 1, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 1, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 1, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 2, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 2, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 2, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 2, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Matthias247 added a commit to Matthias247/quinn that referenced this pull request Jan 3, 2021
This splits out the platform/socket parts for UDP GSO support
from quinn-rs#953 to reduce the amount of code to review.

This change mainly implements setting the segment size
socket option, and adds a runtime detection mechanism for
GSO support.
Copy link
Member

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some smaller comments. I think the last commit here definitely needs to be split up more, it seems like we can do a bunch of the refactoring in incremental steps?

quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
@Matthias247
Copy link
Contributor Author

I think the last commit here definitely needs to be split up more, it seems like we can do a bunch of the refactoring in incremental steps?

I agree that it could. But since most of the changes are in the poll_transmit function this would make it harder to address any review feedback (everything needs to be fixed in intermediate commits as well as the final one). So there is this tradeoff too.

Overall I think the intermediate step is that some fields had been moved in the PacketBuilder. But not all of those moves make sense individually, since the responsibility for adding padding had been redefined. And I don't want to figure out for an unnecessary intermediate step.

@djc
Copy link
Member

djc commented Feb 17, 2021

I don't feel confident in my ability to review the last commit in its current state, so I do think there need to be some intermediate steps. Perhaps it makes sense to start moving fields into PacketBuilder one by one?

Matthias247 and others added 12 commits February 17, 2021 16:45
With GSO, we need to ask whether we can send 2 additional datagrams.
Wit GSO activated, encoding cmsgs paniced since not enough
space was available. We need a minimum of 88 bytes.
`space_can_send` is now a separate function which returns whether there
is sendable data in a certain space.
With GSO we need to be able to check whether multiple datagrams
are sendable.
If any of the frames inside a packet requires padding, the whole
packet will require padding.
This allows easy access the information later on.
With GSO storing the buffer inside the builder does not work
due to lifetime issues - we need to hold the buffer across various packets.
This keeps the buffer outside of the builder and just references the
start of a packet via an offset.
This also removes the lifetime from `PacketBuilder`.
`finish_and_track_packet` will now take information from the builder
when finalizing the packet.
This adds a new stat for the number of transmit calls
and fixes the metrics for path challenges.
The option only needs to be sent for GSO (multiple datagrams
in a buffer).
This change defers padding packets as a runtime decision
while the packet is populated. Whether padding is required is stored
in a `min_datagram_size` variable which allows that only the last
packet in a datagram gets padded.
This change implements an initial version of GSO support [1][2][3]
for Linux, which improves the effiency of sending data.

The approach taken in this change is to create a buffer which contains
multiple datagrams in `Connection::poll_transmit`. This was picked
over trying to merge packets in the endpoint task, since packets
seem to need to be padded to a common segment size in order to
make them sendable via GSO.

In order to to this in an efficient fashion, the `poll_transmit`
method was restructred. Instead of selecting spaces upfront, it
will now loop through all possible packet spaces, check if there is
pending data to send and create packets and datagrams out of this.

The last packet which was written to a datagram buffer is not
finalized until it is clear whether follow-up packets need to be
written, since we need to know whether this packet should get padded
to MTU length or not.

This change doesn't enable GSO yet, since it will only produce a
single datagram. I will create a separate change for this.

Performance measurements:

**No GSO:**

```
Sent 1073741824 bytes on 1 streams in 4.31s (237.66 MiB/s)
```

**With GSO (up to 8 packets):**

```
Sent 1073741824 bytes on 1 streams in 3.02s (339.18 MiB/s)
```

[1] http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-paper-DRAFT-1.pdf
[2] http://vger.kernel.org/lpc_net2018_talks/willemdebruijn-lpc2018-udpgso-presentation-20181104.pdf
[3] https://lwn.net/Articles/752956/
@Matthias247
Copy link
Contributor Author

I split it now into a lot of commits. I think this should now be reviewable

I however would love if we would avoid being overly nitpicky on individual things, because all renames and tiny other changes will now have an impact on all follow-ups. I'm happy to apply further changes later on, but would like to avoid having to change 5 commits for the sake of a style discussion.

Copy link
Collaborator

@Ralith Ralith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for breaking this up! poll_transmit is a bit of a monolith, but between the detailed comments introduced and the independent changes being split out, this was a pleasure to review.

Overall this looks great. Most issues are cosmetic and non-blocking, though I think pacing of loss probes is required for correctness.

If it's not too difficult, I would like to see the "Indicate whether a packet requires padding in SentFrames" and "Use padding information from populate_packet" commits folded together, since the former doesn't stand on its own very well. Not the end of the world for me if that's just too painful a rebase, though.

quinn-proto/src/connection/mod.rs Outdated Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
quinn-proto/src/connection/mod.rs Show resolved Hide resolved
@Matthias247
Copy link
Contributor Author

Oops, this wasn't intended to be closed. Wrong command on git push.

@Matthias247
Copy link
Contributor Author

If it's not too difficult, I would like to see the "Indicate whether a packet requires padding in SentFrames" and "Use padding information from populate_packet" commits folded together, since the former doesn't stand on its own very well.

Seems like we had been lucky and at least moving the commit into one direction and merging worked without conflicts! Updated

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants