-
Notifications
You must be signed in to change notification settings - Fork 151
feat(s2n-quic-dc): throttle repeated successful handshakes #2845
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(s2n-quic-dc): throttle repeated successful handshakes #2845
Conversation
a8032ac to
7e10ff6
Compare
7e10ff6 to
30da67a
Compare
| rng.random_range(1..120) | ||
| rng.random_range(1000..120_000) | ||
| }; | ||
| tokio::time::sleep(Duration::from_secs(duration)).await; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what was the reason for this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the commit message:
As a drive-by improvement this also tweaks the delays to measure in
milliseconds, increasing the randomization.
Basically it's increasing the entropy in how much we randomize how long we'll sleep for. I doubt it matters much in practice though.
This mirrors prior work for *failing* handshakes by adding a throttle on outgoing, successful handshakes to a given peer. This is particularly relevant for workloads that happen to trigger replay detection relatively often, which would otherwise cause us to repeatedly handshake with the peer. It remains generally a good idea for us to switch to new secret material in case of e.g. spurious bitflips in memory or other issues that have broken the old entry in some way, so retaining the handshake-on-possible-replay makes sense, but doing so at a high rate increases CPU utilization for little benefit. As a drive-by improvement this also tweaks the delays to measure in milliseconds, increasing the randomization.
30da67a to
7ec75cf
Compare
Release Summary:
Resolved issues:
n/a
Description of changes:
This mirrors prior work for failing handshakes by adding a throttle on outgoing, successful handshakes to a given peer. This is particularly relevant for workloads that happen to trigger replay detection relatively often, which would otherwise cause us to repeatedly handshake with the peer.
It remains generally a good idea for us to switch to new secret material in case of e.g. spurious bitflips in memory or other issues that have broken the old entry in some way, so retaining the handshake-on-possible-replay makes sense, but doing so at a high rate increases CPU utilization for little benefit.
Call-outs:
n/a
Testing:
No particular testing added. This is exercise by most of our existing tests since the code is hit by anything that handshakes; I don't think we need additional, dedicated test coverage for it.
The stream tests are updated to disable this jitter as it breaks restart tests. As noted in comments, I think the tradeoff is reasonable, but open to other thoughts there (or other defaults instead of 1 minute).
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.