
Return 429 response on HTTP max connection limit #13621

Merged · 5 commits · Jul 20, 2022

Conversation

@wjordan (Contributor) commented Jul 6, 2022

The `limits.http_max_conns_per_client` configuration option sets a limit on the max number of concurrent TCP connections per client IP address:

> - `http_max_conns_per_client` `(int: 100)` - Configures a limit of how many
>   concurrent TCP connections a single client IP address is allowed to open to
>   the agent's HTTP server. This affects the HTTP servers in both client and
>   server agents. Default value is `100`. `0` disables HTTP connection limits.

The current implementation silently closes the TCP connection, resulting in opaque 'unexpected EOF' or 'connection reset' errors that many users fail to diagnose correctly; this is likely the root cause of several reported Nomad issues.

Instead of silently closing the connection, this PR returns a 429 Too Many Requests HTTP response with a helpful error message to aid debugging when the connection limit is unintentionally reached. This should help diagnose/fix some instances of #12273 and #8718, where the fix is simply to increase the connection limit from the default of 100.

The default of 100 is easily hit when doing anything with blocking queries, since each blocking query requires a separate HTTP client connection. One example from my use case: the Nomad Autoscaler monitors each job status using a blocking query, which means that any autoscaler agent monitoring more than 100 jobs will exceed the default limit in normal operation, and the limit will need to be increased.
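To make that concrete, here is a trimmed-down sketch of the watch-per-job pattern (illustrative only, not the Autoscaler's actual code; it uses the `github.com/hashicorp/nomad/api` client, and the job IDs and counts are made up). Each in-flight blocking query holds one HTTP connection to the agent for the full wait, so watching more jobs than the limit from a single client IP will start tripping it:

```go
package main

import (
	"fmt"
	"log"
	"time"

	"github.com/hashicorp/nomad/api"
)

// watchJob long-polls a single job. While the blocking query is in flight it
// keeps one TCP connection to the agent open, so running more than 100 of
// these watchers against one agent trips the default http_max_conns_per_client.
func watchJob(client *api.Client, jobID string) {
	var index uint64
	for {
		job, meta, err := client.Jobs().Info(jobID, &api.QueryOptions{
			WaitIndex: index, // block until something changes past this index
			WaitTime:  5 * time.Minute,
		})
		if err != nil {
			log.Printf("watch %s: %v", jobID, err)
			time.Sleep(time.Second)
			continue
		}
		index = meta.LastIndex
		log.Printf("job %s is %s", jobID, *job.Status)
	}
}

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	for i := 0; i < 150; i++ { // more than the default limit of 100
		go watchJob(client, fmt.Sprintf("job-%d", i))
	}
	select {} // keep the watchers running
}
```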

One tradeoff in this PR is that the connection is kept open slightly longer while the agent writes out the 429 response. The HTTPConnStateFuncWithDefault429Handler function accepts a write-deadline argument, which I initially set to 0 (no deadline) and have since updated to 10 * time.Millisecond to match existing behavior in Consul. This could alternatively be made configurable through an additional config option, if desired.
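For reference, the wiring looks roughly like this. It is a sketch written from my recollection of the `hashicorp/go-connlimit` API, not the exact diff in this PR; the `Config` field name, address, and mux below are assumptions:

```go
package main

import (
	"net/http"
	"time"

	connlimit "github.com/hashicorp/go-connlimit"
)

func main() {
	// Corresponds to the agent's limits { http_max_conns_per_client = 100 }
	// setting: at most 100 concurrent TCP connections per client IP.
	limiter := connlimit.NewLimiter(connlimit.Config{
		MaxConnsPerClientIP: 100,
	})

	srv := &http.Server{
		Addr:    ":4646",
		Handler: http.DefaultServeMux,
		// Instead of silently closing an over-limit connection, write a
		// 429 Too Many Requests response, bounded by a 10ms write deadline
		// so a client that never reads cannot hold things up.
		ConnState: limiter.HTTPConnStateFuncWithDefault429Handler(10 * time.Millisecond),
	}
	_ = srv.ListenAndServe()
}
```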

Instead of silently closing the connection, return a `429 Too Many Requests`
HTTP response with a helpful error message to aid debugging when the
connection limit is unintentionally reached.
@wjordan (Contributor, Author) commented Jul 6, 2022

I found that Consul returns a 429 response when the max-connection limit is reached (hashicorp/consul#8221) and uses 10 * time.Millisecond for its write timeout, so I added a commit with a matching 10ms timeout for consistency.

@tgross (Member) left a comment

Hi @wjordan! This PR effectively reverts a change made by my colleague @shoenig in f0dff3f; we weren't happy with how the default 429 handler interacted with our handshake.

I'm working on rate limiting in #13480, which will send a 429 code as you'd expect. I can make sure we introduce that for connection tracking as well, where we can.

I'm going to close this PR, but thanks so much for contributing!

@tgross closed this on Jul 8, 2022
@wjordan (Contributor, Author) commented Jul 8, 2022

Thanks for pointing me to the previous discussion. The concern in #9608 (review) seems to be that writing 429 responses to the net.Conn opens a DoS vector, since the write calls could block and prevent the main thread from handling connections from other addresses.

It seems the DoS concern is mostly theoretical (I was unable to reproduce any DoS in my own local testing), and I think it's based on some unclear assumptions about what might cause a conn.Write call to block: each TCP socket has its own send buffer, which should be empty for any StateNew connection, so blocking writes should not normally happen. In any case, either reducing the timeout to 1 * time.Microsecond and/or wrapping the write calls in their own goroutines could be a reasonable mitigation if this is a remaining concern. (I don't see how #13480 is related, as it seems focused on rate-limiting API calls rather than TCP connection rates, at least in its current form.)
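To make that mitigation concrete, something along these lines (purely illustrative, not code from this PR or from go-connlimit) keeps the 429 write off the accept path entirely and bounds it with a short deadline:

```go
package connreject

import (
	"net"
	"time"
)

// reject429 closes an over-limit connection after a best-effort attempt to
// write a 429 response, without ever blocking the caller's goroutine.
func reject429(conn net.Conn) {
	go func() {
		defer conn.Close()
		// Even if the client's receive window is full, give up after 10ms
		// rather than blocking indefinitely.
		_ = conn.SetWriteDeadline(time.Now().Add(10 * time.Millisecond))
		_, _ = conn.Write([]byte(
			"HTTP/1.1 429 Too Many Requests\r\n" +
				"Connection: close\r\n" +
				"Content-Length: 0\r\n" +
				"\r\n"))
	}()
}
```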

Beyond this, rejecting this PR in Nomad raises a question about Consul's current behavior: if this behavior was considered a DoS issue in Nomad, does that mean it's currently a DoS issue for Consul? At the very least, I'd like to see the various HashiCorp products agree on some consistent behavior across them, whatever it may be.

@tgross (Member) commented Jul 11, 2022

Hey @wjordan, I suspect you're right that blocking sends aren't an issue, but I'm also not sure that's actually the problem.

If we simply drop the connection, we don't need to do the TLS handshake (but we also can't send the 429). But if we accept the connection and then immediately close it after sending the 429, we've already done an awful lot of work for the TLS handshake, and that's the primary thing the connection rate limiting is intended to protect against. Once you've successfully connected, it doesn't really matter how many connections you have; you can DoS the cluster by sending it a ton of work (which is what #13480 is for).

I'm going to ping my colleagues @shoenig and @schmichael here, as they both worked on the original implementation and the PR this PR reverts. They might have more thoughts.

> Beyond this, rejecting this PR in Nomad raises a question about Consul's current behavior: if this behavior was considered a DoS issue in Nomad, does that mean it's currently a DoS issue for Consul? At the very least, I'd like to see the various HashiCorp products agree on some consistent behavior across them, whatever it may be.

Agreed.

The rate-limit prevents the TLS handshake necessary to write the HTTP response
from consuming too many server resources.
Add "Too many concurrent connections" log warning when
connection limit is reached.
Count of HTTP connections exceeding concurrency limit.
@wjordan (Contributor, Author) commented Jul 15, 2022

Please see wjordan/nomad#http_max_conn_limit_429 for followup commits (3372c07, 9768e42) that should address the concerns raised, by rate-limiting the 429 responses and adding a (not rate-limited) counter for tracking the total number of limit-exceeding connection attempts. Please let me know if this is headed in a helpful direction and I will open a new PR.
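For illustration, the shape of those followups is roughly the following. This is only a sketch: the metric key, limiter settings, log line, and `handleOverLimit` helper are placeholders rather than the exact code on that branch:

```go
package example

import (
	"log"
	"net"
	"time"

	metrics "github.com/armon/go-metrics"
	"golang.org/x/time/rate"
)

// Budget for the "courtesy" 429 responses and their log warnings: 10 per
// second with a burst of 100. Connections over that budget are still closed,
// just silently, matching the old behavior.
var tooManyConnsLimiter = rate.NewLimiter(10, 100)

// handleOverLimit stands in for what runs when a client IP exceeds
// http_max_conns_per_client.
func handleOverLimit(conn net.Conn) {
	// Not rate-limited: count every limit-exceeding connection attempt.
	metrics.IncrCounter([]string{"agent", "http", "exceeded"}, 1)

	defer conn.Close()
	// The branch itself calls Reserve(); Allow() is the simpler
	// non-blocking check used here for the sketch.
	if !tooManyConnsLimiter.Allow() {
		return
	}
	log.Printf("[WARN] Too many concurrent connections from %s", conn.RemoteAddr())
	_ = conn.SetWriteDeadline(time.Now().Add(10 * time.Millisecond))
	_, _ = conn.Write([]byte(
		"HTTP/1.1 429 Too Many Requests\r\nConnection: close\r\nContent-Length: 0\r\n\r\n"))
}
```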

@schmichael (Member) commented

This PR

tl;dr -> 👍 :shipit:

Hi @wjordan! Thanks for sticking with this! Security-sensitive issues are always tricky, and obviously we've gone back and forth on it ourselves.

In the absence of unique factors, Nomad follows Consul's lead on such matters, so I think we can accept this PR and match their behavior. It increases risk somewhat, but the current behavior obviously causes significant pain for legitimate users and is very difficult to debug. I'm willing to take the risk for the sake of the common case.

Followups

As far as the followup commits are concerned, I think the metric would be great. 👍

I like your approach to logging as it limits the rate to 10/s. AFAICT from reading https://pkg.go.dev/golang.org/x/time/rate#NewLimiter the 100 burst limit will not take effect because we only call Reserve() and never the ...N() variant.

I think we can just set burst=1 and leave the rest be. I suspect bursts would just be the same remote IP repeated, so it seems like we can have a fairly low rate limit on this log line. Perhaps even lower it to 1/s? I'm fine with 1 or 10.

@angrycub added the type/enhancement, stage/accepted, and theme/api labels on Jul 18, 2022
@angrycub self-assigned this on Jul 18, 2022
@tgross self-requested a review on Jul 20, 2022 at 15:52
@tgross (Member) left a comment

Thanks @schmichael!

@wjordan I've kicked CI to re-run the flaky tests, but it looked like lint-go was still failing on goimports -d. I've run that and added a changelog entry. Once CI is green again we'll get this merged. Thanks for the PR and for being patient with us while we figured this out! 😀

@tgross merged commit 662a12a into hashicorp:main on Jul 20, 2022
@tgross added this to the 1.3.x milestone on Jul 20, 2022
@tgross (Member) commented Jul 20, 2022

This has been merged and will ship in the next version of Nomad!

@wjordan (Contributor, Author) commented Jul 20, 2022

🎉 Thanks for seeing this through! I understand the extra scrutiny required for security-sensitive issues and appreciate the extra attention.

Note that I had pushed the followup commits (adding rate-limiting and the logging/metric) to this PR, so they were included in the merge commit. It sounded like @schmichael had requested some slight changes (or at least some further discussion) on the followups, so I just wanted to make sure the full merge was intended.


> I like your approach to logging as it limits the rate to 10/s. AFAICT from reading https://pkg.go.dev/golang.org/x/time/rate#NewLimiter the 100 burst limit will not take effect because we only call Reserve() and never the ...N() variant.

Reserve is just shorthand for ReserveN(time.Now(), 1), so the burst limit should apply correctly. (It seemed fine in my local ad-hoc testing.)
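A quick way to convince yourself (made-up rate and burst values here, not the actual settings in the PR):

```go
package main

import (
	"fmt"

	"golang.org/x/time/rate"
)

func main() {
	// 10 events/sec with a burst of 100 (illustrative numbers only).
	lim := rate.NewLimiter(10, 100)
	immediate := 0
	for i := 0; i < 150; i++ {
		// Reserve() is shorthand for ReserveN(time.Now(), 1), so the burst
		// size still applies even though we never call the ...N variants.
		r := lim.Reserve()
		if r.Delay() == 0 {
			immediate++
		} else {
			r.Cancel() // give the unused token back
		}
	}
	// Prints roughly "immediate reservations: 100 of 150".
	fmt.Println("immediate reservations:", immediate, "of 150")
}
```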

> I think we can just set burst=1 and leave the rest be. I suspect bursts would just be the same remote IP repeated, so it seems like we can have a fairly low rate limit on this log line. Perhaps even lower it to 1/s? I'm fine with 1 or 10.

That would make sense for the log entries, but too low a rate limit could make 429 responses inconsistent if multiple processes are trying to establish connections. Maybe adding a separate 1/s rate limit just for log entries would make sense, even if it adds a bit of extra complexity?

@tgross (Member) commented Jul 21, 2022

> so I just wanted to make sure the full merge was intended.

Oh, sorry, I should have addressed those specifically and forgot to:

  • I noted the same thing about Reserve being a shorthand
  • Having an extra rate limit to track for log entries seems like a nice-to-have but I think it's going to make the implementation fussy, especially if we have to track the rate limit per IP to account for multiple IPs connecting, as you've noted. (I kind of wish hashicorp/conn-limit used sethvargo/go-limiter under the hood for this reason.)

Arguably, having the log line there lets someone spam the logs, but in typical production environments where journald is running, it will drop excessive logging anyway. We can always come back and add a rate limit here if we feel more strongly about it later.

@github-actions (bot) commented

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions bot locked as resolved and limited conversation to collaborators on Dec 24, 2022
Labels: backport/1.3.x, stage/accepted, theme/api, type/enhancement