Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http/1.1 codec: Rejects requests w/ invalid HTTP header values #7306

Merged

Conversation

dcarney
Copy link
Contributor

@dcarney dcarney commented Jun 17, 2019

Description:
This PR addresses the behavior described in #7061 by modifying the HTTP/1.1 codec, such that it verifies the value of each HTTP header when onHeaderValue is called. The codec raises an exception, which results in a 400 response being returned if an invalid header value (per RFC 7230, section 3.2) is found. The actual validation is done via the nghttp2_check_header_value function, which is wrapped in Envoy::Http::HeaderUtility::headerIsValid.

(NOTE: this has been discussed as something that nghttp2 could do itself, but the issue seems to have languished. Also note that Go handles this: Go uses the httpguts.ValidHeaderFieldValue function (which is analogous to nghttp2_check_header_value) to ensure that all the HTTP header values conform to the relevant RFC specs before an http.Transport instance will round-trip the request.

Risk Level:
Low/medium

This stricter validation semantics are controlled with the envoy.reloadable_features.validate_header_values runtime-guarded feature. Originally, the PR used a new boolean on the HTTP connection manager proto to toggle the behavior, but during the course of PR review, it was decided that this would be more appropriate for a runtime guard.

Testing:
Unit tests were added to cover both the HeaderUtility changes as well as the HTTP codec changes. Manual testing against an Envoy build made from this PR's branch was performed using the repro steps decribed in #7061. During the manual testing, the "client Envoy" (or "envoy-one") returned a 400 as expected, rather than exhibiting the behavior described in the issue:

# run a newly-compiled envoy with an HTTP/1.1 listener 
# (on port :10001) and an H/2 upstream cluster

$ ./envoy-static -c config.yaml -l trace 2>&1                                               

# make the request with the invalid HTTP header(s):

$ cat invalid_request.txt | nc -n -i 1 127.0.0.1 10001      
HTTP/1.1 503 Service Unavailable
content-length: 85
content-type: text/plain
date: Thu, 11 Jul 2019 02:25:36 GMT
server: envoy

upstream connect error or disconnect/reset before headers. reset reason: remote reset%

# enable the new runtime feature to cause stricter header validation

$ curl --silent -X POST http://localhost:9901/runtime_modify\?envoy.reloadable_features.validate_header_values\=true
OK

# make the request with the invalid HTTP header(s) again:

$ cat invalid_request.txt | nc -n -i 1 127.0.0.1 10001 
HTTP/1.1 400 Bad Request
content-length: 0
connection: close

Success!

Docs Changes:

Release Notes: Updated
Fixes #7061

@dcarney dcarney force-pushed the dcarney/reject-invalid-http-headers branch 2 times, most recently from 682407d to 87d534c Compare June 17, 2019 23:01
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this seems reasonable to me. CI seems broken, mind taking a look?

[](const Http::HeaderEntry& header, void* context) -> Http::HeaderMap::Iterate {
const absl::string_view header_value = header.value().getStringView();
if (nghttp2_check_header_value(
reinterpret_cast<const uint8_t*>(std::string(header_value).data()),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you call .data on the string_view directly to avoid a temporary std::string?

// header, returning a 400 response if any of the headers contain an invalid
// value. See `Header Fields `<https://tools.ietf.org/html/rfc7230#section-3.2>`_
// for details.
google.protobuf.BoolValue validate_headers = 33;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normally we use the POD boolean type when the default is false

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, sounds good to me. I think I had been looking at some other fields on the conn manager like add_user_agent, which uses the proto wrapper type, but defaults to false.

@snowp snowp self-assigned this Jun 18, 2019
@snowp snowp added the waiting label Jun 18, 2019
@@ -598,6 +599,17 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers,
// called with end_stream=true.
maybeEndDecode(end_stream);

if (connection_manager_.config_.validateHeaders()) {
ENVOY_STREAM_LOG(trace, "validating header values\n{}", *this, *request_headers_);
if (!Http::HeaderUtility::validateHeaders(*request_headers_)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we bump a stat here? might be useful

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@dcarney
Copy link
Contributor Author

dcarney commented Jun 18, 2019

@snowp Thanks for the feedback! I addressed your comments, and CI is sorted out now.

@snowp snowp closed this Jun 19, 2019
@snowp snowp reopened this Jun 19, 2019
@snowp
Copy link
Contributor

snowp commented Jun 19, 2019

LGTM, though a stat would be nice. @alyssawilk do you wanna take a look?

@alyssawilk alyssawilk self-assigned this Jun 19, 2019
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great - I think it makes Envoy much safer to do consistent header validation. I've got a couple of thoughts for today, and I'll do a closer pass at tests tomorrow.

// header, returning a 400 response if any of the headers contain an invalid
// value. See `Header Fields <https://tools.ietf.org/html/rfc7230#section-3.2>`
// for more details regarding the validity of HTTP headers.
bool validate_headers = 33;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can sanity check with @mattklein123 , but if this is making Envoy more standards compliant I'd lean towards making it a permanent change (off by default, and then see if on by default works for people)

If so I think instead of making changes to the Envoy API, we should just runtime guard per
https://github.com/envoyproxy/envoy/blob/master/CONTRIBUTING.md#runtime-guarding

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

runtime guard for this SGTM 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay. Just so I understand correctly, we'd be making this new behavior:

  • a runtime-guarded option only, and removing the option it from the connection manager proto, as I have it currently.
  • off by default

I'm basing that on the comments in the original issue about making it opt-in.

I was a little confused by this verbiage from the runtime guarding docs:

Generally all runtime guarded features will be set true when a release is cut)

@@ -30,6 +30,7 @@ Version history
* http: mitigated a race condition with the :ref:`delayed_close_timeout<envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.delayed_close_timeout>` where it could trigger while actively flushing a pending write buffer for a downstream connection.
* http: added support for :ref:`preserve_external_request_id<envoy_api_field_config.filter.network.http_connection_manager.v2.HttpConnectionManager.preserve_external_request_id>` that represents whether the x-request-id should not be reset on edge entry inside mesh
* http: changed `sendLocalReply` to send percent-encoded `GrpcMessage`.
* http: added support for rejecting requests containing invalid HTTP header values.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if we leave this in, we should reference the config field. If we runtime guard let's mention the string we guard it with.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I think we now have two release notes - how about removing this one since I think the upper one is correct?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually this file looks odd in general - there seem to be a bunch of deltas which aren't yours. Bad merge?

@@ -128,6 +128,8 @@ struct ResponseCodeDetailValues {
// indicates that original "success" headers may have been sent downstream
// despite the subsequent failure.
const std::string LateUpstreamReset = "upstream_reset_after_response_started";
// The request was rejected due to invalid HTTP headers
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, the headers had invalid characters, yes? If so let's be as specific as possible here and below

@@ -598,6 +599,17 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers,
// called with end_stream=true.
maybeEndDecode(end_stream);

if (connection_manager_.config_.validateHeaders()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do just want to call out this solves your problem because your client is using HTTP/1. If they use HTTP/2 with invalid characters, nghttp2 would reject the request before we got to the HCM decodeHeaders call.
I still think it's a good change, I just want to make sure the limitations are on your radar :-)

@@ -598,6 +599,17 @@ void ConnectionManagerImpl::ActiveStream::decodeHeaders(HeaderMapPtr&& headers,
// called with end_stream=true.
maybeEndDecode(end_stream);

if (connection_manager_.config_.validateHeaders()) {
ENVOY_STREAM_LOG(trace, "validating header values\n{}", *this, *request_headers_);
if (!Http::HeaderUtility::validateHeaders(*request_headers_)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a perf perspective, this ends up being a full redundant header scan for HTTP/2 as nghttp2 has already validated headers at this point. It might be worth moving this check in the HTTP/1 codec (maybe even doing onHeaderValue? rather than once at the end) and we could ASSERT that all characters are valid so that as we add more transports (QUIC etc.) we have consistent behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Contributor Author

@dcarney dcarney Jun 25, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alyssawilk If I move this to the http/1.1 codec, it seems to make sense to handle it similarly to this code, which is doing similar albeit more limited, header value checking (it even references the same RFC section!). WDYT?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think with the runtime guard false, it does the old thing, and with the runtime guard true, it does your new thing.
runtime starts out false, we flip it with the release, and then remove old code the following release.
If there's any docs clean up which would make that more clear in future feel free to include it in this PR (with my thanks)

@dcarney
Copy link
Contributor Author

dcarney commented Jun 27, 2019

Update on this PR: I talked with @alyssawilk on Slack about some of the integration test failures I'd been seeing with the runtime guarding (thanks a ton Alyssa!).

The TL;DR is that the runtime guarding has issues in certain integration tests due to its reliance on thread-local storage (which isn't available in all integration test threads, depending on the objects which are trying to use the runtime impl).

We came up with a workaround approach which will allow this PR to make forward progress, and the runtime/thread-local stuff (and some docs improvements) can be sorted out in another PR. I've got the implementation working correctly using the runtime guarding and incorporating the other PR feedback. Now I just need to thread things through correctly in the integration tests. 👍

@stale
Copy link

stale bot commented Jul 4, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added stale stalebot believes this issue/PR has not been touched recently and removed stale stalebot believes this issue/PR has not been touched recently labels Jul 4, 2019
Copy link
Contributor

@alyssawilk alyssawilk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking great! A few more thoughts on little ways we can clean this up before merging

@@ -136,11 +136,12 @@ void CodecClient::onData(Buffer::Instance& data) {

CodecClientProd::CodecClientProd(Type type, Network::ClientConnectionPtr&& connection,
Upstream::HostDescriptionConstSharedPtr host,
Event::Dispatcher& dispatcher)
Event::Dispatcher& dispatcher, bool validate_header_values)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe strict_header_validation?
since we were validating a bit before, just not much

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, or maybe I'm pinging when you're not ready for me to take another pass. Let me know when I should take a look? :-)

Copy link
Contributor Author

@dcarney dcarney Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I think strict_header_validation is a slightly more accurate description of what's going on.

validate_header_values is consistent with the runtime feature name, and all the other parameters/contructor args however. @alyssawilk Do you think I should find/replace them all, so we standardize on strict_header_validation?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That'd be great, if you're up for it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did a PR-wide s/validate_header_values/strict_header_validation/

throw CodecProtocolException("http/1.1 protocol error: header value contains NUL");
}

if (validate_headers_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be an if-else with the above clause, since I think headerIsValid should catch the '\0'
Can you check? It'd mean that when we remove the old code path we can remove the extra header scan without worrying about a change in functionality

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, the HeaderUtility::headerIsValid function will catch the \0 char, along with any other ASCII char with decimal value [0-31] (with the exception of decimal value 9, which is allowed per the RFC).

@@ -664,8 +688,13 @@ void ServerConnectionImpl::onBelowLowWatermark() {
}
}

ClientConnectionImpl::ClientConnectionImpl(Network::Connection& connection, ConnectionCallbacks&)
: ConnectionImpl(connection, HTTP_RESPONSE, MAX_RESPONSE_HEADERS_KB) {}
ClientConnectionImpl::ClientConnectionImpl(Network::Connection& connection,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to not just remove the old constructor and default validate_header_values to false?

Though now that I think about it, I think we'd be better off only having the new constructor and not having a default value. If we force all callers to pick a boolean we won't miss any code sites using the new code path.

It will probably mean tweaking a bunch of unit tests (you can set them all true) but hopefully you can search and replace there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason I hadn't done that was the pervasiveness of the change: it involves a bunch of find/replace that really blew up the diff for the PR. (Due to the fact that it involves changing ServerConnectionImpl, ClientConnectionImpl, CodecClient/CodecClientProd, ConnectionManagerConfig, and associated mocks, unit tests, and call sites.

If you think that much code churn is warranted though, I can certainly do that, and make all the callers default to false (and default the unit test callers to true). I know originally we had talked about an overloaded constructor like the one included here, which didn't lead to nearly as much churn.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's actually a fair point about code churn given we'd want to go back to the single argument once we flip this to true and deprecate.

Do you feel confident from grepping around that you have all the call sites under source/...? If not, how about a hacked local version with a RELEASE_ASSERT in the old constructors to make sure we didn't miss any, and we'll call it a day?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, I added RELEASE_ASSERT(false, ""); to the {Client,Server}ConnectionImpl constructors and ran Envoy to perform the steps outlined in the PR description (in the "testing" section). Worked great.

@@ -292,6 +296,10 @@ class ServerConnectionImpl : public ServerConnection, public ConnectionImpl {
ServerConnectionImpl(Network::Connection& connection, ServerConnectionCallbacks& callbacks,
Http1Settings settings, uint32_t max_request_headers_kb);

ServerConnectionImpl(Network::Connection& connection, ServerConnectionCallbacks& callbacks,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto here, regarding the new constructor

@@ -336,7 +336,7 @@ Http::CodecClient::Type HttpHealthCheckerImpl::codecClientType(bool use_http2) {
Http::CodecClient*
ProdHttpHealthCheckerImpl::createCodecClient(Upstream::Host::CreateConnectionData& data) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be inclined to snag the runtime value here too, so when we've flipped the default we can remove the old code path and always do full sanitization

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you did this in the wrong block, for the gRPC health checker (always H2) rather than than here where it can be 1/1

Other than that, LGTM.
@snowp any other thoughts on your end?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alyssawilk Good catch! Just fixed that.

@dcarney dcarney force-pushed the dcarney/reject-invalid-http-headers branch from 6f39c4f to bed11d6 Compare July 9, 2019 23:55
@dcarney dcarney changed the title http conn manager: Rejects requests w/ invalid HTTP header values http/1.1 codec: Rejects requests w/ invalid HTTP header values Jul 10, 2019
@dcarney dcarney force-pushed the dcarney/reject-invalid-http-headers branch from 10bfdae to 0037ece Compare July 10, 2019 20:13
@alyssawilk
Copy link
Contributor

Oh also, can you do a master merge, and move your release note to 1.12.0? Thanks!

dcarney added 3 commits July 11, 2019 15:35
…alues

Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
dcarney added 15 commits July 11, 2019 15:35
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
…feedback

Signed-off-by: Dylan Carney <dcarney@gmail.com>
…s on individual values

Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
…ring used.

  - Also: fixes grammar in comments

Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
… arg, per alyssawilk

Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
@dcarney
Copy link
Contributor Author

dcarney commented Jul 11, 2019

@alyssawilk Just rebased against the latest master and moved the release note to 1.12.0. Looks like I'm just missed that 1.11.0 train!

Signed-off-by: Dylan Carney <dcarney@gmail.com>
@dcarney dcarney force-pushed the dcarney/reject-invalid-http-headers branch from 0037ece to bc6205e Compare July 11, 2019 22:47
Signed-off-by: Dylan Carney <dcarney@gmail.com>
@dcarney dcarney force-pushed the dcarney/reject-invalid-http-headers branch from bc6205e to c0126ea Compare July 11, 2019 22:49
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Signed-off-by: Dylan Carney <dcarney@gmail.com>
Copy link
Contributor

@snowp snowp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No comments from me, this looks great!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Malformed HTTP headers cause Envoy to 503 when proxying via HTTP/2
6 participants