Malformed HTTP headers cause Envoy to 503 when proxying via HTTP/2 #7061
Comments
Maybe to give a little more context here, it looks like:

It seems weird that Envoy can create packets that it itself can't terminate, so a simple solution might be calling nghttp2_check_header_value on incoming headers. Would something like that make sense?
@bobby-stripe Thanks for the additional context, and for digging that up.

As some "supporting evidence" for the type of fix that you proposed, we can look at how Go handles this: Go uses the httpguts.ValidHeaderFieldValue function (which seems analogous to nghttp2_check_header_value) to ensure that all HTTP header values conform to the relevant RFC specs before an http.Transport instance will round-trip the request. As a result, these malformed header values haven't caused the same sorts of issues in the Go parts of our stack.
Thanks for all the detailed context and report.
The reason for this is that when nghttp2 detects the violation, it will reset the stream. Right now downstream Envoy will always end up translating that reset into a 503. It's possible that we could look at the reset type and translate to a 400, but I'm not sure it's worth it. Overall, I agree that we should unify the H1 and H2 validation flows, so am moving this to help wanted.
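A minimal sketch of that idea, using hypothetical names (not Envoy's actual reset-reason types): inspect why the upstream stream was reset and only keep the 503 for genuine upstream failures.

```cpp
#include <cstdint>

// Hypothetical reset reasons; Envoy's real enum and values may differ.
enum class UpstreamResetReason { ProtocolError, ConnectionFailure, Overflow, Other };

// Map an upstream stream reset to the status code sent downstream: a reset
// caused by a malformed request maps to 400, everything else stays a 503.
uint32_t downstreamStatusForReset(UpstreamResetReason reason) {
  switch (reason) {
  case UpstreamResetReason::ProtocolError:
    return 400; // the client's request was invalid
  default:
    return 503; // upstream unavailable or transient failure
  }
}
```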
To diagnose problems like these, is there any way to figure out exactly which header is malformed? The only log we get is "invalid frame: Invalid HTTP header field was received on stream".
This failure comes from within nghttp2 and IIRC there is no easy way to get the failure details, other than turning on extra trace logging. It might be worth looking at the code though.
@skydoctor In this case, the problem isn't with a particular HTTP header, it's with any HTTP header whose value contains characters that are invalid per the relevant RFC. In my repro for this issue, you can see that I manually create a header with such contents, in request.txt.

As mentioned previously, Go's httpguts.ValidHeaderFieldValue rejects these values before a request is ever sent upstream. I'm actually working on a PR that will use nghttp2's equivalent check to validate headers in Envoy.
Thanks @dcarney, it would be great to validate the header for sure. In general, if a request is received from a client we don't control, it becomes harder to find which specific header had illegal characters. @mattklein123: There seems to be a way to get nghttp2 to print out the name of the offending header in its error output.
@mattklein123, @alyssawilk: Do you see a way to incorporate the more detailed log from nghttp2?
Unfortunately I'm not sure there's a way to get nghttp2 to print that error, other than turning up all nghttp2 logs, which gets pretty spammy. It looks like the detailed string gets passed up to session_call_error_callback, but the Envoy codec callback only gets NGHTTP2_ERR_HTTP_HEADER.

Would it have helped if the upstream Envoy had included extra debug info where it logs the "invalid frame" error?

We could also ask @tatsuhiro-t (nghttp2 author) if he'd be up for either passing up the details string, or tracking stream state of the individual error in a way we could call back into the nghttp2 call stack to get the details.
@mattklein123 @alyssawilk I actually have a working PR that I was going to submit that fixes this issue following the rough approach that @bobby-stripe suggested a few weeks ago:
My PR makes the HTTP connection manager validate that the headers are RFC-compliant (delegating to nghttp2_check_header_value via a small helper in Envoy::Http::HeaderUtility).

Let me know if this seems like a reasonable approach. I was planning to submit my PR imminently, after writing up a detailed PR description, checking the test coverage, etc.
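A rough sketch of what such a helper could look like, assuming nghttp2's nghttp2_check_header_value is available (illustrative names, not the actual PR code):

```cpp
#include <cstdint>
#include <string>

#include <nghttp2/nghttp2.h> // provides nghttp2_check_header_value()

// Returns true if every byte of `value` is legal in an HTTP header field
// value per RFC 7230; nghttp2 implements the actual character check.
bool headerValueIsValid(const std::string& value) {
  return nghttp2_check_header_value(
             reinterpret_cast<const uint8_t*>(value.data()), value.size()) != 0;
}
```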
@dcarney thanks. I think this fix is reasonable, but I would be concerned about changing end-user behavior, which makes me think we should probably do opt-in for this, but I could be convinced otherwise. WDYT @alyssawilk?
Definitely opt-in please. I'm less sure whether it needs to be a permanent config option or whether we can do it with runtime flags. We could always runtime-guard it with the plan of deprecating the old code path, and if anyone complains, replace the runtime guard with a permanent config option?

I'm also curious whether, orthogonally, we should tweak our upgrade detection. Right now on the H2 path we only remove the upgrade header iff both the upgrade header is present and there's a connection "upgrade" header, but arguably we could be more lenient in that check if there are legit non-standard-compliant upgrade requests.
Thanks @alyssawilk. The place I see this error is an edge gateway, where the incorrect header is received from an outside client. Hence the struggle is to convey to the client which exact header of theirs was at fault. We turned up the logging level to trace but still could only get the following. Are you implying that with the logging level set to "trace", we should have seen the name of the header at fault?
nghttp2 has https://nghttp2.org/documentation/nghttp2_session_callbacks_set_error_callback2.html#c.nghttp2_session_callbacks_set_error_callback2, which provides a verbose error message.
Oh, that's fantastic, I'd missed that API, thanks.
Perhaps this could be used for printing the name of the header in the trace log printout as well?
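For reference, a minimal sketch of hooking up that callback so the verbose message reaches a log, assuming the nghttp2_error_callback2 signature from the documentation linked above (the fprintf call is only a stand-in for the codec's trace logger):

```cpp
#include <cstdio>

#include <nghttp2/nghttp2.h>

// Receives nghttp2's human-readable error string (which can name the
// offending header field) along with the library error code.
static int onVerboseError(nghttp2_session* /*session*/, int lib_error_code,
                          const char* msg, size_t len, void* /*user_data*/) {
  std::fprintf(stderr, "nghttp2 error %d: %.*s\n", lib_error_code,
               static_cast<int>(len), msg);
  return 0;
}

// Register the callback on the callbacks object before creating the session.
void installVerboseErrorLogging(nghttp2_session_callbacks* callbacks) {
  nghttp2_session_callbacks_set_error_callback2(callbacks, onVerboseError);
}
```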
Description: This PR addresses the behavior described in #7061 by modifying the HTTP/1.1 codec so that it verifies the value of each HTTP header when onHeaderValue is called. The codec raises an exception, which results in a 400 response being returned if an invalid header value (per RFC 7230, section 3.2) is found. The actual validation is done via the nghttp2_check_header_value function, which is wrapped in Envoy::Http::HeaderUtility::headerIsValid. (NOTE: this has been discussed as something that nghttp2 could do itself, but the issue seems to have languished.) Also note that Go handles this: Go uses the httpguts.ValidHeaderFieldValue function (which is analogous to nghttp2_check_header_value) to ensure that all HTTP header values conform to the relevant RFC specs before an http.Transport instance will round-trip the request.

Risk Level: Low/medium. These stricter validation semantics are controlled by the envoy.reloadable_features.validate_header_values runtime-guarded feature. Originally, the PR used a new boolean on the HTTP connection manager proto to toggle the behavior, but during the course of PR review it was decided that a runtime guard would be more appropriate.

Testing: new unit tests, manual tests

Release Notes: updated

Fixes #7061

Signed-off-by: Dylan Carney <dcarney@gmail.com>
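A sketch of the control flow that description implies, not the merged Envoy code: the guard lookup and validity helper below are stand-ins for the runtime-feature check and HeaderUtility::headerIsValid it mentions (the validity helper could be backed by nghttp2_check_header_value as sketched earlier in the thread):

```cpp
#include <stdexcept>
#include <string>

// Stand-in for the runtime guard lookup; always-on here for illustration.
bool validateHeaderValuesEnabled() { return true; }

// Stand-in for HeaderUtility::headerIsValid (e.g. backed by nghttp2_check_header_value).
bool headerValueIsValid(const std::string& value);

// Called for each header value; an invalid value raises an exception that the
// connection manager turns into a locally generated 400 response.
void checkHeaderValue(const std::string& value) {
  if (!validateHeaderValuesEnabled()) {
    return; // previous, permissive behavior
  }
  if (!headerValueIsValid(value)) {
    throw std::runtime_error("header value contains invalid characters (RFC 7230 §3.2)");
  }
}
```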
Title:
When making client requests that contain invalid chars in an HTTP header to an Envoy instance ("envoy-one") that's configured to proxy over H/2 to an upstream Envoy ("envoy-two"), we see envoy-one return a 503 response with response flags "UR", without proxying the request to “envoy-two”. If we change the Envoy ←→ Envoy communication to use HTTP/1.1, we don’t see this behavior, and the request is happily proxied to the upstream Envoy.
Description:
While investigating some strange errors we were seeing in one of our applications, I wrote a quick test harness, so that I could make requests that contained invalid ASCII chars in HTTP headers.
The “invalid ASCII chars” are defined by the relevant RFC specs [0] [1]. In this specific test case, I inserted an ASCII control char (ASCII value 31, or 0x1F) into an HTTP header value and executed that request against our stack (Envoy → Envoy → application). A small standalone check illustrating the character rule follows the links below.

Relevant Links:
[0] https://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.2
[1] https://www.w3.org/Protocols/rfc2616/rfc2616-sec2.html#sec2.2
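For illustration (not Envoy code), the field-value rule from those specs boils down to: HTAB, SP, visible ASCII, and obs-text are allowed, while other control characters such as 0x1F are not.

```cpp
#include <string>

// A header field value may contain HTAB, SP, visible ASCII (0x21-0x7E), and
// obs-text (0x80-0xFF); other control characters are invalid.
bool isValidFieldValueChar(unsigned char c) {
  return c == '\t' || c == ' ' || (c >= 0x21 && c <= 0x7E) || c >= 0x80;
}

bool isValidFieldValue(const std::string& value) {
  for (unsigned char c : value) {
    if (!isValidFieldValueChar(c)) {
      return false; // e.g. the 0x1F control char used in this repro
    }
  }
  return true;
}
```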
I saw that this would generate a 503 from the first Envoy before the request ever reaches our application, since it sits upstream of the second Envoy.
Repro steps:
The repro config and necessary files are located in a GitHub repo: https://github.com/dcarney/envoy-repro
Reproing this scenario involves setting up the following topology:
I have the HTTP request saved in the request.txt file. It includes the HTTP header value with an invalid ASCII control character, along with other random characters.

The included Docker Compose file sets up the following:
Each Envoy has its listener port and admin port mapped to the host, and the two Envoys are able to reach each other over the Docker network “envoytest”.
Build the two Envoy containers, and start everything up with:
In another terminal, run the following, in order to make the HTTP request to the first Envoy (“envoy-one”):
You should see the 503 response, which is emitted by the first Envoy (“envoy-one”):
Envoy access logs for the “client Envoy” (envoy-one) show that the “response flags” for this request are “UR”, which according to the Envoy docs means:
We can change the connection between “envoy-one” and “envoy-two” to use HTTP/1.1 by removing lines 43-45 in envoy-one.yaml:

If we re-build the containers, restart the Docker Compose topology, and execute the request again, we find that we now get a 404 response from https://www.google.com/foo, which is what we expect when the request is successfully proxied through both Envoys.
I've included packet captures from tcpdump on both Envoys in the repo (envoy-{one,two}.pcap). Looking at them in Wireshark clearly shows the upstream Envoy resetting the stream because of a "protocol error".

Similarly, in the trace logs, we see this message from the "client Envoy":
And this message from the upstream Envoy:
I would expect that the client Envoy would return a 400 error, rather than constructing an H/2 request that fails to parse in the upstream Envoy.
Config:
envoy-one config
envoy-two config
Logs:
trace logs from both Envoys
I'm happy to provide any more details you might need. Thanks for looking!