How are requests with differing `Host:` header field values mapped to outbound connections?
#415
Comments
Similarly, we need to decide how to deal with the possibility that requests claiming to be HTTP/1.0 and others claiming to be HTTP/1.1 might end up on the same outbound connection (to the proxied service, or to an external destination). It seems like we should be segregating these by protocol version. |
@seanmonstar @carllerche Do you have thoughts about this? Could you explain how things currently work and whether you think my suggestion above is good or bad and, if bad, why? Also, how much work would it be to change things, if we were to make these changes? Would we need to bump hyper, tokio, and/or tower? |
It looks like @seanmonstar would know more about the impl details as it looks like the original PR came from him. It sounds like you want any connection pool to be keyed on the host header and not the socket addr? As for the last question, I would highly doubt this would relate to Tokio and probably wouldn't touch Hyper either. It might relate to some code in Tower, but that is not on crates.io yet and we are tracking git, so if there are any required fixes, it should be easy to do. |
Basically, for HTTP/1 + TLS, we need to ensure we're not mixing (outbound) traffic for different hosts on the same (outbound) connection, and I propose we do the same for non-TLS HTTP/1 traffic. That's partially because I want to minimize differences between non-TLS and TLS traffic, and partially because I'm concerned that not all destinations will correctly support mixing different |
This seems fine to me. |
Currently the HTTP/1 proxy part looks for the The code that creates the hyper client does not check that the outgoing request host matches previous requests sent on the same connection. To make sure the connections are keyed not only on the socket address but the host as well, we could probably adjust the |
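To make the idea of keying connections on more than the socket address concrete, here is a minimal sketch. It does not use hyper's pool API; `PoolKey`, `Version`, `Conn`, and `Pool` are hypothetical names invented for illustration, and including the protocol version in the key is an assumption taken from the earlier suggestion to segregate HTTP/1.0 and HTTP/1.1 traffic.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// HTTP version claimed by the request, so that HTTP/1.0 and HTTP/1.1
/// traffic never share a connection (an assumption from this thread).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
enum Version {
    Http10,
    Http11,
}

/// Hypothetical pool key: socket address plus host plus version.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct PoolKey {
    addr: SocketAddr,
    host: String,
    version: Version,
}

impl PoolKey {
    fn new(addr: SocketAddr, host: &str, version: Version) -> Self {
        // Host names are case-insensitive, so normalize before hashing.
        PoolKey { addr, host: host.to_ascii_lowercase(), version }
    }
}

/// Stand-in for a real connection; it only carries an id so that
/// reuse is observable.
#[derive(Debug)]
struct Conn {
    id: usize,
}

#[derive(Default)]
struct Pool {
    conns: HashMap<PoolKey, Conn>,
    opened: usize,
}

impl Pool {
    /// Returns the id of the connection for `key`, opening one if needed.
    fn checkout(&mut self, key: PoolKey) -> usize {
        if let Some(conn) = self.conns.get(&key) {
            return conn.id;
        }
        self.opened += 1;
        let id = self.opened;
        self.conns.insert(key, Conn { id });
        id
    }
}
```

With a key like this, two requests to the same address share a connection only when both the host and the protocol version match.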
What seems like an appropriate type to use to represent the |
The |
Okay, so I’m beginning to think that the most correct thing to do is just to reuse the I was going to have an enum that’s either Therefore, I think it's best to use |
That seems fine, however there is a difference in how Even with a |
@seanmonstar Okay, that makes sense. Thanks for clearing that up for me, I think I understand what's necessary now. |
Here's my understanding of how to determine a
The question I've arrived at now is, what do we do if the request lacks both an authority in the URI and a |
I wonder if to be on the safe side, an HTTP/1.0 request with absolutely no authority should just always have its own connection. It makes me sad to waste resources like that, but perhaps it's expected? |
@seanmonstar that seems like a safe bet for now, at least. |
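The decision discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions: `ConnectionKey` and `connection_key` are hypothetical names, and the precedence (URI authority first, then the `Host` header) is an assumption based on how HTTP/1.1 treats absolute-form request targets, not a description of the proxy's code.

```rust
/// How the outbound connection for a request may be treated.
#[derive(Debug, PartialEq, Eq)]
enum ConnectionKey {
    /// Requests with the same authority may share a connection.
    Shared(String),
    /// No authority could be determined; use a dedicated connection.
    Dedicated,
}

/// Picks the authority from the URI if present, otherwise from the
/// `Host` header. With neither, the request gets its own connection.
fn connection_key(uri_authority: Option<&str>, host_header: Option<&str>) -> ConnectionKey {
    let authority = uri_authority
        .filter(|a| !a.is_empty())
        .or_else(|| host_header.map(str::trim).filter(|h| !h.is_empty()));
    match authority {
        // Host names are case-insensitive, so compare them lowercased.
        Some(a) => ConnectionKey::Shared(a.to_ascii_lowercase()),
        None => ConnectionKey::Dedicated,
    }
}
```

An HTTP/1.0 request with an origin-form target and no `Host` header falls into the `Dedicated` case, which matches the "safe bet" above at the cost of extra connections.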
The idea of normalizing to
Then we can have something like this:
Importantly, there's never a reason to have Also note that we have to decide what to do in the case that |
I also think we don't have to optimize the case where there is no |
In fact,
And then use |
@briansmith Thanks for all of the additional information. Do you think all of the changes you're describing here are in scope for a branch to address ensuring that each host has its own connection, or should some of these changes be done in additional PRs? |
It would be good to first do the refactoring such that It would be bad to have a PR that uses |
@briansmith okay, that sounds good to me. Should there be a separate issue to track the |
I don't care if there is a separate issue. A separate PR would be nice, though not strictly required. |
@briansmith I've opened PR #476 to make the changes you described here. |
A quick update: I think I've made the necessary changes here. Since this is hard to validate in a unit test, I'm talking to the release confidence team about end-to-end testing for this behaviour. |
I am pretty sure we can use (and/or extend) the proxy's own integration test framework to test this. I think the end-to-end tests should be reserved for Kubernetes- or environment- dependent behavior whenever possible. |
If wanting to validate in a unit test, I would edit the |
Okay, thanks @briansmith and @seanmonstar, I'll see what I can do. |
#476) As requested by @briansmith in #415 (comment) and #415 (comment), I've refactored `FullyQualifiedAuthority::normalize` to _always_ return a `FullyQualifiedAuthority`, along with a boolean value indicating whether or not the Destination service should be used for that authority. This is in contrast to returning an `Option<FullyQualifiedAuthority>` where `None` indicated that the Destination service should not be used, which is what this function did previously. This is required for further progress on #415. Signed-off-by: Eliza Weisman <eliza@buoyant.io>
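For readers unfamiliar with the refactoring, the change in return shape might look something like the sketch below. Everything here is hypothetical: the `NamedAddress` struct, the field names, and especially the rule that decides the flag (a `.svc.cluster.local` suffix check) are assumptions made for illustration and are not taken from the proxy's source.

```rust
/// Hypothetical stand-in for the proxy's `FullyQualifiedAuthority`.
#[derive(Debug, PartialEq, Eq)]
struct FullyQualifiedAuthority(String);

/// Normalization result: always an authority, plus a flag saying
/// whether the Destination service should be consulted for it.
#[derive(Debug, PartialEq, Eq)]
struct NamedAddress {
    name: FullyQualifiedAuthority,
    use_destination_service: bool,
}

/// Always returns an authority. Previously the equivalent function
/// returned `Option<FullyQualifiedAuthority>`, with `None` meaning
/// "do not use the Destination service".
fn normalize(authority: &str) -> NamedAddress {
    // Strip an optional `:port` suffix before inspecting the host part.
    let host = authority.split(':').next().unwrap_or("");
    // ASSUMPTION: an illustrative rule only, not the proxy's real logic.
    let use_destination_service = host.ends_with(".svc.cluster.local");
    NamedAddress {
        name: FullyQualifiedAuthority(authority.to_string()),
        use_destination_service,
    }
}
```

The point of the shape is that callers can always key on `name`, even for authorities that bypass the Destination service.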
I have these changes ready, but writing unit tests for them is currently blocked by a Hyper bug that's causing multiple connections to be opened and then immediately closed. I believe @seanmonstar is looking into that. Should I go ahead and open a PR for this without the unit tests, and then add the tests in a subsequent PR? |
Okay, so while trying to test the branch I wrote to add the Host header to If we're still interested in merging the code I've written to modify our |
Yes, please.
We don't need to merge the new code as long as we have a good test. If we change how connection pooling works in the future, then the test will catch any breakage, which is one of the main points of having a test. The even more important reason to have a test is to inform all of us how our product actually works. :) |
Okay, opened PR #489 for the tests. |
I suspect your code may still be needed to handle the case of there being no host header at all, since that just assumes the ORIG_DST as the host. |
Ah, that's a good point, thanks @seanmonstar for pointing that out. |
We're going to have to revisit all of this with logical routing, fwiw.
I'm very skeptical that this is a problem we should be concerned about without a concrete use case. |
… values (#492)
This PR ensures that the mapping of requests to outbound connections is segregated by `Host:` header values. In most cases, the desired behavior is provided by Hyper's connection pooling. However, Hyper does not handle the case where a request had no `Host:` header and the request URI had no authority part, and the request was routed based on the SO_ORIGINAL_DST in the desired manner. We would like these requests to each have their own outbound connection, but Hyper will reuse the same connection for such requests.
Therefore, I have modified `conduit_proxy_router::Recognize` to allow implementations of `Recognize` to indicate whether the service for a given key can be cached, and to only cache the service when it is marked as cachable. I've also changed the `reconstruct_uri` function, which rewrites HTTP/1 requests, to mark when a request had no authority and no `Host:` header, and the authority was rewritten to be the request's ORIGINAL_DST. When this is the case, the `Recognize` implementations for `Inbound` and `Outbound` will mark these requests as non-cachable.
I've also added unit tests ensuring that A, connections are created per `Host:` header, and B, that requests with no `Host:` header each create a new connection. The first test passes without any additional changes, but the second only passes on this branch. The tests were added in PR #489, but this branch supersedes that branch.
Fixes #415. Closes #489.
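The caching behaviour described in this PR could be sketched along these lines. This is a simplified illustration, not the actual `conduit_proxy_router` API: the `Reuse` enum, the `bind_service` method, the toy `Request` and `Outbound` types, and the use of a plain `usize` as the "service" are all assumptions chosen to keep the example self-contained.

```rust
use std::collections::HashMap;
use std::hash::Hash;

/// Whether the service built for a key may be cached and reused.
enum Reuse<K> {
    Reusable(K),
    SingleUse(K),
}

/// Simplified, hypothetical version of a `Recognize`-style trait.
trait Recognize {
    type Request;
    type Key: Clone + Eq + Hash;
    type Service: Clone;

    /// Maps a request to a key and says whether its service is cacheable.
    fn recognize(&self, req: &Self::Request) -> Reuse<Self::Key>;
    /// Builds a new service (and therefore a new connection) for `key`.
    fn bind_service(&mut self, key: &Self::Key) -> Self::Service;
}

struct Router<R: Recognize> {
    recognize: R,
    cache: HashMap<R::Key, R::Service>,
}

impl<R: Recognize> Router<R> {
    fn new(recognize: R) -> Self {
        Router { recognize, cache: HashMap::new() }
    }

    /// Returns a cached service for reusable keys; single-use keys
    /// always get a freshly bound service that is never cached.
    fn route(&mut self, req: &R::Request) -> R::Service {
        match self.recognize.recognize(req) {
            Reuse::Reusable(key) => {
                if let Some(svc) = self.cache.get(&key) {
                    return svc.clone();
                }
                let svc = self.recognize.bind_service(&key);
                self.cache.insert(key, svc.clone());
                svc
            }
            Reuse::SingleUse(key) => self.recognize.bind_service(&key),
        }
    }
}

/// Toy request: an optional `Host` value plus the original destination.
struct Request {
    host: Option<String>,
    orig_dst: String,
}

/// Toy recognizer whose "service" is just a counter value.
struct Outbound {
    bound: usize,
}

impl Recognize for Outbound {
    type Request = Request;
    type Key = String;
    type Service = usize;

    fn recognize(&self, req: &Request) -> Reuse<String> {
        match &req.host {
            Some(host) => Reuse::Reusable(host.clone()),
            // No `Host:` header: fall back to ORIGINAL_DST, never cached.
            None => Reuse::SingleUse(req.orig_dst.clone()),
        }
    }

    fn bind_service(&mut self, _key: &String) -> usize {
        self.bound += 1;
        self.bound
    }
}
```

In this sketch, two requests with the same `Host` value share one bound service, while each request without a `Host` value binds a new one, which mirrors the two unit tests described in the PR.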
This is a follow-up to PR #397, in particular #397 (comment).
In the case where we send the request to the `SO_ORIGINAL_DST` destination, because we didn't discover the destination using the Destinations service, we're not making any decision w.r.t. whether requests with different `Host` request header field values are sent on the same connection; the original service is making the choice whether or not to do that. We should, at least, document this in the code with a comment.
In the case where HTTP/1.x requests with differing host values are inbound to the proxy from an external source:
In the case where HTTP/1.x requests with different host values are outbound from the proxy to an external source:
Note in particular that in the outbound case we don't know whether the destination supports HTTP/1.1 or only HTTP/1.0; i.e. we don't know whether it might ignore the Host header. Also, even in the case where the destination claims to support HTTP/1.1, I'm not sure we can rely on this to mean that we can safely send it requests with differing Host header field values on the same connection. We should do what web browsers like Chrome do here.
All of the above is regarding plaintext (http://) connections. In the case of TLS-protected (https://) connections, I already know that at least empirical results from browsers require that the Host header field match the SNI (Server Name Indication) TLS extension. That is, HTTPS rules require us to avoid sending requests with different Host header field values on the same connection. See https://bugs.chromium.org/p/chromium/issues/detail?id=615413#c2.
Unless there are important reasons to do otherwise, I'd prefer to do the same for HTTP as is required for HTTPS; i.e. have distinct connections per Host request field value.