Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make use of :scheme and XForwardedProto clear and consistent in Envoy #14587

Closed
PiotrSikora opened this issue Jan 6, 2021 · 36 comments · Fixed by #17372
Closed

Make use of :scheme and XForwardedProto clear and consistent in Envoy #14587

PiotrSikora opened this issue Jan 6, 2021 · 36 comments · Fixed by #17372
Assignees

Comments

@PiotrSikora
Copy link
Contributor

It looks that :scheme pseudo-header is missing in HTTP/1.x requests, which makes it impossible to use route-based actions based on HTTP vs HTTPS scheme.

HTTP/1.1 request:

[C3][S17504616115359689975] request headers complete (end_stream=true):
':authority', 'localhost:10000'
':path', '/'
':method', 'GET'
'user-agent', 'curl/7.52.1'
'accept', '*/*'

HTTP/2 request:

[C2][S9968071387702895111] request headers complete (end_stream=true):
':method', 'GET'
':path', '/'
':scheme', 'http'
':authority', 'localhost:10000'
'user-agent', 'curl/7.52.1'
'accept', '*/*'

cc @howardjohn

@mattklein123
Copy link
Member

Agreed we should be consistent here and fix this. cc @alyssawilk

@alyssawilk alyssawilk assigned alyssawilk and unassigned alyssawilk Jan 7, 2021
@alyssawilk
Copy link
Contributor

I thought the expectation was that ForwardedProto would represent the downstream scheme/security level internally, and :scheme was appended at the last minute for requests headed upstream, indicating the security of upstream requests. Let's sort out a plan offline.

@PiotrSikora
Copy link
Contributor Author

I thought the expectation was that ForwardedProto would represent the downstream scheme/security level internally, and :scheme was appended at the last minute for requests headed upstream, indicating the security of upstream requests. Let's sort out a plan offline.

:scheme is already present and represents the downstream scheme in HTTP/2 requests, so we shouldn't have different semantics based on the downstream protocol version, especially when Envoy uses HTTP/2 pseudo-headers to represent other properties of HTTP/1.x requests (:authority, :method and :path).

@alyssawilk
Copy link
Contributor

If filters use scheme, there's going to be issues where it means something different to upstream filters (where currently it would indicate upstream transport security) vs downstream filters (where it would be the downstream scheme if it exists, otherwise indicate downstream transport security). This extra complexity is not present for :method or :path, which is why I would like to put some thought into what solution is least likely to cause confusion (cc @mattklein123 right back =P)

@mattklein123
Copy link
Member

Yeah that's a good point that we swap :scheme: when we proxy upstream. Can we just clearly document that matching shouldn't be done on :scheme and it should be done on forwarded proto?

@alyssawilk
Copy link
Contributor

That's what I've been telling people internally.
If we think that's the proper thing to do (use x-forwarded-proto internally, serialize scheme right before encoding) I wonder if we should (very very carefully over several releases) remove scheme for http/2 to make it obvious to folks who are misusing it that they're using the wrong one? Something like

  • doc up that folks should use x-forwarded-proto
  • add warnings when the scheme header is accessed
  • make warnings fatal-by-default
  • remove scheme header where we add x-forwarded-proto

?

@mattklein123
Copy link
Member

I agree that ^ would be the right thing to do.

@alyssawilk alyssawilk changed the title :scheme pseudo-header is missing in HTTP/1.x requests Make use of :scheme and XForwardedProto clear and consistent in Envoy Jan 13, 2021
@alyssawilk alyssawilk self-assigned this Jan 13, 2021
@alyssawilk
Copy link
Contributor

I'll take on at least some of this when I get a chance.

@alyssawilk
Copy link
Contributor

OK, having tried to standardize on ForwardedProto I am increasingly convinced that's going to cause problems.

So, new plan, go the other way.

  1. set :scheme for all requests. For HTTP/1.1 it will be based on (trusted) forwarded proto, fully qualified urls, and encryption. For HTTP/2 it'll be the scheme header.
  2. Stop Envoy from sending :scheme header upstream based on upstream encryption which, to piotr's point, seems to be changing the target URI.
  3. Envoy announce that folks should switch from x-forwarded-proto to :scheme as the source of truth for scheme. Wait a month and/or a release
  4. Stop setting X-Forwarded-Proto when not present, Remove X-Forwarded-Proto as a special header, only apply to HTTP/1.1 requests upstream.

All data plane changes would be flag guarded. I think either 4) would have to go in its own release and/or we'd have to doc up how to safely upgrade with flags from (4) set false.

@snowp @antoniovicente @PiotrSikora SG? I'd like explicit thumbs up from at least two of you before I dive into this.

@PiotrSikora
Copy link
Contributor Author

SGTM. Thanks!

@mattklein123
Copy link
Member

In general this sgtm but 2 things:

  1. I know grpc has certain requirements with scheme. As long as scheme is set in all cases no matter what I think we are fine, but can we check this?
  2. I'm very concerned that we will break people in the transition. Can we somehow warn if people are matching on XFP? Anything else we can do? I would also make the transition extra long with warnings first.

@alyssawilk
Copy link
Contributor

alyssawilk commented Feb 1, 2021

yeah, I think this is going to be a multi-release fix, with lots of envoy-announce and warnings. I plan on auditing all envoy use of XFP, and removing the XFP header accessors to help with downstream C++ filters but I don't think there's much we can do for wasm/lua other than envoy-annouce and slow guarded rollout.

I think we can do 1-3 in this release, so in the next cut, :scheme should always be set and should be consistent with XFP.
We encourage folks using Envoy at head to move off XFP, but don't tweak anything until at least after the March release.

alyssawilk added a commit that referenced this issue Feb 24, 2021
Commit Message: Setting :scheme for headers in-Envoy.
Additional Description: This affects a number of components which do :scheme (rather than X-Forwarded-Proto) based logic for HTTP/1.1 traffic, up to but not exclusively wasm/lua filters looking for :scheme, gRPC access loggers, CSRF filter and oath2 filter
Risk Level: High
Testing: new unit tests, integration tests
Release Notes: inline
Runtime guard: envoy.reloadable_features.add_and_validate_scheme_header
Part 1 of #14587

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
alyssawilk added a commit that referenced this issue Mar 10, 2021
This replaces prior logic where the :scheme header was consistently overwritten based on the encryption level of the upstream connection.

Risk Level: High (l7 change)
Testing: new integration tests, unit tests
Docs Changes: api docs updated
Release Notes: inline
Runtime guard: envoy.reloadable_features.preserve_downstream_scheme
Part of #14587
Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor

No argument needed! I should have been more clear that the 6 month thing is more of a default bound on how long we have to find that workaround and make sure it's solid - it is extensible but I'm sure we'll find a way to make sure your deployment works without flipping that guard back. I'm not clear today what that will look like, to both fix your use case and ensure we don't have a giant security hole for the edge proxy case but we'll find something :-)

I'm out today (just checking in to land the H3 alpha) but if you're around next week I'd be happy to sync in real time, to get a better understanding of your deployment set-up and either help sort out the rewrite case or find something else that works. Any chance you're on envoy slack? If so I can ping over there on Monday

@alyssawilk
Copy link
Contributor

I've played around myself and agree I don't think there's a trivial way to revert this in Envoy today.
I think my preferred option is to add a specific scheme rewrite option to the route, just as we have a plethora of host rewrite options. @heatonmatthew if this SGTY I'll go get api-shephard thumbs up, throw a PR together, and bump back that feature deprecation one more version.

@heatonmatthew
Copy link

Looking around more broadly I can see that this support will be needed for a long time. The .NET team agree that the spec is ambiguous but are working on the assumption that all proxies will need to support the previous behaviour as a configuration option into the future (see comment).

I also note that the following workarounds are either in place or being requested:

If support was provided for a specific scheme rewrite option for each route, I believe that it will enable the desired outcome. What does concern me is that each solution (e.g. Google, AWS NLB, Istio, Gloo etc.) will need to either provide a mechanism to detect the downgrade and set the scheme rewrite for each route as part of their xDS plane implementation, or pass the additional configuration required onto their end-users.

Deliberating on how they want to support the change is likely to take those teams a while, so I suspect that pushing out the timeframe for removal of the envoy.reloadable_features.preserve_downstream_scheme flag makes sense to me.

@mattklein123
Copy link
Member

mattklein123 commented Jun 8, 2021

Deliberating on how they want to support the change is likely to take those teams a while, so I suspect that pushing out the timeframe for removal of the envoy.reloadable_features.preserve_downstream_scheme flag makes sense to me.

I would like to hear from @alyssawilk on this, but I continue to be concerned that we may be causing undo hardship on our users with this change, and I'm wondering if there are any compromise that are possible. (I don't know what that would be as I haven't spent enough time fully understanding this issue, and @alyssawilk has spent a ton of time, so hopefully she can summarize and educate us on the current thinking.)

@alyssawilk
Copy link
Contributor

If we think per route config is too onerous, we could have a HCM option which allows switching for adding absent scheme based on upstream or downstream? @mattklein123 will sync with you offline and sort out a plan

@qiwzhang
Copy link
Contributor

Here I like to summarize the issue ESPv2 is facing with this change:

Setting correct :scheme header is important for Google IAP to work. Google IAP requires :scheme to be https, otherwise it will send 302 redirect.
When ESPv2 envoy is deployed in Google Cloud-Run service, its downstream protocol is http since it is getting request from GFE and GFE terminates the SSL from the client, and its upstream uses https, they could be any of Google serverless platforms, e.g. app-engine, Cloud-function, or Cloud-Run. "preserve_downstream_scheme" behaviors is not correct for ESPv2.
ESPv2 will like to revert back to the old behaviors that setting :scheme based on the upstream security level.

So I propose to add a config flag somewhere to keep the old behaviors.

@alyssawilk
Copy link
Contributor

So Matt and I I discussed this offline and think the way to go about this is have a per-route scheme modifier, as we do for host and path, and also have a per-HCM option to set scheme. I think for ESPv2 you'd just set that to https, and get the behavior you want. Does that sound reasonable?

@qiwzhang
Copy link
Contributor

Sound Good. Thanks

alyssawilk added a commit that referenced this issue Jul 7, 2021
Adding the option to override scheme
Risk Level: low (config guarded code)
Testing: unit testing
Docs Changes: n/a
Release Notes: inline
Part of #14587
Fixes #17105

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@alyssawilk
Copy link
Contributor

Ok, way back in January I declared the last step as not adding XFP, as a forcing function to get folks to migrate over to :scheme
I think that's going to be more trouble than it's worth (it's definitely going to break a ton of filters and while they're probably occasionally doing the wrong thing breaking them isn't great) so I think I'm going to stop at :scheme being always set and the new docs informing which should be used when, and letting folks have their buggy filters (which now that we're sending :scheme correctly are only buggy for the .0001% of traffic that XFP and :scheme don't match)

alyssawilk added a commit that referenced this issue Jul 21, 2021
Corrects all Envoy uses of ForwardedProto which actually want request URI over to :scheme

As a reminder, XFP indicates the encryption of the (original) downstream connection where :scheme is part of the URI and the resource requested. It's legal (though unusual) to request http:// urls over a TLS connection for HTTP/2. It's possible (if ill advised) to have an internal mesh forwarding https schemed requests in the clear.

Current uses of X-Forwarded-Proto are

in the HCM, clearing XFP from untrusted users (unchanged)
in the HCM, setting absent XFP based on downstream transport security (unchanged)
in the HCM setting absent :scheme to XFP (unchanged)
in buildOriginalUri, changing from using XFP to scheme (changed. new URIs should be based on original URIs not on transport security.
in the router, clearing default port based on XFP (unchanged)
in the router serving redirect URLs based on scheme (changed - used to be XFP but is now based on the scheme of the original URI)
in the router, applying SSL route redirect based on XFP (unchanged)
in the router, using :scheme for internal redirect url checks (changed - used to use XFP. new URIs should be based on original URI)
in the cache filter, using :scheme to serve content (changed we used to serve based on XFP but if http://foo.com/ differs from https://foo.com and the http version is requested over a TLS connection the http response should be served)
in oath2 serving redirect URLs based on scheme (changed this used to be based on SFP but URLs should be based on original URL scheme)
Risk Level: High
Testing: updated tests
Docs Changes: inline
Release Notes: inline
Runtime guard: envoy.reloadable_features.correct_scheme_and_xfp
Fixes #14587

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
leyao-daily pushed a commit to leyao-daily/envoy that referenced this issue Sep 30, 2021
Adding the option to override scheme
Risk Level: low (config guarded code)
Testing: unit testing
Docs Changes: n/a
Release Notes: inline
Part of envoyproxy#14587
Fixes envoyproxy#17105

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
leyao-daily pushed a commit to leyao-daily/envoy that referenced this issue Sep 30, 2021
Corrects all Envoy uses of ForwardedProto which actually want request URI over to :scheme

As a reminder, XFP indicates the encryption of the (original) downstream connection where :scheme is part of the URI and the resource requested. It's legal (though unusual) to request http:// urls over a TLS connection for HTTP/2. It's possible (if ill advised) to have an internal mesh forwarding https schemed requests in the clear.

Current uses of X-Forwarded-Proto are

in the HCM, clearing XFP from untrusted users (unchanged)
in the HCM, setting absent XFP based on downstream transport security (unchanged)
in the HCM setting absent :scheme to XFP (unchanged)
in buildOriginalUri, changing from using XFP to scheme (changed. new URIs should be based on original URIs not on transport security.
in the router, clearing default port based on XFP (unchanged)
in the router serving redirect URLs based on scheme (changed - used to be XFP but is now based on the scheme of the original URI)
in the router, applying SSL route redirect based on XFP (unchanged)
in the router, using :scheme for internal redirect url checks (changed - used to use XFP. new URIs should be based on original URI)
in the cache filter, using :scheme to serve content (changed we used to serve based on XFP but if http://foo.com/ differs from https://foo.com and the http version is requested over a TLS connection the http response should be served)
in oath2 serving redirect URLs based on scheme (changed this used to be based on SFP but URLs should be based on original URL scheme)
Risk Level: High
Testing: updated tests
Docs Changes: inline
Release Notes: inline
Runtime guard: envoy.reloadable_features.correct_scheme_and_xfp
Fixes envoyproxy#14587

Signed-off-by: Alyssa Wilk <alyssar@chromium.org>
@johnlanni
Copy link
Contributor

johnlanni commented Oct 9, 2023

@alyssawilk I'm still a little confused about this mechanism.
Based on the above discussion, the XFP header will retain settings in the future, so this header itself can be used to indicate whether envoy's downstream has TLS enabled. There is no need to use the downstream scheme header as the upstream scheme header, if this scheme header is transparently transmitted , causing grpc frameworks such as .NET to be unable to handle it normally.
Although HCM provides options to override the scheme, this is not enough, because what we actually need is to use the upstream secure state to set the scheme header like the previous version do:

const bool transport_secure =
Runtime::runtimeFeatureEnabled("envoy.reloadable_features.preserve_downstream_scheme")
? downstream_secure
: upstream_secure;
if (transport_secure) {
headers.setReferenceScheme(Http::Headers::get().SchemeValues.Https);
} else {
headers.setReferenceScheme(Http::Headers::get().SchemeValues.Http);
}

@alyssawilk
Copy link
Contributor

the mechanism boils down to

  1. if we trust the source, we use XFP
  2. if we don't, we set XFP based on the scheme of the conneciton

This seems to have worked for all users for a couple of years now. If you have a concrete example of it not working let us know and we'll try to address.

@johnlanni
Copy link
Contributor

Hi @alyssawilk , thanks for you reply. My scenario is that Envoy receives gRPCs(encrypted) requests and proxies gRPC (not encrypted) request to a .NET service. The .NET gRPC framework checks the :scheme header, if the request is not encrypted, but the :scheme is https, the request will fail.

I solved the problem by setting the HCM option scheme_header_transformation to http, but this option has a side effect of setting the XFP header to http, which makes it impossible to determine the Envoy downstream protocol through the XFP header.

@johnlanni
Copy link
Contributor

the mechanism boils down to

  1. if we trust the source, we use XFP
  2. if we don't, we set XFP based on the scheme of the conneciton

This seems to have worked for all users for a couple of years now. If you have a concrete example of it not working let us know and we'll try to address.

This mechanism is reasonable, but I think the :scheme header of the upstream request should not be set to the same as XFP, because this makes :scheme ambiguous, is the current request‘s scheme or the original request's scheme? I prefer to define it this way:
:scheme: The currently request's scheme.
x-forwarded-proto: The original request's scheme.

@alyssawilk
Copy link
Contributor

yeah unfortunately many servers predate the x-forwarded-proto header, and so expect scheme to be the scheme of the original request. I think always setting scheme based on upstream crypto instead of defaulting to XFP is a reasonable request for a cluster configuration option but you'd probably need to find someone to sign on to implement it. Worth filing as its own feature request if you're interested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants