Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add proper SNI+wildcard based egress routing. #14126

Closed
libesz opened this issue Nov 9, 2023 · 13 comments
Closed

Add proper SNI+wildcard based egress routing. #14126

libesz opened this issue Nov 9, 2023 · 13 comments

Comments

@libesz
Copy link
Contributor

libesz commented Nov 9, 2023

I've been working on to fill this gap after forward_downstream_sni and a few other Istio specific envoy features were dropped. See istio/proxy#4958 #10052 #10053 #11291

I assume many Istio users will face this once the missing filters breaks their egress traffic more or less silently. My use-case is exactly the one which was dropped from the official docs without any replacement after 1.13.

The use-case: route TLS traffic to a subset of hostnames (including wildcard hostnames) to the exact destination that they are targeting. This requires dynamic routing, which the first class Istio resources cannot really express. The original docs suggested to use a dedicated nginx container to do the dynamic SNI dispatch part per connection and the actual forwarding + proxying. This allows the egress traffic restriction of the application namespace, so that targeting the non-listed hostnames can be blocked with i.e. NetworkPolicies.
See: https://istio.io/v1.13/docs/tasks/traffic-management/egress/wildcard-egress-hosts/

The idea now is to use the built-in SNI forwarder in Envoy instead of a standalone container: https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/network_filters/sni_dynamic_forward_proxy_filter

I developed two PoCs around this:

Double listener:

One using two independent Envoy listeners to work around the problem that TLS inspection can inspect only the outer TLS handshake, which is Istios mTLS. Without the forward_downstream_sni filter, the sidecar sends the egress GW service name in the mTLS session, so a second listener+inspection is needed to determine the final destination.

PoC code: https://gist.github.com/libesz/961ac221f210f8a2a3ff3b8e69ad9cc2

istio-sni-double

Single listener:

The other one is using the new Envoy set-filter-state extension that is able to do the job of forward_downstream_sni, so a single (heavily modified) Istio Gateway can do but the mesh mTLS termination and the SNI forwarding. New filter docs: https://www.envoyproxy.io/docs/envoy/latest/configuration/listeners/network_filters/set_filter_state#config-network-filters-set-filter-state

PoC code: https://gist.github.com/libesz/804ad714d47d57ac44c8f2d8dd7ba1f6

istio-sni-single


I an open to work on the contribution of these to the docs before the users are affected.

@frankbu
Copy link
Collaborator

frankbu commented Nov 9, 2023

IIUC, sounds like the second approach should be documented as the new replacement for the deleted doc: https://istio.io/v1.13/docs/tasks/traffic-management/egress/wildcard-egress-hosts/? Is that right?

The first approach sounds like something that is inferior, so not sure if it needs to be documented at all. Does that make sense, or am I missing something where either design might be preferable, in which case both designs should be documented with pros and cons?

@libesz
Copy link
Contributor Author

libesz commented Nov 10, 2023

The second solution is more intrusive to the istio-generated envoy config parts, the first one "just" adds a brand new cluster+listener. But I agree that if review validates the second solution is good enough (i.e. does not break any other mesh functionality), than the first could be dropped on long term, but!
The architecture of the first one is almost identical to the one that was dropped from the docs, with the only differences that:

  • outer TLS does not contain the final hostname between the sidecar and the gateway
  • the allowed host list is enforced on the SNI proxy listener, not on the mesh listener
  • there is no external container

And the big advantage is that the first solution is istio version independent. Any user can seamlessly migrate to it from the previous setup before 1.20! That's why it makes sense to me. Envoy is introducing the new filter capability (that replaces forward_downstream_sni and so makes the single-listener solution possible) only in 1.20 as I understood the PRs.

@libesz libesz changed the title Add proper SNI+wildcard based egress rouging. Add proper SNI+wildcard based egress routing. Nov 10, 2023
@libesz
Copy link
Contributor Author

libesz commented Nov 10, 2023

I can imagine the first version could be the way for 1.19 users and the second one for 1.20+. But I see also documenting both may be too much.

@frankbu
Copy link
Collaborator

frankbu commented Nov 10, 2023

Any user can seamlessly migrate to it from the previous setup before 1.20!

Given that 1.20 will be released next week, I'm not sure this advantage is so important.

I think it would be great to update the documentation ASAP with the single listener wildcard egressgateway, assuming (as you say) the solution is validated and doesn't break anything, given that many people have asked how to do it since the old doc was removed.

cc @howardjohn

@libesz
Copy link
Contributor Author

libesz commented Nov 10, 2023

To be on the safe side, we already started the migration of our systems to the first approach here on 1.18, so basically before upgrading to 1.19, to change as less as possible in a single step. Not sure how the old way operates on 1.19, but if it is already effected by the code removal in any ways, it might be better to have something that is compatible with that release already. Otherwise we can ignore it.

Anyways, of course I also like the single-listener pattern more. Let's see what others think about it.

@howardjohn
Copy link
Member

FWIW 2 listeners can use "internal listener" concept in Envoy. Not a full network traversal to localhost needed.

@libesz
Copy link
Contributor Author

libesz commented Nov 13, 2023

@howardjohn Thanks! I managed to rework the first approach to have only an internal listener for the SNI routing part. So it seem to have both design's advantages, such as:

  • Istio release independent
  • Does not change any sidecar envoy config with EnvoyFilters
  • No double network traversal inside the gateway

Can we proceed with this design as the preferred?

PoC: https://gist.github.com/libesz/ddf16b9d074d6af3b9cda2946047cf93
Description:
istio-sni-double (internal)

@howardjohn
Copy link
Member

I am not sure exactly what the request is. You absolutely can use the config above. Is the request to also official document it on istio.io?

If so, I am a bit concerned. We don't want to be pushing users towards these break-glass API usage. For super advanced users its ok, but the reason we removed the doc in the first place was it was a constant source of confusion for users

@frankbu
Copy link
Collaborator

frankbu commented Nov 13, 2023

Maybe a blog post that talks about the use case not covered by the docs (egress wildcard configuration for arbitrary domains) and then show these patterns as a couple of possible approaches. This way, we don't commit to supporting them, but we do give people some ideas that they can try. Minimally, we could also add a sentence or two to the mainline doc to mention the gap in the doc and even point to the blog post for some example ideas?

@howardjohn, would that alleviate your concern?

@howardjohn
Copy link
Member

howardjohn commented Nov 13, 2023 via email

@frankbu
Copy link
Collaborator

frankbu commented Nov 13, 2023

I like the idea of adding some examples to the wiki. How about adding both of them to the wiki and also make a blog post that explains in detail the one we think is best with a complete step by step walk-through? We can then update the doc with pointers to both places, something like this: #14146

@libesz
Copy link
Contributor Author

libesz commented Nov 24, 2023

The blog post is ready for a broader review ^.

@libesz
Copy link
Contributor Author

libesz commented Nov 30, 2023

Blog post merged. Thanks for all the input and help!
https://istio.io/latest/blog/2023/egress-sni/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants