
Kubernetes SNI support for clusters can generate illegal DNS names #2766

Closed
webvictim opened this issue Jun 10, 2019 · 6 comments

@webvictim
Contributor

webvictim commented Jun 10, 2019

What happened:
The Kubernetes SNI support (introduced in #2619) doesn't work when it generates a DNS name containing a label longer than 63 characters (such as k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f.gustestingnew-main.gravitational.co)

$ kubectl get pods
Unable to connect to the server: dial tcp: lookup k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f.gustestingnew-main.gravitational.co: no such host
------------------------------------------------------------
$ host k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f.gustestingnew-main.gravitational.co
host: 'k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f.gustestingnew-main.gravitational.co' is not a legal IDN name (domain label longer than 63 characters), use +noidnin
------------------------------------------------------------
$ echo "k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f.gustestingnew-main.gravitational.co" | wc -c
108

k67757374657374696e676e65772d656173742e6772617669746174696f6e616c2e636f here is the hex encoding of gustestingnew-east.gravitational.co, which is a fairly standard cluster name.

The effect of these labels longer than 63 characters is that the connection fails and Kubernetes support can't be used for those clusters. 63 characters is pretty easy to exceed in many standard multi-region setups which include the region in the DNS name.

In my experience so far, hex-encoding the trusted cluster name doubles its length, so realistically any trusted cluster name longer than about 30 characters will be unusable, and this threshold shrinks further depending on the length of the main cluster's domain name.
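To make the length arithmetic concrete, here's a minimal sketch of the encoding as it appears in the example label above (the `k` prefix plus hex-of-name layout is inferred from that label, not taken from Teleport's source):

```python
import binascii

# Hex-encode a trusted cluster name into a single DNS label,
# mirroring the k<hex(name)> pattern visible in the example above.
name = "gustestingnew-east.gravitational.co"
label = "k" + binascii.hexlify(name.encode()).decode()

print(label[:9])   # k67757374 ("gust" hex-encoded, with the k prefix)
print(len(label))  # 71 = 1 + 2*35, well over the 63-character label limit
```

Since hex encoding emits two characters per input byte, any cluster name longer than 31 characters pushes the label past 63.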

The only workaround so far is "use shorter cluster names", which isn't really doable in many cases.

What you expected to happen:
Kubernetes SNI support to work as described in the PR. The static encoding of the SNI name probably needs to be replaced with some kind of stateful encoding which doesn't limit cluster name length and actively generates DNS labels shorter than 63 characters.

How to reproduce it (as minimally and precisely as possible):
Set up a trusted cluster with a name that contains a few subdomains (such as gustestingnew-east.gravitational.co above)

Environment:

  • Teleport version (use teleport version): Teleport Enterprise v3.2.4 git:v3.2.4-0-g339827c6 go1.11.5
  • Tsh version (use tsh version): Teleport v3.2.4 git:v3.2.4-0-g339827c6 go1.11.5
  • OS (e.g. from /etc/os-release): Fedora 29
@webvictim webvictim added the bug label Jun 10, 2019
@kontsevoy kontsevoy added this to the 4.1 "Seattle" milestone Jun 11, 2019
@kontsevoy
Contributor

Looks like we have an enterprise evaluation that depends on it. Let's schedule it for a 4.1 point-release.

@klizhentas
Contributor

It would be impossible to completely solve the problem, given that it would still be possible to create a cluster name of 252 characters (against a max DNS name length of 253), and even with a shortener that would exceed the max DNS name.

However, we can solve the 63-character label limit by using a hash function instead of variable-length hex encoding.

I suggest using SHA1; for the example above it will look like this:

>>> import hashlib
>>> hashlib.sha1(b"gustestingnew-main.gravitational.co").hexdigest()
'c2ead9d108feec21599547c7cee5f62e3c3ce684'

The collision probability on our set of trusted clusters (~1K range) is negligible, so I'm leaning towards this option.

I've picked SHA1 because it generates 40-character hex dumps, which is nice given our 63-character limit.
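As a rough sanity check of the collision claim (Python 3 syntax; the birthday-bound arithmetic here is my own back-of-the-envelope, not from the thread):

```python
import hashlib

name = "gustestingnew-main.gravitational.co"
label = hashlib.sha1(name.encode()).hexdigest()
print(len(label))  # 40 hex characters, comfortably under 63

# Birthday-bound approximation: probability of any collision among
# n random 160-bit digests is roughly n*(n-1) / 2^161.
n = 1000
p = n * (n - 1) / 2 ** 161
print(p)  # on the order of 1e-43
```

At ~1K trusted clusters the chance of two names hashing to the same label really is negligible.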

@fspmarshall @russjones thoughts?

@knisbet
Contributor

knisbet commented Jul 6, 2019

I'm not sure I understand Teleport or this problem well enough to provide useful feedback. I'm also not sure whether this is a problem only for the Kubernetes integration, or whether other protocols/integrations have similar problems.

Assuming it's only Kubernetes/kubectl/client-go as described by Gus, one area I would explore is whether Teleport uses a single x509 certificate for all proxied clusters, or whether it creates (or is able to create) a certificate per cluster. If it creates a certificate per cluster, I would consider embedding the state for which cluster to access into the certificate, instead of trying to signal it over SNI with wildcard DNS. This way the kubeconfig can just point to the proxy, and the proxy can make its routing decision based on a field embedded in the Subject, something like /CN=kevin/O=gustestingnew-west.gravitational.co. If the current behaviour is that logging in has Teleport sign a single certificate which kubectl is then configured to use for all clusters, this obviously won't work. Another benefit is that it might remove the need for wildcard DNS, which I assume adds some friction to getting Teleport set up in some cases.
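A toy illustration of the Subject-driven routing idea (the OpenSSL-style DN string and the choice of the O field are hypothetical, not Teleport's actual certificate format):

```python
# Parse an OpenSSL-style subject string like /CN=kevin/O=<cluster>
# and route based on the Organization field.
def parse_dn(dn):
    parts = [p for p in dn.split("/") if p]
    return dict(p.split("=", 1) for p in parts)

subject = parse_dn("/CN=kevin/O=gustestingnew-west.gravitational.co")
target_cluster = subject["O"]
print(target_cluster)  # gustestingnew-west.gravitational.co
```

The point is simply that a field in the Subject can carry the full cluster name with no DNS label-length constraint at all.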

Some other random thoughts:

  • I wouldn't use SHA1. For this use case there's nothing wrong with it; it's just that, due to its weaknesses, scanners/audits etc. may take it out of context and decide Teleport has a security issue.
  • If this doesn't require cryptographic properties, I would probably pick a non-crypto hash so as not to imply that it requires strong cryptography. I'm not up to date on non-crypto hashes, but CityHash/SipHash/SpookyHash could be looked into.
  • If it does require a crypto hash, probably for better collision resistance, I would then go with something from the SHA-2 family (which can be truncated) or BLAKE2b. I'm not sure if BLAKE2b should be used, though, as it may not be compatible with the FIPS-compliant builds.
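For illustration, truncated SHA-256 also fits under the label limit (my sketch of the option named above, not a decided design):

```python
import hashlib

name = "gustestingnew-east.gravitational.co"
# Truncating SHA-256 to its first 20 bytes gives a 40-character hex
# label, matching the length of the SHA1 proposal earlier in the thread.
label = hashlib.sha256(name.encode()).hexdigest()[:40]
print(len(label))  # 40
```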

@klizhentas
Contributor

I like your Subject-driven routing idea; it's very nice. That's not how Teleport works at the moment - the user gets a single client cert per cluster.

I agree re: SHA1; it's better to stick with a non-crypto hash to signal that it's just for routing. I will look into it.

@fspmarshall
Contributor

I'm still getting caught up on the multi-cluster stuff, but I take it that the reason for not requiring a cryptographic hash would be because adding a cluster requires administrative privileges, yeah? And, in theory, a hash collision could cause a client to be routed to a different cluster than intended?

Assuming the above to be true, and assuming a large dose of professional paranoia, I'd opt for using a cryptographic hash (barring any performance issues). Clearly there is no way to prevent a malicious administrator from doing damage, but I still think it is preferable to reduce the ability of malicious administrators to pull off subtle attacks. Inducing a hash collision is a less obvious attack than re-routing traffic explicitly, and almost surely would be harder to detect during an audit.

On the subject of maximum paranoia: SHAKE128 with a digest length of 31 bytes would result in a 62-character hex string (just under the limit), is FIPS-compliant, and offers 124 bits of collision resistance at that configuration.
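The SHAKE128 variant described above can be sketched directly with the standard library (a minimal illustration, not the implementation):

```python
import hashlib

name = "gustestingnew-east.gravitational.co"
# SHAKE128 is an extendable-output function: hexdigest takes the
# desired digest length in bytes. 31 bytes -> 62 hex characters.
label = hashlib.shake_128(name.encode()).hexdigest(31)
print(len(label))  # 62
```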

@webvictim
Contributor Author

webvictim commented Jul 9, 2019

Unless I'm missing something, I don't see how we could reverse SHA1/SHA2/any one-way hash to extract the real name of the target cluster from an SNI request. The beauty of hex encoding was that it was easily reversible to get back to a real cluster name. I suppose we could just hash every remote cluster name until we find one that fits...
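The hash needn't be reversible if the proxy already knows its set of trusted clusters: it can precompute a digest-to-name map and look incoming labels up, which is the "hash every remote cluster name" idea made cheap. A sketch (the cluster list is hypothetical):

```python
import hashlib

# Hypothetical set of trusted cluster names known to the proxy.
clusters = [
    "gustestingnew-east.gravitational.co",
    "gustestingnew-west.gravitational.co",
]
# Precompute digest -> name once; rebuild when trusted clusters change.
by_digest = {hashlib.sha1(c.encode()).hexdigest(): c for c in clusters}

def resolve(sni_label):
    """Map a hashed SNI label back to a trusted cluster name, or None."""
    return by_digest.get(sni_label)

probe = hashlib.sha1(b"gustestingnew-east.gravitational.co").hexdigest()
print(resolve(probe))  # gustestingnew-east.gravitational.co
```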

I like Kevin's suggestion of encoding the cluster name into the certificate.

klizhentas added a commit that referenced this issue Jul 12, 2019
This commit fixes issue #2766.

The prior logic in Kubernetes module used
SNI to route requests to the target kubernetes cluster.

This approach created problems with long cluster names
exceeding the 63-character DNS label limit, and it
required setting up DNS wildcard records.

This commit changes the routing to use the metadata
encoded in the client's x509 certificate to route the
request to the target cluster.

The SNI approach will be supported for several versions
to preserve backwards compatibility.
klizhentas added a commit that referenced this issue Jul 16, 2019