Deal with network=host containers #6

Open
therc opened this issue Mar 6, 2016 · 3 comments

Comments


therc commented Mar 6, 2016

An idea from kubernetes/kubernetes#14226 (comment):

For network=host containers,

it could look up the connection in /proc/net/tcp* instead and match that against the /proc/*/fd/ symlinks, like lsof does. That's expensive, unless there's a way in iptables to munge the source IP/port to reduce the search space... loopback has this whole 127.0.0.0/8 range, after all. I'm not going to propose LD_PRELOAD or similar hacks. :-)
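
Roughly, that lookup might go something like the untested sketch below. The names are made up; a real version would also need to parse /proc/net/tcp6 and then map the PID to a container (e.g. via its cgroup file).

```go
// Untested sketch: map a request's source port back to the owning PID,
// the way lsof does it.
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// inodeForLocalPort scans /proc/net/tcp for a socket whose local port
// matches and returns the socket inode.
func inodeForLocalPort(localPort uint16) (string, error) {
	data, err := os.ReadFile("/proc/net/tcp")
	if err != nil {
		return "", err
	}
	for _, line := range strings.Split(string(data), "\n")[1:] {
		fields := strings.Fields(line)
		if len(fields) < 10 {
			continue
		}
		// local_address is hex "ADDR:PORT", e.g. 0100007F:C350
		parts := strings.Split(fields[1], ":")
		if len(parts) != 2 {
			continue
		}
		port, err := strconv.ParseUint(parts[1], 16, 16)
		if err != nil || uint16(port) != localPort {
			continue
		}
		return fields[9], nil // inode column
	}
	return "", fmt.Errorf("no socket with local port %d", localPort)
}

// pidForInode walks /proc/*/fd looking for a symlink to socket:[inode].
// This is the expensive part: it touches every fd of every process.
func pidForInode(inode string) (int, error) {
	target := "socket:[" + inode + "]"
	fds, _ := filepath.Glob("/proc/[0-9]*/fd/*")
	for _, fd := range fds {
		if link, err := os.Readlink(fd); err == nil && link == target {
			// fd looks like /proc/1234/fd/5; the PID is the third element.
			return strconv.Atoi(strings.Split(fd, "/")[2])
		}
	}
	return 0, fmt.Errorf("no process owns socket inode %s", inode)
}

func main() {
	port, _ := strconv.Atoi(os.Args[1]) // e.g. "portowner 54321"
	inode, err := inodeForLocalPort(uint16(port))
	if err != nil {
		panic(err)
	}
	pid, err := pidForInode(inode)
	if err != nil {
		panic(err)
	}
	fmt.Println("owning pid:", pid)
}
```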

Another idea would be to only search in /proc directories where we know that a) there's a container and, ideally, b) it's a network=host container. Maybe this would be feasible only if ec2metaproxy were a library, as in #5.

dump247 (Owner) commented Mar 7, 2016

This is the major shortcoming of the proxy. There are two issues that need to be resolved to make it work. The first is what you mention: mapping the request's source port back to a container. The second is how to redirect that traffic to the proxy while still allowing the proxy itself to reach the real metadata service. Somehow you have to configure iptables to re-route only non-ec2metaproxy packets; otherwise you get an infinite loop.

I welcome ideas on how to resolve this. I only learned enough about iptables to write the current rules for the proxy, so I don't have a lot of expertise there. Maybe the metadata proxy could run on its own network bridge? That would make deployment a bit more complex.

The performance may or may not be a big issue. The AWS SDKs cache the credentials until they expire, so you should only be paying the price about once an hour per container. I guess it depends on what the container is doing.


therc commented Mar 7, 2016

A few rough ideas:

  1. iptables has support for cmd-owner, gid-owner and uid-owner matches. We could match the command name and mark packets, or perhaps the admin can run the proxy as its own user. Either way we might be able to whitelist that traffic (rough sketch after this list).
  2. Bind to a port in a range (e.g. 0 to 1023) outside the standard ephemeral range, then connect from that port to 169.254.169.254:80. A whitelist entry lets that range pass through. A bit more work, especially dealing with arcane socket stuff (sketch below as well). See https://idea.popcount.org/2014-04-03-bind-before-connect/
  3. Run an additional bridge, as you suggest.
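
For (1), assuming the proxy listens on, say, port 18000 and runs as a dedicated `ec2metaproxy` user (both made-up values), the OUTPUT-chain rules might look roughly like this. If I remember right, cmd-owner was dropped from mainline kernels long ago, so uid-owner is probably the safer bet:

```sh
# Untested sketch; the port and user name are placeholders.
# Let the proxy's own traffic reach the real metadata service untouched...
iptables -t nat -A OUTPUT -p tcp -d 169.254.169.254 --dport 80 \
  -m owner --uid-owner ec2metaproxy -j RETURN
# ...and redirect everything else that targets it to the proxy.
iptables -t nat -A OUTPUT -p tcp -d 169.254.169.254 --dport 80 \
  -j REDIRECT --to-ports 18000
```

For (2), Go's net.Dialer makes the bind-before-connect part fairly painless; the fiddly bits are picking a free port and the fact that ports below 1024 need root or CAP_NET_BIND_SERVICE. Another untested sketch, with a made-up port:

```go
package main

import (
	"fmt"
	"net"
)

// dialFromPort connects to addr from a fixed local source port, so an
// iptables rule can whitelist that port (or a small range) for the proxy.
// Real code would retry other ports in the range on "address in use".
func dialFromPort(addr string, localPort int) (net.Conn, error) {
	d := net.Dialer{LocalAddr: &net.TCPAddr{Port: localPort}}
	return d.Dial("tcp", addr)
}

func main() {
	conn, err := dialFromPort("169.254.169.254:80", 900) // example port
	if err != nil {
		panic(err)
	}
	defer conn.Close()
	fmt.Println("connected from", conn.LocalAddr())
}
```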


therc commented Mar 7, 2016

As to performance: the cloudprovider in Kubernetes fetches metadata fairly frequently. Even if it's not hitting the credential endpoint, it's still going to go through the proxy. It would be nice to expose stats on traffic levels (preferably by endpoint, as well as by role, errors, etc.) so that administrators and developers can have a better idea of what's happening behind the scenes.
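
Even something as simple as expvar counters keyed by path and role would go a long way. A hypothetical sketch (none of this exists in the proxy today; the names are invented):

```go
package main

import (
	"expvar"
	"log"
	"net/http"
)

// Counters show up automatically at /debug/vars thanks to expvar.
var (
	requestsByPath = expvar.NewMap("proxy_requests_by_path")
	errorsByRole   = expvar.NewMap("proxy_errors_by_role")
)

// countingHandler bumps a per-path counter before handing the request to
// the real proxy handler; role and error counters would work the same way.
func countingHandler(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestsByPath.Add(r.URL.Path, 1) // e.g. /latest/meta-data/iam/...
		next.ServeHTTP(w, r)
	})
}

func main() {
	http.Handle("/", countingHandler(http.NotFoundHandler()))
	log.Fatal(http.ListenAndServe("127.0.0.1:18001", nil))
}
```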
