-
Notifications
You must be signed in to change notification settings - Fork 784
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
firewall: new plugin which adds allow rules for container IPs to firewalls #75
Conversation
8f2c100
to
1619a37
Compare
For the record, the code adds rules like:
|
Man, this has been on my mental to-do list for over a year. I might even add a firewalld mode to it :-) |
If you're not going to delete, does it make sense to just idempotently create one allow rule per subnet? |
} | ||
|
||
var intfName string | ||
if iptConf.PrevResult != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks like it's relying on ordering to pick the "bridge" interface. Does it make sense to also accept the ifname as a configuration hint?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@squeed yes, I'll do this as an option with fallback to the first PrevResult host interface. This will also make the plugin compatible with 0.1.0 and later too.
@squeed We do kinda idempotently create one rule per bridge, which would also work if the bridge changes IP subnets which sometimes happens. If possible I'd like to stick with interface name. Is there a good reason to do IP net instead? |
Disregard my comment about subnets; I'd misread the code. |
return fmt.Errorf("failed to initialize iptables helper: %v", err) | ||
} | ||
|
||
adminChainName := fmt.Sprintf("CNI-ADMIN-%s", intf) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making this name overridable might be useful; admins might want to just have a single chain for interfaces or, as in the case of ptp, not have predictable if (and therefore chain) names.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@squeed you mean just one chain name for all admin overrides, or making it a conf option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A conf option, of course :-)
213212f
to
67c2d9d
Compare
@squeed admin chain name override done. |
func getPrivChainRules(intf string) [][]string { | ||
var rules [][]string | ||
rules = append(rules, []string{"-i", intf, "!", "-o", intf, "-j", "ACCEPT"}) | ||
rules = append(rules, []string{"-i", intf, "-o", intf, "-j", "ACCEPT"}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need two rules here? Can't we just not match on -o at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewdupre hmm, I mostly just copied what Docker was doing here; assuming they knew what they were doing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@matthewdupre consolidated the two rules, you're right we don't need the double-match. The docker code that was adding both was doing it from two unrelated pieces of code which probably aren't aware of each other.
67c2d9d
to
22f3da1
Compare
@containernetworking/cni-maintainers any more comments on this? |
Tested and it works well. Could you add some mention in the README that the rules are not cleaned up? I've been thinking about how this would work with just a ptp setup instead of a bridge, especially since we're considering doing that for kubenet. There might be a lot of pollution with no cleanup. A few ideas for how to solve this:
Thoughts? If you want to add this to a follow-up PR that'd be fine. |
Nevermind, the solution is simple: just hard-code to match on interface |
@squeed can you explain a bit more? |
So, imagine kubenet moves from bridge to ptp. Now we have one rule per interface, rather than one rule total (for the bridge). Given that this plugin doesn't do delete... things start to look pretty ugly pretty quickly. I can think of two solutions to this: Add some kind of devicegroup allow mode that adds each interface to a devicegroup, then makes sure there is an allow rule for that group, or, use the "interface" parameter to match on a interface name wildcard. This is a bit of a problem for non-cni veths, though. Probably a bit too fragile. |
3bd99fa
to
50bbfa1
Compare
@squeed ok, how about the newly-pushed approach instead. It implements delete by checking whether the given interface is a master for any other interfaces (eg, a bridge). If so and there are no child/slave devices attached (since presumably the last one just got deleted) then DEL will clean up the iptables rules. Otherwise DEL leaves them alone. If this seems worthwhile, we probably want to use this same approach in the bridge plugin's IPMasq code, since that also never cleans itself up. |
50bbfa1
to
66f5a8a
Compare
@squeed ok my approach is busted (and so would a devgroup one) because DEL runs plugins in reverse order, and so the bridge still has the port attached. Any suggestions? Update: so it's not totally busted, because my approach does get the container interface name and we can look at IFLA_LINK on that interface for its peer ifindex, and ignore that peer when checking IFLA_MASTER attributes in the host namespace. |
79b6192
to
a0691a7
Compare
@squeed updated to exclude a container's peer when checking if the bridge has interfaces. Also removed the PrevResult stuff, since on DEL we don't have a PrevResult to pull the host ifname from, since plugins are executed in reverse order. I guess we could try caching that somewhere from the ADD call, but for now you have to specify the bridge interface for bridge, and you might as well just copy that for iptables-allow since you know it already. |
@dcbw what do you think about extracting the "find peer interface" logic into a public library function? that way we could use it in #96 too (specifically for this issue ) |
Looks good to me |
What if the system has |
@bboreham AFAICT even docker 1.13 doesn't add anything to the nat+INPUT chain, just FORWARD. So we're doing nothing different here than docker already. I'd imagine INPUT defaulting to DROP would just kill all your return traffic :) |
Docker didn't expose services from the container network; the approach was to map ports out to the host network. I'm thinking about Kubernetes, where services do live on the container network. Imagine a system where the admin has carefully set things up to reject incoming packets from anywhere outside a defined set of rules. Shouldn't our firewall plugin allow for this case, and add rules to the INPUT chain? |
Any progress on this? This is a blocking issue for podman. |
Is #168 a blocking issue for this PR or should it be solved as a follow-up? |
Any progress here? Our QE folks are blocked from merging new OS level CI tests for |
@mheon thanks! I assume that supersedes this PR? |
@ashcrow It's more of a temporary solution until this can be rewritten and merged, which could be some months still |
@mheon makes sense. Thanks! |
Any update on this? |
@dcbw Is there a workaround for this I can implement on Silverblue 29 via |
What is the status of this? |
} | ||
|
||
// Tolerate errors if the container namespace has been torn down already | ||
containerNS, err := ns.GetNS(args.Netns) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This plugin shouldn't are about the network namespace, right?
This PR was updated and pushed as #290 |
Distros often have additional rules in the filter table that do things like:
-A FORWARD -j REJECT --reject-with icmp-host-prohibited
docker, for example, gets around this by adding explicit rules to the filter
table's FORWARD chain to allow traffic from the docker0 interface. Do that
for a given host interface too, as a chained plugin. eg:
@containernetworking/cni-maintainers @squeed @danwinship