
firewall: new plugin which adds allow rules for container IPs to firewalls #75

Closed · wants to merge 5 commits

Conversation

@dcbw (Member) commented Sep 29, 2017

Distros often have additional rules in the filter table that do things like:

-A FORWARD -j REJECT --reject-with icmp-host-prohibited

docker, for example, gets around this by adding explicit rules to the filter
table's FORWARD chain to allow traffic from the docker0 interface. Do the same
for a given host interface, as a chained plugin, e.g.:

{
    "cniVersion": "0.3.1",
    "name": "bridge-thing",
    "plugins": [
      {
        "type": "bridge",
        "bridge": "cni0",
        "isGateway": true,
        "ipMasq": true,
        "ipam": {
            "type": "host-local",
            "subnet": "10.88.0.0/16",
            "routes": [
                { "dst": "0.0.0.0/0" }
            ]
        }
      },
      {
        "type": "firewall"
      }
    ]
}

@containernetworking/cni-maintainers @squeed @danwinship

@dcbw (Member, Author) commented Sep 29, 2017

For the record, the code adds rules like:

*filter
:FORWARD ACCEPT [0:0]
:CNI-ADMIN-cni0 - [0:0]
:CNI-FORWARD-cni0 - [0:0]
-A FORWARD -i cni0 ! -o cni0 -m comment --comment "CNI interface cni0 administrator overrides" -j CNI-ADMIN-cni0
-A FORWARD -m comment --comment "CNI interface cni0 private rules" -j CNI-FORWARD-cni0
-A CNI-FORWARD-cni0 -i cni0 ! -o cni0 -j ACCEPT
-A CNI-FORWARD-cni0 -i cni0 -o cni0 -j ACCEPT
-A CNI-FORWARD-cni0 -o cni0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT

@squeed (Member) commented Oct 2, 2017

Man, this has been on my mental to-do list for over a year. I might even add a firewalld mode to it :-)

@squeed (Member) commented Oct 2, 2017

If you're not going to delete, does it make sense to just idempotently create one allow rule per subnet?

}

var intfName string
if iptConf.PrevResult != nil {
@squeed (Member) commented on the diff:

This looks like it's relying on ordering to pick the "bridge" interface. Does it make sense to also accept the ifname as a configuration hint?

@dcbw (Member, Author) replied:

@squeed yes, I'll do this as an option with fallback to the first PrevResult host interface. This will also make the plugin compatible with 0.1.0 and later too.
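The option-with-fallback behavior dcbw describes could look roughly like the sketch below. This is illustrative only: the real plugin builds on the libcni types and its actual struct and key names may differ, so `firewallConf`, the `ifName` key, and the helper here are all assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// hostInterface and prevResult model just the slice of a CNI result this
// sketch needs; the real plugin would use the libcni result types.
type hostInterface struct {
	Name string `json:"name"`
}

type prevResult struct {
	Interfaces []hostInterface `json:"interfaces"`
}

// firewallConf is a hypothetical conf struct with an optional explicit
// interface name ("ifName" is an assumed key, not a confirmed one).
type firewallConf struct {
	Type       string      `json:"type"`
	IfName     string      `json:"ifName,omitempty"`
	PrevResult *prevResult `json:"prevResult,omitempty"`
}

// pickInterface prefers the explicit conf option and falls back to the
// first host interface reported by the previous plugin in the chain.
func pickInterface(conf *firewallConf) (string, error) {
	if conf.IfName != "" {
		return conf.IfName, nil
	}
	if conf.PrevResult != nil && len(conf.PrevResult.Interfaces) > 0 {
		return conf.PrevResult.Interfaces[0].Name, nil
	}
	return "", fmt.Errorf("no interface name given and no previous result")
}

func main() {
	data := []byte(`{"type":"firewall","prevResult":{"interfaces":[{"name":"cni0"}]}}`)
	conf := &firewallConf{}
	if err := json.Unmarshal(data, conf); err != nil {
		panic(err)
	}
	name, err := pickInterface(conf)
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // prints "cni0"
}
```

The fallback path is what makes the plugin usable with spec versions that predate chained results being mandatory, as dcbw notes above.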

@dcbw (Member, Author) commented Oct 5, 2017

If you're not going to delete, does it make sense to just idempotently create one allow rule per subnet?

@squeed We do kinda idempotently create one rule per bridge, which would also work if the bridge changes IP subnets which sometimes happens. If possible I'd like to stick with interface name. Is there a good reason to do IP net instead?

@squeed (Member) commented Oct 5, 2017

Disregard my comment about subnets; I'd misread the code.

return fmt.Errorf("failed to initialize iptables helper: %v", err)
}

adminChainName := fmt.Sprintf("CNI-ADMIN-%s", intf)
@squeed (Member) commented on the diff:

Making this name overridable might be useful; admins might want to have just a single chain for all interfaces or, as in the case of ptp, may not have predictable interface (and therefore chain) names.

@dcbw (Member, Author) replied:

@squeed you mean just one chain name for all admin overrides, or making it a conf option?

@squeed (Member) replied:

A conf option, of course :-)

dcbw force-pushed the ipt-fw-allow-interface branch 3 times, most recently from 213212f to 67c2d9d on October 5, 2017
@dcbw (Member, Author) commented Oct 5, 2017

@squeed admin chain name override done.
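With both of the discussed knobs in place, a chained config fragment might look like the following. The exact key names aren't shown in this thread, so "ifName" and "adminChainName" are purely hypothetical option names, following the same shape as the full config example above:

```json
{
  "type": "firewall",
  "ifName": "cni0",
  "adminChainName": "MY-ADMIN-OVERRIDES"
}
```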

func getPrivChainRules(intf string) [][]string {
var rules [][]string
rules = append(rules, []string{"-i", intf, "!", "-o", intf, "-j", "ACCEPT"})
rules = append(rules, []string{"-i", intf, "-o", intf, "-j", "ACCEPT"})
@matthewdupre (Member) commented on the diff:

Why do we need two rules here? Can't we just not match on -o at all?

@dcbw (Member, Author) replied:

@matthewdupre hmm, I mostly just copied what Docker was doing here; assuming they knew what they were doing.

@dcbw (Member, Author) replied:

@matthewdupre consolidated the two rules, you're right we don't need the double-match. The docker code that was adding both was doing it from two unrelated pieces of code which probably aren't aware of each other.
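The consolidation described here can be sketched as follows. Since both original rules matched `-i <intf>` and differed only in their `-o` match (`! -o intf` vs `-o intf`), their union is a single rule matching only the input interface; return traffic is still covered by the conntrack rule. This is a sketch of the idea, not necessarily the exact code in the branch:

```go
package main

import "fmt"

// getPrivChainRules returns the per-interface rules for the private
// CNI-FORWARD chain. A single ACCEPT on "-i intf" covers both the
// bridged (intf -> intf) and routed (intf -> elsewhere) cases that the
// two docker-style rules handled separately.
func getPrivChainRules(intf string) [][]string {
	var rules [][]string
	rules = append(rules, []string{"-i", intf, "-j", "ACCEPT"})
	rules = append(rules, []string{"-o", intf,
		"-m", "conntrack", "--ctstate", "RELATED,ESTABLISHED",
		"-j", "ACCEPT"})
	return rules
}

func main() {
	for _, r := range getPrivChainRules("cni0") {
		fmt.Println(r)
	}
}
```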

@dcbw (Member, Author) commented Nov 20, 2017

@containernetworking/cni-maintainers any more comments on this?

@squeed (Member) commented Nov 21, 2017

Tested and it works well.

Could you add some mention in the README that the rules are not cleaned up?

I've been thinking about how this would work with just a ptp setup instead of a bridge, especially since we're considering doing that for kubenet. There might be a lot of pollution with no cleanup.

A few ideas for how to solve this:

  1. Add the veths to a devgroup and match on devgroup
  2. Match on CIDR instead of interface.

Thoughts? If you want to add this to a follow-up PR that'd be fine.

@squeed (Member) commented Nov 24, 2017

Nevermind, the solution is simple: just hard-code to match on interface veth+

@dcbw (Member, Author) commented Nov 28, 2017

Nevermind, the solution is simple: just hard-code to match on interface veth+

@squeed can you explain a bit more?

@squeed (Member) commented Nov 28, 2017

So, imagine kubenet moves from bridge to ptp. Now we have one rule per interface, rather than one rule total (for the bridge). Given that this plugin doesn't do delete... things start to look pretty ugly pretty quickly.

I can think of two solutions: add some kind of devgroup allow mode that adds each interface to a devgroup and then makes sure there is an allow rule for that group, or use the "interface" parameter to match on an interface name wildcard. The latter is a bit of a problem for non-CNI veths, though; probably too fragile.

dcbw force-pushed the ipt-fw-allow-interface branch 2 times, most recently from 3bd99fa to 50bbfa1 on January 6, 2018
@dcbw (Member, Author) commented Jan 6, 2018

@squeed ok, how about the newly-pushed approach instead. It implements delete by checking whether the given interface is a master for any other interfaces (e.g., a bridge). If so, and there are no child devices still attached (since presumably the last one just got deleted), then DEL will clean up the iptables rules. Otherwise DEL leaves them alone.

If this seems worthwhile, we probably want to use this same approach in the bridge plugin's IPMasq code, since that also never cleans itself up.

@dcbw (Member, Author) commented Jan 19, 2018

@squeed ok my approach is busted (and so would a devgroup one) because DEL runs plugins in reverse order, and so the bridge still has the port attached. Any suggestions?

Update: so it's not totally busted, because my approach does get the container interface name and we can look at IFLA_LINK on that interface for its peer ifindex, and ignore that peer when checking IFLA_MASTER attributes in the host namespace.

dcbw force-pushed the ipt-fw-allow-interface branch 2 times, most recently from 79b6192 to a0691a7 on January 22, 2018
@dcbw (Member, Author) commented Jan 22, 2018

@squeed updated to exclude a container's peer when checking if the bridge has interfaces. Also removed the PrevResult stuff, since plugins are executed in reverse order on DEL and so we don't have a PrevResult to pull the host ifname from.

I guess we could try caching that somewhere from the ADD call, but for now you have to specify the bridge interface for bridge, and you might as well just copy that for iptables-allow since you know it already.
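The DEL-time check discussed above (don't count the deleted container's own veth peer when deciding whether the bridge is empty) boils down to a small piece of logic. In the real plugin the master and peer indices would come from netlink (IFLA_MASTER and IFLA_LINK); this sketch takes them as plain inputs, and the type and function names are illustrative, not the PR's actual code:

```go
package main

import "fmt"

// link models the two netlink attributes the check needs: the real code
// would read the ifindex and IFLA_MASTER via a netlink library, and get
// the container interface's peer ifindex from IFLA_LINK.
type link struct {
	Index  int
	Master int // ifindex of the bridge this link is attached to; 0 if none
}

// bridgeEmptyExceptPeer reports whether nothing other than the deleted
// container's own veth peer is still attached to the bridge. If true,
// the last container is going away and DEL may tear down the rules.
func bridgeEmptyExceptPeer(links []link, bridgeIndex, peerIndex int) bool {
	for _, l := range links {
		if l.Master == bridgeIndex && l.Index != peerIndex {
			return false
		}
	}
	return true
}

func main() {
	// bridge cni0 has ifindex 2; the deleted container's host-side veth
	// (ifindex 5) is still attached because DEL runs plugins in reverse.
	links := []link{{Index: 5, Master: 2}}
	fmt.Println(bridgeEmptyExceptPeer(links, 2, 5)) // prints "true"

	// a second container's veth (ifindex 7) keeps the rules in place
	links = append(links, link{Index: 7, Master: 2})
	fmt.Println(bridgeEmptyExceptPeer(links, 2, 5)) // prints "false"
}
```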

@rosenhouse (Contributor) commented:

@dcbw what do you think about extracting the "find peer interface" logic into a public library function? That way we could use it in #96 too (specifically for this issue).

@vadorovsky (Contributor) commented:

Looks good to me

dcbw requested review from rosenhouse and bboreham on June 12, 2018
@bboreham (Contributor) commented:

What if the system has the INPUT default policy set to DROP?

@dcbw (Member, Author) commented Jun 19, 2018

@bboreham AFAICT even docker 1.13 doesn't add anything to the nat+INPUT chain, just FORWARD, so we're doing nothing different here than docker already does. I'd imagine INPUT defaulting to DROP would just kill all your return traffic :)

@bboreham (Contributor) commented:

Docker didn't expose services from the container network; the approach was to map ports out to the host network. I'm thinking about Kubernetes, where services do live on the container network.

Imagine a system where the admin has carefully set things up to reject incoming packets from anywhere outside a defined set of rules. Shouldn't our firewall plugin allow for this case, and add rules to the INPUT chain?

@dcbw (Member, Author) commented Jun 20, 2018

@bboreham added #168 for the INPUT chain issue.

@rhatdan commented Jul 12, 2018

Any progress on this? This is a blocking issue for podman.

@vadorovsky (Contributor) commented Jul 12, 2018

Is #168 a blocking issue for this PR or should it be solved as a follow-up?

@ashcrow commented Sep 10, 2018

Any progress here? Our QE folks are blocked from merging new OS level CI tests for podman.

@mheon commented Sep 10, 2018

@ashcrow containers/podman#1431

@ashcrow commented Sep 10, 2018

@mheon thanks! I assume that supersedes this PR?

@mheon commented Sep 10, 2018

@ashcrow It's more of a temporary solution until this can be rewritten and merged, which could be some months still

@ashcrow commented Sep 10, 2018

@mheon makes sense. Thanks!

@rhatdan commented Jan 10, 2019

Any update on this?

@dcbw (Member, Author) commented Jan 16, 2019

@rhatdan still waiting on the CHECK changes that @mccv1r0 is working on; they're in progress.

@znmeb commented Feb 7, 2019

@dcbw Is there a workaround for this I can implement on Silverblue 29 via firewall-config?

@gbraad commented Feb 15, 2019

What is the status of this?

}

// Tolerate errors if the container namespace has been torn down already
containerNS, err := ns.GetNS(args.Netns)
@dcbw (Member, Author) commented on the diff:

This plugin shouldn't care about the network namespace, right?

@dcbw
Copy link
Member Author

dcbw commented Apr 12, 2019

This PR was updated and pushed as #290
