ERROR: table `nat' is incompatible, use 'nft' tool. #461

Closed
greenpau opened this issue Mar 10, 2020 · 23 comments

Comments

@greenpau

greenpau commented Mar 10, 2020

Is this a BUG REPORT or FEATURE REQUEST? (leave only one on its own line)

/kind feature

Description

I use nftables; when starting a container I get:

ERRO[0000] Error adding network: failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

ERRO[0000] Error while adding pod to CNI network "podman": failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

Error: error configuring network namespace for container 51f6adbaed7d674fb4b48d501eb7ce0605d09e003ac09f6588b98dea7230ca9f: failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

Steps to reproduce the issue:

  1. Create network configuration:
cat >/etc/cni/net.d/podman.conflist <<EOF
{
  "cniVersion": "0.4.0",
  "name": "podman",
  "plugins": [
    {
      "type": "bridge",
      "bridge": "cni-podman0",
      "isGateway": true,
      "ipMasq": true,
      "ipam": {
        "type": "host-local",
        "routes": [
          {
            "dst": "0.0.0.0/0"
          }
        ],
        "ranges": [
          [
            {
              "subnet": "192.168.124.0/24",
              "gateway": "192.168.124.1"
            }
          ]
        ]
      }
    },
    {
      "type": "portmap",
      "capabilities": {
        "portMappings": true
      }
    },
    {
      "type": "firewall",
      "backend": "nftables"
    }
  ]
}
EOF
  1. Pull fedora image and start a container:
podman pull fedora:latest
podman run -it fedora bash

Describe the results you received:

ERRO[0000] Error adding network: failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

ERRO[0000] Error while adding pod to CNI network "podman": failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

Error: error configuring network namespace for container 51f6adbaed7d674fb4b48d501eb7ce0605d09e003ac09f6588b98dea7230ca9f: failed to list chains: running [/usr/sbin/iptables -t nat -S --wait]: exit status 1: iptables v1.8.2 (nf_tables): table `nat' is incompatible, use 'nft' tool.

Describe the results you expected:

No errors.

Additional information you deem important (e.g. issue happens only occasionally):

Output of podman version:

podman version 1.6.4

Output of podman info --debug:

debug:
  compiler: gc
  git commit: ""
  go version: go1.12.12
  podman version: 1.6.4
host:
  BuildahVersion: 1.12.0-dev
  CgroupVersion: v1
  Conmon:
    package: conmon-2.0.6-1.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/conmon
    version: 'conmon version 2.0.6, commit: 6ffbb2ec70dbe5ba56e4bfde946fb04f19dd8bbf'
  Distribution:
    distribution: '"rhel"'
    version: "8.1"
  MemFree: 483997450240
  MemTotal: 540217061376
  OCIRuntime:
    name: runc
    package: runc-1.0.0-64.rc9.module+el8.1.1+5259+bcdd613a.x86_64
    path: /usr/bin/runc
    version: 'runc version spec: 1.0.1-dev'
  SwapFree: 10737414144
  SwapTotal: 10737414144
  arch: amd64
  cpus: 64
  eventlogger: journald
  kernel: 4.18.0-147.5.1.el8_1.x86_64
  os: linux
  rootless: false
  uptime: 662h 43m 16.45s (Approximately 27.58 days)
registries:
  blocked: null
  insecure: null
  search:
  - registry.redhat.io
  - registry.access.redhat.com
  - quay.io
  - docker.io
store:
  ConfigFile: /etc/containers/storage.conf
  ContainerStore:
    number: 4
  GraphDriverName: overlay
  GraphOptions: {}
  GraphRoot: /var/lib/containers/storage
  GraphStatus:
    Backing Filesystem: xfs
    Native Overlay Diff: "true"
    Supports d_type: "true"
    Using metacopy: "false"
  ImageStore:
    number: 2
  RunRoot: /var/run/containers/storage
  VolumePath: /var/lib/containers/storage/volumes

Package info (e.g. output of rpm -q podman or apt list podman):

podman-1.6.4-2.module+el8.1.1+5363+bf8ff1af.x86_64

Additional environment details (AWS, VirtualBox, physical, etc.):

Physical.

Reference: containers/podman#5446

@mars1024
Member

It seems that ipMasq==true requires some iptables actions, but the output iptables v1.8.2 (nf_tables) shows that your iptables is working in nf_tables mode. As far as I know, the iptables utility package go-iptables, which is used by cni-plugins, does not support this mode.

ping to go-iptables owner @squeed

@greenpau
Author

Research notes

The addToNetwork is in ../../../github.com/containernetworking/cni/libcni/api.go:

func (network *cniNetwork) addToNetwork(rt *libcni.RuntimeConf, cni *libcni.CNIConfig) (cnitypes.Result, error) {
        logrus.Infof("About to add CNI network %s (type=%v)", network.name, network.config.Plugins[0].Network.Type)
        res, err := cni.AddNetworkList(context.Background(), network.config, rt)
        if err != nil {
                logrus.Errorf("Error adding network: %v", err)
                return nil, err
        }

        return res, nil
}

Here, the AddNetworkList is the method of *libcni.CNIConfig.
It takes in network.config (*libcni.NetworkConfigList) and rt (*libcni.RuntimeConf).

The *libcni.RuntimeConf:

(*libcni.RuntimeConf)(0xc0004e29c0)({
    ContainerID: (string) (len=64) "ba494ee826f97131fa4a2adb4b2e0c048fa3c4c337b2975b9727d8afe5840b1f",
    NetNS: (string) (len=55) "/var/run/netns/cni-55772b7c-7eaf-353e-dd04-c78d96262650",
    IfName: (string) (len=4) "eth0",
    Args: ([][2]string) (len=4 cap=4) {
        ([2]string) (len=2 cap=2) {
            (string) (len=13) "IgnoreUnknown",
            (string) (len=1) "1"
        },
        ([2]string) (len=2 cap=2) {
            (string) (len=17) "K8S_POD_NAMESPACE",
            (string) (len=15) "sleepy_jennings"
        },
        ([2]string) (len=2 cap=2) {
            (string) (len=12) "K8S_POD_NAME",
            (string) (len=15) "sleepy_jennings"
        },
        ([2]string) (len=2 cap=2) {
            (string) (len=26) "K8S_POD_INFRA_CONTAINER_ID",
            (string) (len=64) "ba494ee826f97131fa4a2adb4b2e0c048fa3c4c337b2975b9727d8afe5840b1f"
        }
    },
    CapabilityArgs: (map[string]interface {}) {
    },
    CacheDir: (string) ""
})

The *libcni.NetworkConfigList:

(*ocicni.cniNetwork)(0xc0005d2180)({
    name: (string) (len=6) "podman",
    filePath: (string) (len=30) "/etc/cni/net.d/podman.conflist",
    config: (*libcni.NetworkConfigList)(0xc000505860)({
        Name: (string) (len=6) "podman",
        CNIVersion: (string) (len=5) "0.4.0",
        DisableCheck: (bool) false,
        Plugins: ([]*libcni.NetworkConfig) (len=3 cap=4) {
            (*libcni.NetworkConfig)(0xc0005c2900)({
                Network: (*types.NetConf)(0xc0003e4c00)({
                CNIVersion: (string) "",
                Name: (string) "",
                Type: (string) (len=6) "bridge",
                Capabilities: (map[string]bool) <nil>,
                IPAM: (types.IPAM) {
                    Type: (string) (len=10) "host-local"
                },
                DNS: (types.DNS) {
                    Nameservers: ([]string) <nil>,
                    Domain: (string) "",
                    Search: ([]string) <nil>,
                    Options: ([]string) <nil>
                },
            }),
            (*libcni.NetworkConfig)(0xc0005c29a0)({
                Network: (*types.NetConf)(0xc0003e4cc0)({
                CNIVersion: (string) "",
                Name: (string) "",
                Type: (string) (len=7) "portmap",
                Capabilities: (map[string]bool) (len=1) {
                    (string) (len=12) "portMappings": (bool) true
                },
                IPAM: (types.IPAM) {
                    Type: (string) ""
                },
                DNS: (types.DNS) {
                    Nameservers: ([]string) <nil>,
                    Domain: (string) "",
                    Search: ([]string) <nil>,
                    Options: ([]string) <nil>
                },
            }),
            (*libcni.NetworkConfig)(0xc0005c2a00)({
                Network: (*types.NetConf)(0xc0003e4d80)({
                CNIVersion: (string) "",
                Name: (string) "",
                Type: (string) (len=8) "firewall",
                Capabilities: (map[string]bool) <nil>,
                    IPAM: (types.IPAM) {
                    Type: (string) ""
                },
                DNS: (types.DNS) {
                    Nameservers: ([]string) <nil>,
                    Domain: (string) "",
                    Search: ([]string) <nil>,
                    Options: ([]string) <nil>
                },
            })
        })
    })
})

@greenpau
Author

It seems that ipMasq==true requires some iptables actions, but the output iptables v1.8.2 (nf_tables) shows that your iptables is working in nf_tables mode. As far as I know, the iptables utility package go-iptables, which is used by cni-plugins, does not support this mode.

@mars1024, I tracked it down. As you said, go-iptables does not support nftables. This is indeed related to IPMasq. I will attempt a PR using google/nftables. In sum, I would need to implement Setup and Teardown functions. It looks pretty straightforward.

@greenpau
Author

Research Notes

The error comes from EnsureChain(). It is invoked from setupChains(). The ib (*main.iptablesBackend) contains the following structure.

The key to distinguishing between iptables and nft is the mode field: if it is nf_tables, then use nft.

(*main.iptablesBackend)(0xc0001a4100)({
    protos: (map[iptables.Protocol]*iptables.IPTables) (len=2) {
        (iptables.Protocol) 0: (*iptables.IPTables)(0xc0001a4240)({
            path: (string) (len=14) "/sbin/iptables",
            proto: (iptables.Protocol) 0,
            hasCheck: (bool) true,
            hasWait: (bool) true,
            hasRandomFully: (bool) true,
            v1: (int) 1,
            v2: (int) 8,
            v3: (int) 2,
            mode: (string) (len=9) "nf_tables"
        }),
        (iptables.Protocol) 1: (*iptables.IPTables)(0xc0001a4380)({
            path: (string) (len=15) "/sbin/ip6tables",
            proto: (iptables.Protocol) 1,
            hasCheck: (bool) true,
            hasWait: (bool) true,
            hasRandomFully: (bool) true,
            v1: (int) 1,
            v2: (int) 8,
            v3: (int) 2,
            mode: (string) (len=9) "nf_tables"
        })
    },
    privChainName: (string) (len=11) "CNI-FORWARD",
    adminChainName: (string) (len=9) "CNI-ADMIN",
    ifName: (string) ""
})
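That mode string is what go-iptables parses out of the iptables -V banner. A minimal stdlib-only sketch of that kind of detection (my own illustration, not go-iptables' actual code):

```go
package main

import (
	"fmt"
	"regexp"
)

// detectIptablesMode extracts the backend mode from `iptables -V` output,
// e.g. "iptables v1.8.2 (nf_tables)" -> "nf_tables". Legacy builds print
// "(legacy)"; very old builds print no suffix at all, which we treat as legacy.
func detectIptablesMode(versionOutput string) string {
	m := regexp.MustCompile(`\(([^)]+)\)`).FindStringSubmatch(versionOutput)
	if m == nil {
		return "legacy"
	}
	return m[1]
}

func main() {
	fmt.Println(detectIptablesMode("iptables v1.8.2 (nf_tables)")) // nf_tables
	fmt.Println(detectIptablesMode("iptables v1.6.1"))             // legacy
}
```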

This structure gets created by the newIptablesBackend() function.

func newIptablesBackend(conf *FirewallNetConf) (FirewallBackend, error) {
    adminChainName := conf.IptablesAdminChainName
    if adminChainName == "" {
        adminChainName = "CNI-ADMIN"
    }

    backend := &iptablesBackend{
        privChainName:  "CNI-FORWARD",
        adminChainName: adminChainName,
        protos:         make(map[iptables.Protocol]*iptables.IPTables),
    }

    for _, proto := range []iptables.Protocol{iptables.ProtocolIPv4, iptables.ProtocolIPv6} {
        ipt, err := iptables.NewWithProtocol(proto)
        if err != nil {
            return nil, fmt.Errorf("could not initialize iptables protocol %v: %v", proto, err)
        }
        backend.protos[proto] = ipt
    }

    return backend, nil
}

The newIptablesBackend() is being invoked by ./plugins/meta/firewall/firewall.go.

Luckily, here is another clue:

func getBackend(conf *FirewallNetConf) (FirewallBackend, error) {
    switch conf.Backend {
    case "iptables":
        return newIptablesBackend(conf)
    case "firewalld":
        return newFirewalldBackend(conf)
    }

    // Default to firewalld if it's running
    if isFirewalldRunning() {
        return newFirewalldBackend(conf)
    }

    // Otherwise iptables
    return newIptablesBackend(conf)
}

There is either firewalld or iptables.

On one hand, we could add a case "nftables":. On the other hand, perhaps it is possible to make this work with firewalld.
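A sketch of what the extended switch could look like, using stub types in place of the plugin's real ones (newNftablesBackend is the hypothetical addition):

```go
package main

import "fmt"

// Stub types standing in for the firewall plugin's FirewallNetConf and
// FirewallBackend; newNftablesBackend is the hypothetical new constructor.
type FirewallNetConf struct{ Backend string }
type FirewallBackend interface{ Name() string }

type stubBackend string

func (s stubBackend) Name() string { return string(s) }

func newIptablesBackend(conf *FirewallNetConf) (FirewallBackend, error)  { return stubBackend("iptables"), nil }
func newFirewalldBackend(conf *FirewallNetConf) (FirewallBackend, error) { return stubBackend("firewalld"), nil }
func newNftablesBackend(conf *FirewallNetConf) (FirewallBackend, error)  { return stubBackend("nftables"), nil }

func isFirewalldRunning() bool { return false } // stub

// getBackend mirrors the plugin's switch with the proposed "nftables" case added.
func getBackend(conf *FirewallNetConf) (FirewallBackend, error) {
	switch conf.Backend {
	case "iptables":
		return newIptablesBackend(conf)
	case "firewalld":
		return newFirewalldBackend(conf)
	case "nftables":
		return newNftablesBackend(conf)
	}
	// Default to firewalld if it's running, otherwise iptables.
	if isFirewalldRunning() {
		return newFirewalldBackend(conf)
	}
	return newIptablesBackend(conf)
}

func main() {
	b, err := getBackend(&FirewallNetConf{Backend: "nftables"})
	if err != nil {
		panic(err)
	}
	fmt.Println(b.Name()) // nftables
}
```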

The next step is newFirewalldBackend() ... to be continued ..

@greenpau
Author

As my first step, I implemented a "donothing" firewall backend: https://github.com/greenpau/origin_containernetworking_plugins/blob/1dde487bf69f688e932ca04fc47fa55b502a45e6/plugins/meta/firewall/nftables.go

The next step is going through iptables/firewalld backends and figuring out what is necessary.

@greenpau
Author

Research Notes

Invoked the following command to start a container:

sudo ./bin/podman --log-level debug run -it nicolaka/netshoot bash

The newly added nftBackend.Add() receives the following result:

cmdAdd() conf: (*main.FirewallNetConf)(0xc000170000)({
    NetConf: (types.NetConf) {
    CNIVersion: (string) (len=5) "0.4.0",
    Name: (string) (len=6) "podman",
    Type: (string) (len=8) "firewall",
    Capabilities: (map[string]bool) <nil>,
    IPAM: (types.IPAM) {
        Type: (string) ""
    },
    DNS: (types.DNS) {
        Nameservers: ([]string) <nil>,
        Domain: (string) "",
        Search: ([]string) <nil>,
        Options: ([]string) <nil>
    },
    RawPrevResult: (map[string]interface {}) <nil>,
    PrevResult: (*current.Result)(0xc00016c210)(
        Interfaces:[
            {Name:cni-podman0 Mac:72:9b:5e:59:50:9d Sandbox:}
            {Name:vetha961efb4 Mac:0a:c8:8c:9d:bf:15 Sandbox:}
            {Name:eth0 Mac:22:15:00:77:83:55 Sandbox:/var/run/netns/cni-614934c3-e870-215c-c74c-95b89fe52340}],
        IP:[{Version:4 Interface:0xc00012edb8 Address:{IP:192.168.124.103 Mask:ffffff00} Gateway:192.168.124.1}],
        Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}], DNS:{Nameservers:[] Domain: Search:[] Options:[]})
    },
    Backend: (string) (len=8) "nftables",
    IptablesAdminChainName: (string) "",
    FirewalldZone: (string) (len=7) "trusted"
})

The IP addressing inside the container (IPv6 references removed) looks like this:

3: eth0@if119: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default
    link/ether 22:15:00:77:83:55 brd ff:ff:ff:ff:ff:ff link-netnsid 0
    inet 192.168.124.103/24 brd 192.168.124.255 scope global eth0
       valid_lft forever preferred_lft forever

The main network namespace looks like this:

54: cni-podman0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
    link/ether 72:9b:5e:59:50:9d brd ff:ff:ff:ff:ff:ff
    inet 192.168.124.1/24 brd 192.168.124.255 scope global cni-podman0
       valid_lft forever preferred_lft forever
119: vetha961efb4@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue master cni-podman0 state UP group default
    link/ether 0a:c8:8c:9d:bf:15 brd ff:ff:ff:ff:ff:ff link-netns cni-614934c3-e870-215c-c74c-95b89fe52340

Inside the container, I issue nslookup www.google.com. In the logs I see the following denies:

Mar 14 11:40:29 dsmgr2 kernel: IPv4 FORWARD drop: IN=cni-podman0 OUT=bond0 MACSRC=22:15:00:77:83:55 MACDST=72:9b:5e:59:50:9d MACPROTO=0800 SRC=192.168.124.103 DST=8.8.8.8 LEN=63 TOS=0x00 PREC=0x00 TTL=63 ID=4728 PROTO=UDP SPT=41139 DPT=53 LEN=43
Mar 14 11:40:30 dsmgr2 kernel: IPv4 FORWARD drop: IN=cni-podman0 OUT=bond0 MACSRC=22:15:00:77:83:55 MACDST=72:9b:5e:59:50:9d MACPROTO=0800 SRC=192.168.124.103 DST=8.8.4.4 LEN=63 TOS=0x00 PREC=0x00 TTL=63 ID=13998 PROTO=UDP SPT=41382 DPT=53 LEN=43

My nft config (/etc/nftables/config.nft) logs the drop. There are some existing virbr0 rules, but nothing relating to the CNI bridge.

        chain FORWARD {
                type filter hook forward priority 0; policy drop;
                oifname "virbr0" ip daddr 192.168.122.0/24 ct state related,established counter packets 0 bytes 0 accept
                iifname "virbr0" ip saddr 192.168.122.0/24 counter packets 0 bytes 0 accept
                iifname "virbr0" oifname "virbr0" counter packets 0 bytes 0 accept
                log flags all prefix "IPv4 FORWARD drop: "
                counter drop
        }

The next step is figuring out which rules we need. The FORWARD chain is a good start ... to be continued ..

@greenpau
Author

Research Notes

The following commands would be sufficient to make the containers talk to the outside world:

nft insert rule filter FORWARD position 51 oifname "cni-podman0" ip daddr 192.168.124.0/24 ct state established,related counter packets 0 bytes 0 accept
nft insert rule filter FORWARD position 51 iifname "cni-podman0" ip saddr 192.168.124.0/24 counter packets 0 bytes 0 accept
nft insert rule filter FORWARD position 51 iifname "cni-podman0" oifname "cni-podman0" counter packets 0 bytes 0 accept

This is what the FORWARD chain looks like after the change:

$ nft list chain ip filter FORWARD -a
table ip filter {
        chain FORWARD { # handle 2
                type filter hook forward priority 0; policy drop;
                oifname "cni-podman0" ip daddr 192.168.124.0/24 ct state established,related counter packets 100 bytes 113413 accept # handle 71
                iifname "cni-podman0" ip saddr 192.168.124.0/24 counter packets 124 bytes 12996 accept # handle 72
                iifname "cni-podman0" oifname "cni-podman0" counter packets 0 bytes 0 accept # handle 73
                oifname "virbr0" ip daddr 192.168.122.0/24 ct state established,related counter packets 0 bytes 0 accept # handle 51
                iifname "virbr0" ip saddr 192.168.122.0/24 counter packets 0 bytes 0 accept # handle 52
                iifname "virbr0" oifname "virbr0" counter packets 0 bytes 0 accept # handle 53
                log prefix "IPv4 FORWARD drop: " flags all # handle 54
                counter packets 10 bytes 630 drop # handle 55
        }
}

A container on 192.168.124.0/24 network is able to reach google.com successfully.
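For illustration, a small helper that renders those three accept rules for an arbitrary bridge and subnet (my own sketch; the plugin would ultimately feed strings like these to the nft tool):

```go
package main

import "fmt"

// forwardRules renders the three FORWARD-chain accept rules from above for a
// given bridge interface and container subnet: return traffic, outbound
// traffic, and bridge-to-bridge traffic.
func forwardRules(bridge, subnet string) []string {
	return []string{
		fmt.Sprintf(`oifname %q ip daddr %s ct state established,related accept`, bridge, subnet),
		fmt.Sprintf(`iifname %q ip saddr %s accept`, bridge, subnet),
		fmt.Sprintf(`iifname %q oifname %q accept`, bridge, bridge),
	}
}

func main() {
	for _, r := range forwardRules("cni-podman0", "192.168.124.0/24") {
		fmt.Println(r)
	}
}
```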

The things necessary to complete cmdAdd() are in here:

    PrevResult: (*current.Result)(0xc00016c210)(
        Interfaces:[
            {Name:cni-podman0 Mac:72:9b:5e:59:50:9d Sandbox:}
            {Name:vetha961efb4 Mac:0a:c8:8c:9d:bf:15 Sandbox:}
            {Name:eth0 Mac:22:15:00:77:83:55 Sandbox:/var/run/netns/cni-614934c3-e870-215c-c74c-95b89fe52340}],
        IP:[{Version:4 Interface:0xc00012edb8 Address:{IP:192.168.124.103 Mask:ffffff00} Gateway:192.168.124.1}],
        Routes:[{Dst:{IP:0.0.0.0 Mask:00000000} GW:<nil>}], DNS:{Nameservers:[] Domain: Search:[] Options:[]})
    },

The first interface in the Interfaces list is cni-podman0. The IP address and subnet mask are in IP.

The logic should be:

  1. If Interfaces has fewer than 3 interfaces, throw an error
  2. The first interface in the list will be used for iifname and oifname
  3. The IP/Mask in IP.Address supplies the ip daddr and ip saddr subnet

@greenpau
Author

greenpau commented Mar 14, 2020

Research Notes

To implement the above logic, let's start working on the Add() method of nftBackend:

func (nb *nftBackend) Add(conf *FirewallNetConf, result *current.Result) error {
    logrus.Errorf("nftBackend.Add() conf: %s", spew.Sdump(conf))
    logrus.Errorf("nftBackend.Add() result: %s", spew.Sdump(result))

    tables, err := nb.conn.ListTables()
    if err != nil {
        return fmt.Errorf("nftBackend.Add() error: %s", err)
    }
    logrus.Errorf("nftBackend.Add() result: %s", spew.Sdump(tables))

    return fmt.Errorf("nftBackend.Add is not supported")
    //return nil
}

The tables look like this.

nftBackend.Add() result: ([]*nftables.Table) (len=5 cap=8) {
    (*nftables.Table)(0xc000127040)({
        Name: (string) (len=6) "filter",
        Use: (uint32) 369098752,
        Flags: (uint32) 0,
        Family: (nftables.TableFamily) 2
    }),
    (*nftables.Table)(0xc0001270a0)({
        Name: (string) (len=6) "filter",
        Use: (uint32) 50331648,
        Flags: (uint32) 0,
        Family: (nftables.TableFamily) 10
    }),
    (*nftables.Table)(0xc000127100)({
        Name: (string) (len=6) "filter",
        Use: (uint32) 50331648,
        Flags: (uint32) 0,
        Family: (nftables.TableFamily) 7
    }),
    (*nftables.Table)(0xc000127180)({
        Name: (string) (len=3) "nat",
        Use: (uint32) 1627389952,
        Flags: (uint32) 0,
        Family: (nftables.TableFamily) 2
    }),
    (*nftables.Table)(0xc0001271e0)({
        Name: (string) (len=3) "nat",
        Use: (uint32) 67108864,
        Flags: (uint32) 0,
        Family: (nftables.TableFamily) 10
    })
}

The TableFamily values are described at https://godoc.org/github.com/google/nftables#TableFamily.

  • 2: IPv4 (NFPROTO_IPV4 = 0x2)
  • 10: IPv6 (NFPROTO_IPV6 = 0xa)
  • 7: bridge (NFPROTO_BRIDGE = 0x7)

The first step is validating that the IPv4 "filter" table exists.

@greenpau
Author

Research Notes

Each individual rule in the FORWARD chain is in the structure referenced in google/nftables#102.

I hoped there would be an easy way to convert that struct to a string for comparison, but there isn't.

Working on the comparison right now.
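One text-based approach would be to normalize the nft list output before comparing, since counters and handles vary between otherwise identical rules. A toy sketch (my own idea, not what the plugin does):

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// normalizeRule reduces an `nft list chain ... -a` line to a comparable form:
// strip the trailing "# handle N" comment, zero out the volatile counter
// values, and collapse whitespace, so logically identical rules compare equal.
func normalizeRule(line string) string {
	line = regexp.MustCompile(`\s*#\s*handle\s+\d+$`).ReplaceAllString(strings.TrimSpace(line), "")
	line = regexp.MustCompile(`counter packets \d+ bytes \d+`).ReplaceAllString(line, "counter")
	return strings.Join(strings.Fields(line), " ")
}

func main() {
	a := normalizeRule(`iifname "cni-podman0" ip saddr 192.168.124.0/24 counter packets 124 bytes 12996 accept # handle 72`)
	b := normalizeRule(`iifname "cni-podman0" ip saddr 192.168.124.0/24 counter packets 0 bytes 0 accept`)
	fmt.Println(a == b) // true
}
```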

greenpau added a commit to greenpau/origin_containernetworking_plugins that referenced this issue Mar 15, 2020
Resolves: containernetworking#461

Signed-off-by: Paul Greenberg <greenpau@outlook.com>
@greenpau
Author

google/nftables does not have a good interface for working with rules. I am switching to calling the nft tool, i.e. os/exec.

@greenpau
Author

Research Notes

Just for my records. I am leaving some nftables functions here:

var testNftConn *nftables.Conn

func getNftConn() (*nftables.Conn, error) {
    if testNftConn != nil {
        return testNftConn, nil
    }
    return &nftables.Conn{}, nil
}

func newNftablesBackend(conf *FirewallNetConf) (FirewallBackend, error) {
    conn, err := getNftConn()
    if err != nil {
        return nil, err
    }

    backend := &nftBackend{
        conn: conn,
    }
    return backend, nil
}

func (nb *nftBackend) getRules() error {
    rules, err := nb.conn.GetRule(nb.targetTable, nb.targetChain)
    if err != nil {
        return fmt.Errorf("nftBackend.Add() GetRule failed: %s", err)
    }

    if len(rules) > 0 {
        nb.targetHandle = rules[0].Handle
    }

    nb.rules = []string{}
    if nb.targetHandle > 0 {
        cmdArgs := []string{"list", "chain", "ip", "filter", "FORWARD", "-a"}
        stdoutLines, _, err := nb.execCommand(cmdArgs)
        if err != nil {
            return err
        }
        for _, line := range stdoutLines {
            line = strings.TrimSpace(line)
            if strings.HasPrefix(line, "table") { // skip the "table ip filter {" header line
                continue
            }
            if strings.HasPrefix(line, "chain") {
                continue
            }
            nb.rules = append(nb.rules, line)
        }
    }
    return nil
}

func (nb *nftBackend) isFilterTableExists() error {
    isTableFound := false
    tables, err := nb.conn.ListTables()
    if err != nil {
        return fmt.Errorf("isFilterTableExists(): %s", err)
    }
    for _, table := range tables {
        if table.Name == "filter" && table.Family == 2 {
            nb.targetTable = table
            isTableFound = true
            break
        }
    }
    if !isTableFound {
        return fmt.Errorf("isFilterTableExists(): the IPv4 filter table does not exist")
    }
    nb.tables = tables
    return nil
}

func (nb *nftBackend) isForwardChainExists() error {
    isChainFound := false
    chains, err := nb.conn.ListChains()
    if err != nil {
        return fmt.Errorf("isForwardChainExists(): %s", err)
    }
    for _, chain := range chains {
        if chain.Name != "FORWARD" {
            continue
        }
        if chain.Type != "filter" {
            continue
        }
        if chain.Table.Name != "filter" {
            continue
        }
        if chain.Table.Family != 2 {
            continue
        }
        nb.targetChain = chain
        isChainFound = true
        break
    }
    if !isChainFound {
        return fmt.Errorf("isForwardChainExists(): the FORWARD chain in IPv4 filter table does not exist")
    }
    nb.chains = chains
    return nil
}

@greenpau
Author

Research Notes

Noticed a lot of CNI-xxx chains in table ip nat when listing chains.

# nft list chains
table ip nat {
        chain PREROUTING {
                type nat hook prerouting priority -100; policy accept;
        }
        chain INPUT {
                type nat hook input priority 100; policy accept;
        }
        chain POSTROUTING {
                type nat hook postrouting priority 100; policy accept;
        }
        chain OUTPUT {
                type nat hook output priority -100; policy accept;
        }
        chain CNI-0c06bdc28bb4c85248530a39 {
        }
        chain CNI-b919aedbfb7e47721111dc17 {
        }
        chain CNI-f2ffcbd84f7c33a6a8a1a55c {
        }
        chain CNI-eb6d91571ef405734b309c20 {
        }
        chain CNI-e667801e98ebd1f763158e1e {
        }

Upon reviewing the nat table, I noticed the following:

table ip nat { # handle 27
        chain PREROUTING { # handle 1
                type nat hook prerouting priority -100; policy accept;
        }

        chain INPUT { # handle 2
                type nat hook input priority 100; policy accept;
        }

        chain POSTROUTING { # handle 3
                type nat hook postrouting priority 100; policy accept;
                ip saddr 192.168.124.8  counter packets 0 bytes 0 jump CNI-0c06bdc28bb4c85248530a39 # handle 8

.. intentionally omitted ..

        }

        chain OUTPUT { # handle 4
                type nat hook output priority -100; policy accept;
        }

        chain CNI-0c06bdc28bb4c85248530a39 { # handle 5
                ip daddr 192.168.124.0/24  counter packets 0 bytes 0 accept # handle 6
                ip daddr != 224.0.0.0/4  counter packets 0 bytes 0 masquerade  # handle 7
        }

.. intentionally omitted ..

}

This basically means that if a packet comes from IP 192.168.124.8 and its destination is the local subnet, i.e. 192.168.124.0/24, the packet is accepted without masquerading. All other packets, except multicast, are NATed (masqueraded).

@greenpau
Author

greenpau commented Mar 16, 2020

Noticed a lot of CNI-xxx chains in table ip nat when listing chains.

When I removed the containers, the chains stayed ...

greenpau added a commit to greenpau/origin_containernetworking_plugins that referenced this issue Mar 16, 2020
Resolves: containernetworking#461

Signed-off-by: Paul Greenberg <greenpau@outlook.com>
@xtreme-sameer-vohra
Contributor

Hey @greenpau
Are you still intending to submit a nftables firewall backend PR?

@greenpau
Author

@xtreme-sameer-vohra, yes. I want to resume my PR next week. Are you using nft?

@xtreme-sameer-vohra
Contributor

Great.
We're not at the moment, but we would consider using it in the future, as it seems more performant and cleaner to operate, with multiple entities able to set namespaced rules.

@dcbw
Member

dcbw commented Jun 24, 2020

I would suggest updating iptables-nft to at least 1.8.3 or 1.8.4. There were known bugs in 1.8.2, so this may already be fixed. Does a later iptables-nft help your original issue?

@greenpau
Author

I would suggest updating iptables-nft to at least 1.8.3 or 1.8.4. There were known bugs in 1.8.2, so this may already be fixed. Does a later iptables-nft help your original issue?

@dcbw, I think there is still an issue. I also think that NAT is impacted too, not just the filter table.

@greenpau
Author

@dcbw , just for the reference:

$ iptables -V
iptables v1.8.4 (nf_tables)

greenpau added a commit to greenpau/origin_containernetworking_plugins that referenced this issue Jul 30, 2020
Resolves: containernetworking#461

Signed-off-by: Paul Greenberg <greenpau@outlook.com>
@nemith

nemith commented Jan 26, 2022

Any plans on picking this back up? I am kind of surprised to find that nftables isn't supported on CentOS 9 with Podman.

@greenpau
Author

Any plans on picking this back up? I am kind of surprised to find that nftables isn't supported on CentOS 9 with Podman.

@nemith, I did. See https://github.com/greenpau/cni-plugins

@egberts

egberts commented Jul 18, 2022

To get rid of that libvirt error, my permanent workaround on a Debian 11 host running the libvirtd daemon is to block the loading of the iptables-related modules:

Create a file at /etc/modprobe.d/nft-only.conf:


#  Source: https://www.gaelanlloyd.com/blog/migrating-debian-buster-from-iptables-to-nftables/
#
blacklist x_tables
blacklist iptable_nat
blacklist iptable_raw
blacklist iptable_mangle
blacklist iptable_filter
blacklist ip_tables
blacklist ipt_MASQUERADE
blacklist ip6table_nat
blacklist ip6table_raw
blacklist ip6table_mangle
blacklist ip6table_filter
blacklist ip6_tables

The libvirtd daemon now starts without any errors.

Post-analysis: apparently, I had the iptables module loaded alongside many nft-related modules; once iptables was gone, the pesky error message went away.
