
Calico networking broken when host OS uses iptables >= 1.8 #2322

Closed
danderson opened this issue Dec 1, 2018 · 29 comments · Fixed by #7111

@danderson

Pods cannot communicate with each other or the internet when running with Calico networking on Debian Testing (aka Buster).

Expected Behavior

Installing Calico using the getting started manifests (k8s datastore, not etcd) should result in a cluster where pods can talk to each other.

Current Behavior

I bootstrapped a single-node k8s cluster on a Debian Testing (Buster) machine, using kubeadm init --pod-network-cidr=192.168.0.0/16 and KUBECONFIG=/etc/kubernetes/admin.conf kubectl taint nodes --all node-role.kubernetes.io/master-.

I then installed Calico using the instructions at: https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/calico#installing-with-the-kubernetes-api-datastore50-nodes-or-less .

Calico pods start, and once the CNI config is installed other pods start up as well.

However, no pods can talk to any other pods, or to the internet. Packets flow correctly out of the container and onto the host, but never flow back out from there.

After switching the OS back to Debian Stable (Stretch), Calico works flawlessly again.

Possible Solution

I suspect, although I have no proof, that the root cause is the release of iptables 1.8. See the related bug kubernetes/kubernetes#71305. iptables 1.8 switches to using nf_tables in the kernel and splits the tooling into iptables (a translation layer on top of nf_tables) and iptables-legacy (the "classic" iptables). So you end up with nf_tables in the kernel, an nf_tables-aware iptables 1.8 on the host OS, but legacy iptables 1.6 in the networking containers (including calico-node).
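
A quick way to confirm which backend a given iptables binary drives (the version strings shown are typical for these releases; exact output varies by build):

# on the Buster host:
iptables --version          # iptables v1.8.2 (nf_tables)
iptables-legacy --version   # iptables v1.8.2 (legacy)
# inside the calico-node container:
iptables --version          # iptables v1.6.x, no backend suffix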

A breakage in netfilter is consistent with the symptoms I've found in my debugging so far. I'm going to add the debugging I've done so far in a separate post, since it's a lot of data and I want to keep the initial report fairly crisp.

Steps to Reproduce (for bugs)

Create a trivial k8s cluster on Debian Buster machines using kubeadm, then install Calico. Observe that pod<>pod and pod<>internet routing is broken.

Context

I'm the main developer of MetalLB. I've been working on creating a VM-based test harness for MetalLB that cross-tests compatibility against a bunch of k8s network addons, including Calico. I've struggled for the past 2 days with bizarre "none of my network addons seem to work" issues, which I've just figured out is caused by "something that changed recently in Debian Buster" (because I have older Debian Buster clusters on which Calico worked fine).

Your Environment

Calico version

  • Calico: v3.3.1
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes 1.12.3
  • Operating System and version: Debian Buster aka Debian Testing aka "the rolling release of pretty recent versions of everything"
@danderson (Author)

Steps to reproduce

Get a Debian Buster machine. All this is going to be done on a single machine, so physical vs. virtual, specific platform etc. are irrelevant.

Install k8s prereqs:

apt-get install curl ebtables ethtool gpg gpg-agent

Add docker and k8s repos and install docker and kubeadm:

curl -fsSL https://download.docker.com/linux/debian/gpg | apt-key add -
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
cat >/etc/apt/sources.list.d/k8s.list <<EOF
deb [arch=amd64] https://download.docker.com/linux/debian buster stable
deb http://apt.kubernetes.io/ kubernetes-xenial main
EOF
apt-get update
apt-get install docker-ce=18.06.* kubelet kubeadm kubectl
echo br_netfilter >>/etc/modules
modprobe br_netfilter
systemctl start docker
export KUBECONFIG=/etc/kubernetes/admin.conf

Bootstrap a single-node k8s cluster with kubeadm, and remove the master taint:

kubeadm init --pod-network-cidr=192.168.0.0/16
kubectl taint nodes --all node-role.kubernetes.io/master-

Install Calico:

kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/rbac-kdd.yaml
kubectl apply -f https://docs.projectcalico.org/v3.3/getting-started/kubernetes/installation/hosted/kubernetes-datastore/calico-networking/1.7/calico.yaml

Wait a bit for Calico to start up and other pods to schedule. Notice that coredns is crashlooping. This is a symptom of network connectivity problems, because in coredns 1.2.2 the loop-detection plugin interprets "I can't talk to my upstream DNS server" as a fatal problem, and crashes.
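
You can watch the crashloop with, e.g.:

kubectl get pods --all-namespaces -w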

Start 2 playground pods:

kubectl run -it test --image=debian /bin/bash
kubectl run -it test2 --image=debian /bin/bash

Run ip addr in each pod to get their IPs. In the rest of this, I'll use the IPs I got on my installation: test is 192.168.0.5, and test2 is 192.168.0.6.

Try to ping 192.168.0.6 from test. Note that it gets no responses. Leave the ping running.

Back on the host node, use tcpdump to check what packets are flowing:

tcpdump -i any -vvv -n host 192.168.0.6
tcpdump: listening on any, link-type LINUX_SLL (Linux cooked), capture size 262144 bytes
05:50:47.036348 IP (tos 0x0, ttl 64, id 19754, offset 0, flags [DF], proto ICMP (1), length 84)
    192.168.0.5 > 192.168.0.6: ICMP echo request, id 7, seq 32, length 64

Run tcpdump again, specifically on the interface that links to the test pod. Note you get the same output, so we see the packet arriving on the node via the correct calico veth.

Run tcpdump again on the interface that links to test2. Note that you don't see any packets. You can also tcpdump against all other interfaces, to prove that the Echo Request is not going anywhere once it arrives on the host.
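
For reference, the per-interface captures were along these lines (veth names taken from my traces above; yours will differ):

tcpdump -i cali5318d3e5a32 -vvv -n icmp   # echo requests visible, arriving from test
tcpdump -i calie377d8195aa -vvv -n icmp   # silence: nothing is forwarded towards test2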

Add an iptables trace rule:

iptables -t raw -A PREROUTING -d 192.168.0.6 -p icmp -j TRACE

Now, the trick here is that iptables is the new iptables 1.8, so we actually just installed an nf_tables trace rule, not a legacy iptables one. To view the trace, install the nftables package and run nft monitor trace. I get:
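
Concretely (package and command names as on Debian Buster):

apt-get install nftables
nft monitor trace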

trace id 9e2ea253 ip raw PREROUTING packet: iif "cali5318d3e5a32" ether saddr 92:e7:9d:23:1f:20 ether daddr ee:ee:ee:ee:ee:ee ip saddr 192.168.0.5 ip daddr 192.168.0.6 ip dscp cs0 ip ecn not-ect ip ttl 64 ip id 53262 ip length 84 icmp type echo-request icmp code 0 icmp id 8 icmp sequence 71 @th,64,96 18919700218802367107883207936 
trace id 9e2ea253 ip raw PREROUTING rule meta l4proto 1 ip daddr 192.168.0.6 counter packets 60 bytes 5040 nftrace set 1 (verdict continue)
trace id 9e2ea253 ip raw PREROUTING verdict continue 
trace id 9e2ea253 ip raw PREROUTING 
trace id 9e2ea253 ip nat PREROUTING verdict continue mark 0x00040000 
trace id 9e2ea253 ip nat PREROUTING mark 0x00040000 
trace id 9e2ea253 ip filter FORWARD packet: iif "cali5318d3e5a32" oif "calie377d8195aa" ether saddr 92:e7:9d:23:1f:20 ether daddr ee:ee:ee:ee:ee:ee ip saddr 192.168.0.5 ip daddr 192.168.0.6 ip dscp cs0 ip ecn not-ect ip ttl 63 ip id 53262 ip length 84 icmp type echo-request icmp code 0 icmp id 8 icmp sequence 71 @th,64,96 18919700218802367107883207936 
trace id 9e2ea253 ip filter FORWARD rule counter packets 327 bytes 27507 jump DOCKER-USER (verdict jump DOCKER-USER)
trace id 9e2ea253 ip filter DOCKER-USER verdict return mark 0x00010000 
trace id 9e2ea253 ip filter FORWARD rule counter packets 327 bytes 27507 jump DOCKER-ISOLATION-STAGE-1 (verdict jump DOCKER-ISOLATION-STAGE-1)
trace id 9e2ea253 ip filter DOCKER-ISOLATION-STAGE-1 verdict return mark 0x00010000 
trace id 9e2ea253 ip filter FORWARD verdict continue mark 0x00010000 
trace id 9e2ea253 ip filter FORWARD mark 0x00010000 

Note that it's not traversing any calico rulesets, because they're getting programmed in the legacy iptables universe, so we only see Docker rules in this trace.

Install a trace with iptables-legacy to see what's happening in the old universe:

iptables-legacy -t raw -A PREROUTING -d 192.168.0.6 -p icmp -j TRACE

This trace gets dumped into dmesg:

[ 1416.610744] TRACE: raw:PREROUTING:rule:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.610768] TRACE: raw:cali-PREROUTING:rule:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.610788] TRACE: raw:cali-PREROUTING:rule:2 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.610810] TRACE: raw:cali-PREROUTING:return:5 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610851] TRACE: raw:PREROUTING:rule:2 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610870] TRACE: raw:PREROUTING:policy:3 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610899] TRACE: mangle:PREROUTING:rule:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610920] TRACE: mangle:cali-PREROUTING:rule:3 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610941] TRACE: nat:PREROUTING:rule:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610972] TRACE: nat:cali-PREROUTING:rule:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.610997] TRACE: nat:cali-fip-dnat:return:1 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611021] TRACE: nat:cali-PREROUTING:return:2 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611040] TRACE: nat:PREROUTING:rule:2 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611067] TRACE: nat:KUBE-SERVICES:return:10 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611085] TRACE: nat:PREROUTING:policy:3 IN=cali5318d3e5a32 OUT= MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611116] TRACE: mangle:FORWARD:policy:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611135] TRACE: filter:FORWARD:rule:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611157] TRACE: filter:cali-FORWARD:rule:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x40000 
[ 1416.611177] TRACE: filter:cali-FORWARD:rule:2 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611206] TRACE: filter:cali-from-hep-forward:return:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611225] TRACE: filter:cali-FORWARD:rule:3 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611249] TRACE: filter:cali-from-wl-dispatch:rule:2 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611279] TRACE: filter:cali-fw-cali5318d3e5a32:rule:3 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611305] TRACE: filter:cali-fw-cali5318d3e5a32:rule:4 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611342] TRACE: filter:cali-pro-kns.default:rule:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611374] TRACE: filter:cali-pro-kns.default:rule:2 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611401] TRACE: filter:cali-fw-cali5318d3e5a32:rule:5 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611421] TRACE: filter:cali-FORWARD:rule:4 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611456] TRACE: filter:cali-to-wl-dispatch:rule:4 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611500] TRACE: filter:cali-tw-calie377d8195aa:rule:3 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611537] TRACE: filter:cali-tw-calie377d8195aa:rule:4 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611567] TRACE: filter:cali-pri-kns.default:rule:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 
[ 1416.611597] TRACE: filter:cali-pri-kns.default:rule:2 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611636] TRACE: filter:cali-tw-calie377d8195aa:rule:5 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611656] TRACE: filter:cali-FORWARD:rule:5 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611710] TRACE: filter:cali-to-hep-forward:return:1 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 
[ 1416.611732] TRACE: filter:cali-FORWARD:rule:6 IN=cali5318d3e5a32 OUT=calie377d8195aa MAC=ee:ee:ee:ee:ee:ee:92:e7:9d:23:1f:20:08:00 SRC=192.168.0.5 DST=192.168.0.6 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=7967 DF PROTO=ICMP TYPE=8 CODE=0 ID=8 SEQ=212 MARK=0x10000 

I'll skip the blow-by-blow, but the last line shows we matched item 6 in chain cali-FORWARD, which is an ACCEPT for packets with mark 0x10000.

So, both of these traces are ending with an ACCEPT... But somehow between the end of that ACCEPT and the transmission itself, the packet is getting dropped, because we don't see it getting transmitted in tcpdump.

More net-tools outputs for my own test case:

ip route
default via 10.0.2.2 dev ens3 
10.0.2.0/24 dev ens3 proto kernel scope link src 10.0.2.15 
172.17.0.0/16 dev docker0 proto kernel scope link src 172.17.0.1 linkdown 
172.20.0.0/24 dev ens4 proto kernel scope link src 172.20.0.1 
blackhole 192.168.0.0/24 proto bird 
192.168.0.2 dev cali14990621c2e scope link 
192.168.0.3 dev cali9a38139a383 scope link 
192.168.0.4 dev cali9bade543567 scope link 
192.168.0.5 dev cali5318d3e5a32 scope link 
192.168.0.6 dev calie377d8195aa scope link 

Looks like normal Calico routing is getting programmed just fine.

ip rule
0:	from all lookup local 
32766:	from all lookup main 
32767:	from all lookup default 

Nothing weird in policy routing.

ip neigh
192.168.0.5 dev cali5318d3e5a32 lladdr 92:e7:9d:23:1f:20 STALE
192.168.0.3 dev cali9a38139a383 lladdr 26:64:ae:91:e4:09 STALE
192.168.0.2 dev cali14990621c2e lladdr 8a:c8:0f:27:40:e6 STALE
10.0.2.2 dev ens3 lladdr 52:55:0a:00:02:02 REACHABLE
fe80::2 dev ens3 lladdr 52:56:00:00:00:02 router STALE

Note that there is no neighbor entry for 192.168.0.6, which indicates that the kernel never reached the point of "OK, I need to send to 192.168.0.6 on this interface, so I need to do ARP resolution"; the packets are getting dropped before that point.

rp_filter=1 by default on this OS (aka strict reverse path filtering). Even though it shouldn't affect these packets (the reverse path is 100% correct even for strict mode), I turned rp_filter off and observed no change.
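
For completeness, "turned rp_filter off" here means roughly these sysctl knobs:

sysctl -w net.ipv4.conf.all.rp_filter=0
sysctl -w net.ipv4.conf.default.rp_filter=0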

That basically exhausted my network debugging knowledge on Linux, so at that point I started varying k8s versions and Calico versions, and switching out Calico for Weave... and eventually found that downgrading to Debian Stable (with good old iptables 1.6) completely fixes things.

@danderson (Author)

Oh, and I left out an important detail, in case we're thinking kernel weirdness: in all of the above, the kernel is Linux 4.18.0-2-amd64 (aka "whatever Debian Buster installs right now").

@danderson (Author)

It's definitely iptables 1.8. After some friendly inspiration (hi @bradfitz!), I grabbed the iptables 1.6 package out of Debian Stable and hackily overwrote all the binaries on the host OS, so that the host node ends up using the same version of iptables as all the containers (1.6 from Debian Stable):

apt-get install wget binutils xz-utils
wget http://ftp.us.debian.org/debian/pool/main/i/iptables/iptables_1.6.0+snapshot20161117-6_amd64.deb
ar x iptables_1.6.0+snapshot20161117-6_amd64.deb
tar xvf data.tar.xz
cp -f ./sbin/* /sbin
cp -f ./usr/sbin/* /usr/sbin
cp -f ./usr/bin/* /usr/bin
reboot

I did a reboot to fully clear all the system state and start from a clean slate. After the reboot finishes and k8s comes back up, coredns is no longer crashlooping (meaning the pod is able to reach the upstream internet DNS resolver), and I can ping pod-to-pod just fine.

So, the root cause definitely seems to be mixing iptables 1.6 and iptables 1.8 against the same kernel. If you use all iptables 1.6, everything is fine. I'm guessing if you use only iptables 1.8 (which translates into nftables but faithfully emulates the userspace interfaces), everything would also work fine. But with the host OS using iptables 1.8 (which programs nftables) and containers like calico-node using iptables 1.6 (which programs legacy iptables), packet forwarding seems to break.

Given that, my guess as to a fix would be for calico-node to have both versions of iptables available, and pick which one to use based on what the host OS is doing, somehow (e.g. check via netlink if nftables are non-empty?). Either that, or spin separate containers and document that users have to be careful with which one they use.
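
For illustration only, a rough host-side sketch of that kind of detection (a hypothetical heuristic that counts rules in each universe, not a proposed implementation):

# on a Buster host, iptables-save talks to nf_tables and iptables-legacy-save to legacy
nft_rules=$(iptables-save 2>/dev/null | grep -c '^-')
legacy_rules=$(iptables-legacy-save 2>/dev/null | grep -c '^-')
if [ "$nft_rules" -gt "$legacy_rules" ]; then
  echo "host is using the nf_tables backend"
else
  echo "host is using the legacy backend"
fi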

@caseydavenport changed the title from "Calico networking broken on debian testing" to "Calico networking broken when host OS uses iptables >= 1.8" on Dec 1, 2018
@nelljerram (Member) commented Dec 3, 2018

@danderson Thanks for your report and analysis, which we think is spot on. We're hoping to be able to address this within the next couple of Calico releases, but can't promise that for sure yet.

@mrak commented Jul 25, 2019

We have also run into this issue when upgrading our kubernetes nodes to Debian Buster with iptables 1.8

We were able to get around this issue by using

update-alternatives --set iptables /usr/sbin/iptables-legacy

https://wiki.debian.org/nftables#Current_status

@caseydavenport (Member)

We're including support in Calico v3.8.1+ that will allow Calico to run on hosts that use iptables in NFT mode.

Setting the FELIX_IPTABLESBACKEND=NFT option will tell Calico to use the nftables backend. For now, this will need to be set explicitly.
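
With the standard manifests, one way to set it (assuming the calico-node DaemonSet lives in kube-system):

kubectl -n kube-system set env daemonset/calico-node FELIX_IPTABLESBACKEND=NFT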

@mrak commented Jul 26, 2019

Thanks @caseydavenport.

Do you mean that v3.8.1+ will automatically include support for NFT mode?
Or does v3.8.1+ support it, but require setting FELIX_IPTABLESBACKEND=NFT first?

@tmjd (Member) commented Jul 26, 2019

v3.8.1 requires that you set FELIX_IPTABLESBACKEND=NFT.
Automatic detection will hopefully be added in the future; we don't know when that feature will be added.

@tungdam commented Oct 1, 2019

Just a note for anyone who hits the same problem and switches to iptables-legacy but still has no luck: don't forget to check the rules that Docker created via iptables-nft; they're still there if you don't manage them yourself. The syntax is much the same as iptables: iptables-nft-save.
In our case Docker used iptables-nft to create rules on startup and set the default FORWARD policy to DROP. We covered this case with iptables but didn't notice the iptables-nft rules.
I'm using Debian 10, Calico 3.5.1.
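
To check both universes for leftovers, compare the chain-policy lines (e.g. :FORWARD DROP [0:0]) in each dump:

iptables-legacy-save | grep -E '^(\*|:)'   # tables and chain policies, legacy universe
iptables-nft-save | grep -E '^(\*|:)'      # same for the nft universe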

@koeberlue

update-alternatives --set iptables /usr/sbin/iptables-legacy

This just solved two days of debugging after we tried to install Rancher on a Debian Buster cluster. Since Google did not return any matches for the error we got, I'll paste the error message below so that others googling this issue will find this thread:

 [ERROR] Failed to connect to peer wss://10.42.X.X/v3/connect [local ID=10.42.X.X]: dial tcp 10.42.X.X:443: i/o timeout

Thanks, @mrak

@lingyuan2014

Just spent 3 days debugging this issue as I'm using Debian Buster. Could we add a note to https://docs.projectcalico.org/getting-started/kubernetes/self-managed-public-cloud/gce to highlight this nuance? I believe that would save a lot of people some debugging time.

@quangleehong

Sorry, I'm very new to this. Can anyone tell me how and where to set FELIX_IPTABLESBACKEND=NFT for the inbound/outbound traffic of the pod network? Step-by-step instructions would be really appreciated.
Thank you

@tungdam commented Aug 20, 2020

It's noted here: https://docs.projectcalico.org/reference/felix/configuration

It depends on how you're deploying calico-node, but I guess the most common way is a k8s DaemonSet; just set the env like this:

env:
  - name: FELIX_IPTABLESBACKEND
    value: "NFT"

I've never tried it myself, as we switched back to legacy mode instead of nftables :D

@quangleehong

env:
  - name: FELIX_IPTABLESBACKEND
    value: "NFT"

I'm just using the standard Kubernetes 1.18.6 from kubernetes.io, the current one in the git repo.

Everything else follows the installation guide, nothing special.

I would love to switch back to iptables legacy as well, but in CentOS 8 there is no legacy mode any more; how would we get legacy back?
Thank you

@tungdam commented Aug 20, 2020

Maybe you misunderstood my comment. I mean we should set that env var on the calico-node pod. We're deploying calico-node as a DaemonSet.

An example of the env values looks like this:

kubectl -n kube-system describe pod calico-node-tmsrh | fgrep FELIX
      FELIX_IPV6SUPPORT:                  false
      FELIX_LOGSEVERITYSCREEN:            info
      FELIX_PROMETHEUSMETRICSENABLED:     true
      FELIX_PROMETHEUSMETRICSPORT:        13091
      FELIX_DEFAULTENDPOINTTOHOSTACTION:  Accept

If you can provide your Calico version and how you're deploying it, that would be more useful for debugging. Maybe the latest Calico version can detect the iptables mode already; I'm not so sure about that.

About reverting to "legacy" mode: it actually points /usr/sbin/iptables to /usr/sbin/iptables-legacy, which in turn points to /usr/sbin/xtables-legacy-multi. (On Debian it's easy to do all this by just using update-alternatives --config iptables.)
I'm not sure whether CentOS 8 still keeps the legacy backend or uses nftables only. If not, it's better to make Calico work with nftables.
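
On Debian the switch looks like this (the ip6tables alternative exists alongside iptables and is worth switching too):

update-alternatives --set iptables /usr/sbin/iptables-legacy
update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy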

@quangleehong commented Aug 20, 2020

Hi @tungdam,

I'm sorry for not replying sooner, and thank you for your help.

This is my setup for building the fresh new k8s cluster:
Bare-metal cluster containing:
2 masters, 2 workers
OS: CentOS 8.2
Kubernetes: 1.18.x
Docker: 19.03.x

I'm done setting up the cluster, and now I'm planning to use Calico for the pod network to communicate in/out to external networks.
Should I avoid Flannel for pod networking because iptables is not using legacy mode? Or should I install Flannel and run Calico on top of it with the env set as this thread mentions?
I'm sorry for the dumb question; I'm new to k8s and having this issue now.
Thank you

@tungdam commented Aug 21, 2020

I have no experience with Flannel so there's not much I can tell you here, though Calico alone would be enough for pod-to-pod networking. If you need to make it reachable from the external network as well, consider this.
I doubt that the root cause of your issue (though I'm not sure what it is) is the iptables/nftables problem.
If you're new to Calico, I recommend watching this first.

@quangleehong

Thank you @tungdam, I'm looking into it.

@quangleehong

Hi @tungdam,

I deployed calico-node as a DaemonSet, appending the parameters

name: FELIX_IPTABLESBACKEND
value: "NFT"

at the end of the env section, and deployed it with kubectl apply -f name-of-daemonset.yaml.

Now calico-node-xxxxx is stuck at ContainerCreating forever.

What could be causing that?

Thank you

@tungdam commented Aug 24, 2020

Try to get more info from the pod with kubectl describe or kubectl logs, please.
I guess it will be much quicker to solve this kind of setup issue by joining the Calico Slack channel; the folks there are very friendly and helpful, and discussion is quite active as well.

@quangleehong

Hi all,
I used the Tigera operator to automate the calico-node deployment. However, when checking the calico-node DaemonSet I saw FELIX_IPTABLESBACKEND=auto, which according to the documentation defaults to legacy; CentOS 8 no longer has iptables-legacy and uses nftables instead.
I'm not familiar with using calicoctl to edit this value to NFT, so can anyone help me with the steps to change it with the calicoctl command, please?

@quangleehong

Hi all,
Does anyone know the calicoctl patch command to change FELIX_IPTABLESBACKEND to the NFT value?
Thank you for your help

@quangleehong

Hi @tungdam,

I think I have the calico-node pod network done; all Calico pods are running. The details of the calico-node env are below:

[root@leean-k8s-master ~]# kubectl -n calico-system describe pod calico-node-dtx8s | fgrep FELIX
FELIX_DEFAULTENDPOINTTOHOSTACTION: ACCEPT
FELIX_HEALTHENABLED: true
FELIX_TYPHAK8SNAMESPACE: calico-system
FELIX_TYPHAK8SSERVICENAME: calico-typha
FELIX_TYPHACAFILE: /typha-ca/caBundle
FELIX_TYPHACERTFILE: /felix-certs/cert.crt
FELIX_TYPHAKEYFILE: /felix-certs/key.key
FELIX_TYPHACN: <set to the key 'common-name' in secret 'typha-certs'> Optional: true
FELIX_TYPHAURISAN: <set to the key 'uri-san' in secret 'typha-certs'> Optional: true
FELIX_VXLANMTU: 1410
FELIX_WIREGUARDMTU: 1400
FELIX_IPINIPMTU: 1440
FELIX_IPV6SUPPORT: false
FELIX_IPTABLESBACKEND: auto
The method of deployment is the Tigera operator for automated Calico setup.

As you can see, FELIX_IPTABLESBACKEND is auto, which defaults to legacy (documented in Calico v3.16, the latest). Since CentOS 8 has deprecated the legacy mode of iptables and moved to nftables, could you give me some help on how to change the value to NFT? I tried some hints from the internet but they did not help.

Thank you very much

@tungdam commented Aug 28, 2020

Please show me:

kubectl -n calico-system get ds calico-node -o yaml

Basically it will return the YAML config of the DaemonSet. You can simply edit that YAML to add the env var you want and apply it again.

I highly recommend reading up on basic k8s operations.
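
For example (the file name is arbitrary):

kubectl -n calico-system get ds calico-node -o yaml > calico-node.yaml
# add the env var under spec.template.spec.containers[].env, then:
kubectl apply -f calico-node.yaml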

@tmjd (Member) commented Aug 28, 2020

@quangleehong with FELIX_IPTABLESBACKEND set to auto, calico-node should detect the backend to use. If you have it set to auto and still believe it is using the incorrect iptables backend, please open a new issue. In that issue please include logs from calico-node. I'd also suggest collecting output on a node from iptables-legacy and iptables-nft; the auto-detection uses the presence of rules in the two backends to select the appropriate version.
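
For example, on the node in question:

iptables-legacy-save -c
iptables-nft-save -c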

@lalithvaka

@tungdam, we are facing similar issues where our fluent-bit DaemonSet is failing to deploy on master nodes with the following error (it works on our worker nodes with no issues):

Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "7187279a021a00f4392a2a02744417c24907cea4c835b18c90f429114ae36867" network for pod "fluent-bit-xhtkd": networkPlugin cni failed to set up pod "fluent-bit-xhtkd_logging" network: Get https://[10.233.0.1]:443/api/v1/namespaces/logging: dial tcp 10.233.0.1:443: i/o timeout

One of the calico-node logs is below:

2020-08-28 03:03:37.584 [INFO][8] startup.go 290: Early log level set to info
2020-08-28 03:03:37.584 [INFO][8] startup.go 306: Using NODENAME environment for node name
2020-08-28 03:03:37.585 [INFO][8] startup.go 318: Determined node name: kube01
2020-08-28 03:03:37.588 [INFO][8] startup.go 104: Skipping datastore connection test
2020-08-28 03:03:37.779 [INFO][8] startup.go 510: Using IPv4 address from environment: IP=172.16.190.61
2020-08-28 03:03:37.787 [INFO][8] startup.go 543: IPv4 address 172.16.190.61 discovered on interface eth0
2020-08-28 03:03:37.787 [INFO][8] startup.go 715: No AS number configured on node resource, using global value
2020-08-28 03:03:37.794 [INFO][8] startup.go 171: Setting NetworkUnavailable to False
2020-08-28 03:03:39.803 [WARNING][8] startup.go 1176: Failed to set NetworkUnavailable to False; will retry error=Patch https://10.233.0.1:443/api/v1/nodes/kube01/status?timeout=2s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2020-08-28 03:03:41.803 [WARNING][8] startup.go 1176: Failed to set NetworkUnavailable to False; will retry error=Patch https://10.233.0.1:443/api/v1/nodes/kube01/status?timeout=2s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
2020-08-28 03:03:41.875 [INFO][8] startup.go 760: found v4=10.233.64.0/18 in the kubeadm config map
2020-08-28 03:03:41.876 [INFO][8] startup.go 764: found v6= in the kubeadm config map
2020-08-28 03:03:41.881 [INFO][8] startup.go 598: FELIX_IPV6SUPPORT is false through environment variable
2020-08-28 03:03:41.895 [INFO][8] startup.go 215: Using node name: kube01
2020-08-28 03:03:42.478 [INFO][19] allocateip.go 144: Current address is still valid, do nothing currentAddr="10.233.98.0" type="ipipTunnelAddress"
Calico node started successfully
bird: Unable to open configuration file /etc/calico/confd/config/bird.cfg: No such file or directory
bird: Unable to open configuration file /etc/calico/confd/config/bird6.cfg: No such file or directory
2020-08-28 03:03:44.661 [INFO][63] config.go 103: Skipping confd config file.
2020-08-28 03:03:44.662 [INFO][63] run.go 17: Starting calico-confd
2020-08-28 03:03:44.758 [INFO][65] logutils.go 82: Early screen log level set to info
2020-08-28 03:03:44.759 [INFO][65] daemon.go 144: Felix starting up GOMAXPROCS=4 builddate="c769a233f3ade99114fa130428ef5037ee297135" gitcommit="2020-03-30T20:02:38+0000" version="v3.13.2"
2020-08-28 03:03:44.762 [INFO][65] daemon.go 162: Loading configuration...
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "prometheusmetricsenabled"="False"
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "ignorelooserpf"="False"
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdscheme"=""
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdkeyfile"="/calico-secrets/key.pem"
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "prometheusprocessmetricsenabled"="True"
2020-08-28 03:03:44.764 [INFO][65] env_var_loader.go 40: Found felix environment variable: "iptableslocktimeoutsecs"="10"
2020-08-28 03:03:44.765 [INFO][65] env_var_loader.go 40: Found felix environment variable: "healthenabled"="true"
2020-08-28 03:03:44.765 [INFO][65] env_var_loader.go 40: Found felix environment variable: "usagereportingenabled"="False"
2020-08-28 03:03:44.857 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdcafile"="/calico-secrets/ca_cert.crt"
2020-08-28 03:03:44.857 [INFO][65] env_var_loader.go 40: Found felix environment variable: "chaininsertmode"="Insert"
2020-08-28 03:03:44.857 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdendpoints"="https://172.16.190.61:2379,https://172.16.190.62:2379,https://172.16.190.63:2379"
2020-08-28 03:03:44.857 [INFO][65] env_var_loader.go 40: Found felix environment variable: "healthhost"="localhost"
2020-08-28 03:03:44.857 [INFO][65] env_var_loader.go 40: Found felix environment variable: "felixhostname"="kube01"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdcertfile"="/calico-secrets/cert.crt"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "etcdaddr"=""
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "defaultendpointtohostaction"="RETURN"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "logseverityscreen"="info"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "prometheusmetricsport"="9091"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "iptablesbackend"="Legacy"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "ipv6support"="false"
2020-08-28 03:03:44.858 [INFO][65] env_var_loader.go 40: Found felix environment variable: "prometheusgometricsenabled"="True"
2020-08-28 03:03:44.858 [INFO][65] daemon.go 186: Loading config file: /etc/calico/felix.cfg
2020-08-28 03:03:44.860 [INFO][65] config_params.go 268: Merging in config from environment variable: map[chaininsertmode:Insert defaultendpointtohostaction:RETURN etcdaddr: etcdcafile:/calico-secrets/ca_cert.crt etcdcertfile:/calico-secrets/cert.crt etcdendpoints:https://172.16.190.61:2379,https://172.16.190.62:2379,https://172.16.190.63:2379 etcdkeyfile:/calico-secrets/key.pem etcdscheme: felixhostname:kube01 healthenabled:true healthhost:localhost ignorelooserpf:False iptablesbackend:Legacy iptableslocktimeoutsecs:10 ipv6support:false logseverityscreen:info prometheusgometricsenabled:True prometheusmetricsenabled:False prometheusmetricsport:9091 prometheusprocessmetricsenabled:True usagereportingenabled:False]
2020-08-28 03:03:44.861 [INFO][65] config_params.go 277: Ignoring empty configuration parameter. Use value 'none' if your intention is to explicitly disable the default value. name="etcdscheme" source=environment variable
2020-08-28 03:03:44.861 [INFO][65] config_params.go 277: Ignoring empty configuration parameter. Use value 'none' if your intention is to explicitly disable the default value. name="etcdaddr" source=environment variable
2020-08-28 03:03:44.861 [INFO][65] config_params.go 349: Parsing value for LogSeverityScreen: info (from environment variable)
2020-08-28 03:03:44.861 [INFO][65] config_params.go 385: Parsed value for LogSeverityScreen: INFO (from environment variable)
2020-08-28 03:03:44.862 [INFO][65] config_params.go 349: Parsing value for IptablesBackend: Legacy (from environment variable)
2020-08-28 03:03:44.862 [INFO][65] config_params.go 385: Parsed value for IptablesBackend: legacy (from environment variable)
2020-08-28 03:03:44.862 [INFO][65] config_params.go 349: Parsing value for HealthEnabled: true (from environment variable)
2020-08-28 03:03:44.862 [INFO][65] config_params.go 385: Parsed value for HealthEnabled: true (from environment variable)
2020-08-28 03:03:44.862 [INFO][65] config_params.go 349: Parsing value for EtcdEndpoints: https://172.16.190.61:2379,https://172.16.190.62:2379,https://172.16.190.63:2379 (from environment variable)
2020-08-28 03:03:44.863 [INFO][65] config_params.go 385: Parsed value for EtcdEndpoints: [https://172.16.190.61:2379/ https://172.16.190.62:2379/ https://172.16.190.63:2379/] (from environment variable)

@tungdam commented Sep 1, 2020

From your log I can't say that the issue is related to the iptables "mode" detection described in this issue. Maybe you should create another issue with more info. Never mind if you've found the solution already.

@razum2um

I wonder, why doesn't it behave like FELIX_IPTABLESBACKEND=Auto by default, without configuration?
E.g., why does https://docs.projectcalico.org/reference/felix/configuration list Default: Legacy?

@avoidik (Contributor) commented Dec 16, 2021

If you're still facing this issue, just change the felixconfiguration and remove the iptablesBackend definition:

$ kubectl patch felixconfiguration default --type=json -p="[{'op': 'remove', 'path': '/spec/iptablesBackend'}]"
