
Raspbian 10 fresh install has broken routing (iptables/nf_tables detection) #1597

Closed
ohthehugemanatee opened this issue Mar 28, 2020 · 16 comments

@ohthehugemanatee

ohthehugemanatee commented Mar 28, 2020

Version:
k3s version v1.17.4+k3s1 (3eee8ac) on a raspberry pi 4 running Raspbian 10.

K3s arguments:
curl -sfL https://get.k3s.io | sh -

Describe the bug

On a fresh install, no traffic is routed inside the cluster, even for core services. Nodes cannot reach each other or coredns, core services cannot reach each other or the API server, and service ports are never opened on the physical host. The eventual resolution was to uninstall the default iptables v1.8.2 (nf_tables) wrapper and install nftables instead.

The host is not listening on port 80: the Traefik LB reports that it is listening on port 80, but sudo netstat -tlp | grep 80 disagrees, and external hosts cannot access created ingresses.

To Reproduce

  1. Install k3s on Raspbian 10.
  2. Run a shell in a dnsutils container: kubectl run -it --rm --restart=Never dnsutils --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 sh
  3. Inside that container, run wget -O- github.com, or wget -O- kubernetes.default and observe "invalid name" errors. Try pinging any IP you please - the DNS server, external IPs - and observe failures.

Expected behavior
Traffic inside the cluster should be routed.

Actual behavior
No traffic is routed inside the cluster. Services (even kube-system) can't reach each other, nothing can reach the API server, etc.

First symptom I noticed was that ingresses failed to open port 80, and services couldn't reach their pods.

Additional context / logs

Fresh uninstall/reinstall on a raspbian host with IP 192.168.1.41:

$ sudo kubectl get all -n kube-system
NAME                                          READY   STATUS      RESTARTS   AGE
pod/metrics-server-6d684c7b5-w7swt            1/1     Running     0          25m
pod/coredns-6c6bb68b64-tqs4d                  1/1     Running     0          25m
pod/helm-install-traefik-pvsvx                0/1     Completed   0          25m
pod/svclb-traefik-pr5cn                       2/2     Running     0          22m
pod/traefik-7b8b884c8-826lt                   1/1     Running     0          22m
pod/local-path-provisioner-58fb86bdfd-cdjwd   1/1     Running     2          25m

NAME                         TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)                      AGE
service/kube-dns             ClusterIP      10.43.0.10      <none>         53/UDP,53/TCP,9153/TCP       25m
service/metrics-server       ClusterIP      10.43.106.186   <none>         443/TCP                      25m
service/traefik-prometheus   ClusterIP      10.43.118.104   <none>         9100/TCP                     22m
service/traefik              LoadBalancer   10.43.138.141   192.168.1.41   80:30192/TCP,443:31737/TCP   22m

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/svclb-traefik   1         1         1       1            1           <none>          22m

NAME                                     READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/metrics-server           1/1     1            1           25m
deployment.apps/coredns                  1/1     1            1           25m
deployment.apps/traefik                  1/1     1            1           22m
deployment.apps/local-path-provisioner   1/1     1            1           25m

NAME                                                DESIRED   CURRENT   READY   AGE
replicaset.apps/metrics-server-6d684c7b5            1         1         1       25m
replicaset.apps/coredns-6c6bb68b64                  1         1         1       25m
replicaset.apps/traefik-7b8b884c8                   1         1         1       22m
replicaset.apps/local-path-provisioner-58fb86bdfd   1         1         1       25m

NAME                             COMPLETIONS   DURATION   AGE
job.batch/helm-install-traefik   1/1           2m24s      25m
$ sudo kubectl describe service traefik -n kube-system
Name:                     traefik
Namespace:                kube-system
Labels:                   app=traefik
                          chart=traefik-1.81.0
                          heritage=Helm
                          release=traefik
Annotations:              <none>
Selector:                 app=traefik,release=traefik
Type:                     LoadBalancer
IP:                       10.43.138.141
LoadBalancer Ingress:     192.168.1.41
Port:                     http  80/TCP
TargetPort:               http/TCP
NodePort:                 http  30192/TCP
Endpoints:                10.42.0.6:80
Port:                     https  443/TCP
TargetPort:               https/TCP
NodePort:                 https  31737/TCP
Endpoints:                10.42.0.6:443
Session Affinity:         None
External Traffic Policy:  Cluster
Events:                   <none>
$ sudo kubectl describe ingress nginx
Name:             nginx
Namespace:        default
Address:          192.168.1.41
Default backend:  default-http-backend:80 (<none>)
Rules:
  Host                Path  Backends
  ----                ----  --------
  nginx.cluster.vert
                      /   nginx:80 (<none>)
Annotations:
Events:  <none>
$ sudo netstat -tlpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 127.0.0.1:8125          0.0.0.0:*               LISTEN      461/netdata
tcp        0      0 0.0.0.0:19999           0.0.0.0:*               LISTEN      461/netdata
tcp        0      0 127.0.0.1:10248         0.0.0.0:*               LISTEN      2754/k3s
tcp        0      0 127.0.0.1:10249         0.0.0.0:*               LISTEN      2754/k3s
tcp        0      0 127.0.0.1:6444          0.0.0.0:*               LISTEN      2754/k3s
tcp        0      0 127.0.0.1:10256         0.0.0.0:*               LISTEN      2754/k3s
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      523/sshd
tcp        0      0 127.0.0.1:10010         0.0.0.0:*               LISTEN      2821/containerd
tcp6       0      0 ::1:8125                :::*                    LISTEN      461/netdata
tcp6       0      0 :::19999                :::*                    LISTEN      461/netdata
tcp6       0      0 :::10250                :::*                    LISTEN      2754/k3s
tcp6       0      0 :::10251                :::*                    LISTEN      2754/k3s
tcp6       0      0 :::6443                 :::*                    LISTEN      2754/k3s
tcp6       0      0 :::10252                :::*                    LISTEN      2754/k3s
tcp6       0      0 :::30192                :::*                    LISTEN      2754/k3s
tcp6       0      0 :::22                   :::*                    LISTEN      523/sshd
tcp6       0      0 :::31737                :::*                    LISTEN      2754/k3s

This all started with a power loss/reboot of my working pi cluster, after an apt update.

See my eventual resolution below. It seems k3s was applying rules to both nf_tables and iptables-legacy, and the two rule sets conflicted.

UPDATE: changed focus now that I know routing is completely borked.
UPDATE 2: rewrite title/description after I discovered/resolved the problem. Left open because I believe it will affect other Raspbian 10 users and could probably use a PR to improve iptables vs nftables behavior.

@ohthehugemanatee
Author

More debugging information:

I tried launching a dnsutils pod inside the cluster. It can't resolve anything. Even kubernetes.default times out trying to reach internal DNS.

Coredns logs are filled with entries like this:

E0328 22:36:15.894784       1 reflector.go:125] pkg/mod/k8s.io/client-go@v0.0.0-20190620085101-78d2af792bab/tools/cache/reflector.go:98: Failed to list *v1.Endpoints: Get https://10.43.0.1:443/api/v1/endpoints?limit=500&resourceVersion=0: dial tcp 10.43.0.1:443: i/o timeout
[INFO] plugin/ready: Still waiting on: "kubernetes"

Same for v1.Namespace and v1.Service. The logs start with several lines of:
[ERROR] plugin/errors: 2 1889223156.156930755. HINFO: read udp 10.42.0.4:58686->192.168.1.40:53: i/o timeout
where 192.168.1.40:53 is my local DNS server, accessible from the host machine.

And I guess because kube-dns isn't ready, it doesn't have any endpoints, even after 8 hours. :|

Logs for metrics-server are filled with:
E0328 22:56:23.510224 1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:cluster1: unable to fetch metrics from Kubelet cluster1 (cluster1): Get https://cluster1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: i/o timeout
That URL is perfectly accessible from outside the cluster (though unauthorized of course).

So I'm left with some kind of problem with routing. Traffic originating inside the cluster seems to be getting nowhere. Services can't talk to each other, or to the outside world. The only IP my dnsutils pod can ping is that of the host machine, 192.168.1.41.

I uninstalled, rebooted, and reinstalled once again, and the problem persists. So clearly there is some possible system state that causes this on a fresh install. I just don't know what. :(

@ohthehugemanatee ohthehugemanatee changed the title Fresh install not listening on port 80 Fresh install - no routing inside the cluster Mar 29, 2020
@jfmatth

jfmatth commented Mar 29, 2020

@ohthehugemanatee Have you checked if you have a firewall running?

I have another issue where the install isn't working, and I've found the local linux firewall (firewalld or UFW) is the issue.

@ohthehugemanatee
Author

@jfmatth can you give some more detail? I would expect the k3s installer to validate that...

Anyway no, there's no firewall running. Just a default raspbian install. In fact if I was only trying to solve my own problem I would wipe/reinstall raspbian... but now I'm walking my way through routing in k3s in case this problem hits someone else...

@jfmatth

jfmatth commented Mar 30, 2020

Sorry @ohthehugemanatee, I'm afraid I don't. There is issue #1543 that some of us are seeing, and I noticed that the same install at home didn't behave the same on Linode. The main difference was the firewall.

You can see my notes there, but basically, any firewall running before install seems to keep both .13 and .14 from working inside the cluster.

Maybe on Raspbian check the output of iptables --list just to be sure?

@ohthehugemanatee
Author

It's true! After uninstalling, iptables -L and iptables-legacy -L both showed residual rules hanging around. BUT remember, I had rebooted between the uninstall and reinstall... In any case I did an uninstall, then iptables -F; iptables-legacy -F, then a reinstall. Same problem.

Metrics server logs are flooded with entries like this:

E0330 19:53:51.599412       1 manager.go:111] unable to fully collect metrics: unable to fully scrape metrics from source kubelet_summary:cluster1: unable to fetch metrics from Kubelet cluster1 (cluster1): Get https://cluster1:10250/stats/summary?only_cpu_and_memory=true: dial tcp: i/o timeout

On the host machine I can wget that address and get an immediate response (and a certificate failure, since I don't have the cluster's CA in my chain).

I notice that I have iptables 1.8.2 (nf_tables)... and that used to be a problem... but that's solved, right? I'm looking at kubernetes/kubernetes#82966. The fix got into kubernetes 1.17, and I'm running k3s v1.17.4+k3s1 (3eee8ac).

@ohthehugemanatee
Author

Got it! W00tarz!

So here's the problem, for future frustrated folk:

Raspbian 10 comes with an iptables wrapper around nf_tables in the kernel. So the command iptables exists, but only as a symlink to iptables-nft. It returns version string iptables v1.8.2 (nf_tables), which seems like it should be correctly handled in check-config.sh. Still, I found firewall entries both in iptables -L (i.e. nf_tables) and iptables-legacy -L.
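The version-string check described above can be sketched as a small helper. This is an illustration, not the logic from check-config.sh: iptables 1.8+ appends "(nf_tables)" or "(legacy)" to its banner, and treating a missing suffix as "unknown" (pre-1.8 builds) is my assumption.

```shell
#!/bin/sh
# iptables_backend: decide which backend an iptables binary is using,
# given the output of `iptables --version`.
iptables_backend() {
  case "$1" in
    *"(nf_tables)"*) echo "nf_tables" ;;
    *"(legacy)"*)    echo "legacy" ;;
    *)               echo "unknown" ;;  # pre-1.8 builds print no suffix
  esac
}

# Typical use on the host:
#   iptables_backend "$(iptables --version)"
```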

The fix was to remove the iptables wrapper and explicitly install nftables:
sudo apt remove iptables -y && sudo apt install nftables

I then reinstalled with a reboot for good measure. And hey presto, everything works!

I'm leaving this issue open and re-titling/describing, because I believe this should be common to all recently updated Raspbian 10 installs, and it probably indicates something to be improved in the installer.

@ohthehugemanatee ohthehugemanatee changed the title Fresh install - no routing inside the cluster Raspbian 10 fresh install has broken routing (iptables/nf_tables detection) Mar 30, 2020
@dictcp

dictcp commented Apr 12, 2020

I encountered similar issues. It seems the kubernetes.default / 10.43.0.1 route is broken after a reboot or further deployment. My temporary workaround:

pi@raspberrypi:~$ export eth0_IP=`xxxxx` # set IP of your node
pi@raspberrypi:~$ sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
pi@raspberrypi:~$ sudo iptables -t nat -I PREROUTING -d 10.43.0.1/32 -p tcp -m tcp --dport 443 -j DNAT --to-destination $eth0_IP:6443
pi@raspberrypi:~$ sudo iptables -t nat -I OUTPUT -d 10.43.0.1/32 -p tcp -m tcp --dport 443 -j DNAT --to-destination $eth0_IP:6443
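The workaround above can be sketched as a re-runnable script. Deriving the node IP from hostname -I and the first_ipv4 helper are my own additions; the update-alternatives and DNAT rules are the ones from the commands above.

```shell
#!/bin/sh
# first_ipv4: print the first IPv4-looking word from a whitespace-separated
# address list (skips IPv6 entries such as fe80::1).
first_ipv4() {
  for w in $1; do
    case "$w" in
      *[!0-9.]*) ;;                       # contains non-IPv4 characters: skip
      *.*.*.*)   echo "$w"; return 0 ;;   # four dotted components: take it
    esac
  done
  return 1
}

# apply_workaround: DNAT traffic for the in-cluster API service IP
# (10.43.0.1:443) to the node's own API port (6443).
apply_workaround() {
  eth0_IP=$(first_ipv4 "$(hostname -I)") || return 1
  sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
  sudo iptables -t nat -I PREROUTING -d 10.43.0.1/32 -p tcp -m tcp --dport 443 \
    -j DNAT --to-destination "$eth0_IP:6443"
  sudo iptables -t nat -I OUTPUT -d 10.43.0.1/32 -p tcp -m tcp --dport 443 \
    -j DNAT --to-destination "$eth0_IP:6443"
}
```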

@dictcp

dictcp commented Apr 13, 2020

Some people report a similar issue after installing Docker:
#703

In my own case, it turned out to be related to IPv6 (seemingly unrelated to iptables). After I disabled IPv6 via sysctl, everything works properly. I will possibly create a separate issue.

@rudderfeet

ohthehugemanatee you are amazing - I burned up a couple of days trying to figure out why this wasn't working on my Pi bramble. Your solution worked a charm.

@natikgadzhi

Just did a clean install of Raspbian on two RPi4s. The update-alternatives method didn't work for me, and I'm reluctant to disable IPv6. @dictcp's temp fix on the master node (where the failing containers were) worked for now. Not sure whether it will be flushed on reboot, and whether I'll need to re-apply it from a script on boot.

Details:

# OS distrib: 
nate-mbp17:~ ls ~/Downloads/2020-05-27-raspios-buster-lite-armhf.zip
/Users/xnutsive/Downloads/2020-05-27-raspios-buster-lite-armhf.zip

# Steps I did: 
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install vim fish tmux git

# Installation
# Used k3s-ansible to setup. 

# Problem 

nate-mbp17:~ kubectl get pods -A -o wide
NAMESPACE     NAME                                     READY   STATUS             RESTARTS   AGE     IP           NODE   NOMINATED NODE   READINESS GATES
kube-system   helm-install-traefik-nspr7               0/1     Completed          0          8m57s   10.42.0.3    rpi2   <none>           <none>
kube-system   svclb-traefik-vh9mz                      2/2     Running            2          7m41s   10.42.1.4    rpi3   <none>           <none>
kube-system   coredns-8655855d6-q79dn                  0/1     Running            1          8m56s   10.42.0.7    rpi2   <none>           <none>
kube-system   traefik-758cd5fc85-9jx77                 1/1     Running            1          7m41s   10.42.1.5    rpi3   <none>           <none>
kube-system   svclb-traefik-wrflb                      2/2     Running            2          7m41s   10.42.0.10   rpi2   <none>           <none>
kube-system   metrics-server-7566d596c8-fb9t5          0/1     CrashLoopBackOff   4          8m56s   10.42.0.9    rpi2   <none>           <none>
kube-system   local-path-provisioner-6d59f47c7-5stjg   0/1     CrashLoopBackOff   5          8m56s   10.42.0.8    rpi2   <none>           <none>

# After applying the iptables tunnel preroute / output hack
nate-mbp17:~ kubectl get pods -A -o wide
NAMESPACE     NAME                                     READY   STATUS      RESTARTS   AGE   IP           NODE   NOMINATED NODE   READINESS GATES
kube-system   helm-install-traefik-nspr7               0/1     Completed   0          22m   10.42.0.3    rpi2   <none>           <none>
kube-system   svclb-traefik-wrflb                      2/2     Running     4          20m   10.42.0.13   rpi2   <none>           <none>
kube-system   svclb-traefik-vh9mz                      2/2     Running     4          20m   10.42.1.6    rpi3   <none>           <none>
kube-system   traefik-758cd5fc85-9jx77                 1/1     Running     2          20m   10.42.1.7    rpi3   <none>           <none>
kube-system   coredns-8655855d6-q79dn                  1/1     Running     2          22m   10.42.0.12   rpi2   <none>           <none>
kube-system   local-path-provisioner-6d59f47c7-5stjg   1/1     Running     11         22m   10.42.0.14   rpi2   <none>           <none>
kube-system   metrics-server-7566d596c8-fb9t5          1/1     Running     11         22m   10.42.0.11   rpi2   <none>           <none>

@stale

stale bot commented Jul 31, 2021

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.

@stale stale bot added the status/stale label Jul 31, 2021
@coopstools

@ohthehugemanatee I followed your guidance and ran sudo apt remove iptables -y && sudo apt install nftables, followed by a reboot and then an install of k3s, all on a fresh install of Raspbian. Now k3s won't start.

:~ k3s check-config returns

...
- links: aux/ip6tables should link to iptables-detect.sh (fail)
- links: aux/ip6tables-restore should link to iptables-detect.sh (fail)
- links: aux/ip6tables-save should link to iptables-detect.sh (fail)
- links: aux/iptables should link to iptables-detect.sh (fail)
- links: aux/iptables-restore should link to iptables-detect.sh (fail)
- links: aux/iptables-save should link to iptables-detect.sh (fail)
...

Did you also set up alternatives? It seems like k3s needs to be able to locate an alternative to iptables, but just installing nftables isn't enough.

@stale stale bot removed the status/stale label Aug 10, 2021
@virtualstaticvoid

I had a similar experience, although in my case I rebooted a long-running k3s agent and then noticed the network issue.

Running the following commands fixed the issue:

sudo iptables -F
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy
sudo reboot

I found this solution in the k3s documentation under the Advanced Options and Configuration topic.
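A quick way to confirm which iptables alternative is actually active, before and after the switch above, is to parse the "Value:" line that update-alternatives --query prints. A sketch; the helper name is made up for illustration.

```shell
#!/bin/sh
# active_alternative: given the output of `update-alternatives --query NAME`,
# print the path the alternative currently points at.
active_alternative() {
  printf '%s\n' "$1" | awk '$1 == "Value:" { print $2 }'
}

# Typical use on the host:
#   active_alternative "$(update-alternatives --query iptables)"
```

After the update-alternatives --set commands above, this should print /usr/sbin/iptables-legacy.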

@brandond
Member

brandond commented Sep 1, 2021

@virtualstaticvoid possibly related to #3117 (comment)

@ohthehugemanatee
Author

@coopstools I didn't have to run update-alternatives, as far as I remember... but enough time has passed between my solution and your problem for multiple releases in between. I doubt my diagnosis still applies on modern Raspberry Pi OS installs. Did you find another solution?

@stale

stale bot commented Mar 26, 2022

This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.


8 participants