Raspbian 10 fresh install has broken routing (iptables/nf_tables detection) #1597
More debugging information: I tried launching a dnsutils pod. CoreDNS logs are filled with repeated error entries.
Same for kube-dns. And I guess because kube-dns isn't ready, it doesn't have any endpoints, even after 8 hours. :| Logs for metrics-server are filled with errors too. So I'm left with some kind of problem with routing. Traffic originating inside the cluster seems to be getting nowhere. Services can't talk to each other, or to the outside world. The only IP my dnsutils pod can ping is that of the host machine. I uninstalled, rebooted, and reinstalled once again, and the problem persists. So clearly there is some possible system state that causes this on a fresh install. I just don't know what. :(
@ohthehugemanatee Have you checked if you have a firewall running? I have another issue where the install isn't working, and I've found the local Linux firewall (firewalld or UFW) is the issue.
@jfmatth Can you give some more detail? I would expect the k3s installer to validate that... Anyway, no, there's no firewall running - just a default raspbian install. In fact, if I were only trying to solve my own problem I would wipe/reinstall raspbian... but now I'm walking my way through routing in k3s in case this problem hits someone else...
Sorry @ohthehugemanatee, I'm afraid I don't. There is issue #1543 that some of us are seeing, and I noticed that the same install at home didn't behave the same on Linode. The main difference was the firewall. You can see my notes there, but basically, any firewall running before install seems to keep both .13 and .14 from working inside the cluster. Maybe on Raspbian check the firewall status, e.g. with the commands sketched below.
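A hedged sketch of how to check for an active host firewall (assuming stock Debian-family tooling; not all of these will be installed on a given machine):
sudo ufw status                    # reports "Status: inactive" if ufw is present but disabled
sudo systemctl status firewalld    # only exists if firewalld is installed
sudo iptables -L -n                # inspect the raw filter rules directly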
It's true! After uninstall and reinstall, metrics-server logs are still flooded with error entries.
On the host machine, I notice that I have iptables 1.8.2 (nf_tables)... and that used to be a problem... but that's solved, right? I'm looking at kubernetes/kubernetes#82966. The fix got into Kubernetes 1.17, and I'm running k3s v1.17.4+k3s1 (3eee8ac).
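For anyone checking their own machine, a quick way to see which backend the iptables binary uses (assuming a Debian-style alternatives setup):
iptables --version                       # prints e.g. "iptables v1.8.2 (nf_tables)" or "(legacy)"
readlink -f $(which iptables)            # shows the real binary behind the symlink chain
update-alternatives --display iptables   # lists the available backends and the current choice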
Got it! W00tarz! So here's the problem, for future frustrated folk: Raspbian 10 comes with an iptables wrapper around nf_tables in the kernel. So the command iptables exists, but only as a symlink to iptables-nft. It returns the version string iptables v1.8.2 (nf_tables). The fix was to remove the iptables wrapper and explicitly install nftables (sketched below). I then reinstalled, with a reboot for good measure. And hey presto, everything works! I'm leaving this issue open and re-titling/describing, because I believe this should be common to all recently updated Raspbian 10 installs, and it probably indicates something to be improved in the installer.
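A minimal sketch of that fix, assuming standard Raspbian 10 / Debian Buster package names (double-check what depends on iptables before purging it):
sudo apt-get remove --purge iptables   # removes the nf_tables-backed iptables wrapper
sudo apt-get install nftables          # installs the native nftables tooling
sudo reboot                            # come back up with a clean ruleset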
I encountered similar issues. Seems like the same root cause.
Some people report a similar issue after installing Docker. For my own case, it turned out to be related to IPv6 (seemingly unrelated to iptables).
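To rule IPv6 in or out, one generic test (a sketch only - not necessarily the exact fix used above) is to disable it temporarily via sysctl:
sudo sysctl -w net.ipv6.conf.all.disable_ipv6=1      # disables IPv6 on all current interfaces until reboot
sudo sysctl -w net.ipv6.conf.default.disable_ipv6=1  # applies to newly created interfaces too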
ohthehugemanatee you are amazing - I burned up a couple days trying to figure out why this wasn't working on my Pi Bramble. Your solution worked a charm. |
Just did a clean install of raspbian on two RPi4s. The update-alternatives method didn't work for me, and I'm reluctant to disable IPv6. @dictcp's temp fix on the master node (where the failing containers were) worked for now. Not sure whether it'll be flushed after a reboot and I'll need to re-apply it from a script (cron @reboot) for now, or not. Details:
# OS distrib:
nate-mbp17:~ ls ~/Downloads/2020-05-27-raspios-buster-lite-armhf.zip
/Users/xnutsive/Downloads/2020-05-27-raspios-buster-lite-armhf.zip
# Steps I did:
sudo apt-get update && sudo apt-get upgrade
sudo apt-get install vim fish tmux git
# Installation
# Used k3s-ansible to setup.
# Problem
nate-mbp17:~ kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system helm-install-traefik-nspr7 0/1 Completed 0 8m57s 10.42.0.3 rpi2 <none> <none>
kube-system svclb-traefik-vh9mz 2/2 Running 2 7m41s 10.42.1.4 rpi3 <none> <none>
kube-system coredns-8655855d6-q79dn 0/1 Running 1 8m56s 10.42.0.7 rpi2 <none> <none>
kube-system traefik-758cd5fc85-9jx77 1/1 Running 1 7m41s 10.42.1.5 rpi3 <none> <none>
kube-system svclb-traefik-wrflb 2/2 Running 2 7m41s 10.42.0.10 rpi2 <none> <none>
kube-system metrics-server-7566d596c8-fb9t5 0/1 CrashLoopBackOff 4 8m56s 10.42.0.9 rpi2 <none> <none>
kube-system local-path-provisioner-6d59f47c7-5stjg 0/1 CrashLoopBackOff 5 8m56s 10.42.0.8 rpi2 <none> <none>
# After applying the iptables tunnel preroute / output hack
nate-mbp17:~ kubectl get pods -A -o wide
NAMESPACE NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
kube-system helm-install-traefik-nspr7 0/1 Completed 0 22m 10.42.0.3 rpi2 <none> <none>
kube-system svclb-traefik-wrflb 2/2 Running 4 20m 10.42.0.13 rpi2 <none> <none>
kube-system svclb-traefik-vh9mz 2/2 Running 4 20m 10.42.1.6 rpi3 <none> <none>
kube-system traefik-758cd5fc85-9jx77 1/1 Running 2 20m 10.42.1.7 rpi3 <none> <none>
kube-system coredns-8655855d6-q79dn 1/1 Running 2 22m 10.42.0.12 rpi2 <none> <none>
kube-system local-path-provisioner-6d59f47c7-5stjg 1/1 Running 11 22m 10.42.0.14 rpi2 <none> <none>
kube-system metrics-server-7566d596c8-fb9t5 1/1 Running 11 22m 10.42.0.11 rpi2 <none> <none>
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
@ohthehugemanatee I followed your guidance and ran the steps to remove the iptables wrapper and install nftables.
Did you also set up alternatives? It seems like k3s needs to be able to locate an alternative to iptables; just installing nftables isn't enough.
I had a similar experience, although in my case I rebooted a long-running k3s agent and then noticed the network issue. Running the commands sketched below fixed the issue.
I found this solution in the k3s documentation under the Advanced Options and Configuration topic.
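If memory serves, the documented commands are along these lines (paraphrased - verify against the current k3s docs before running):
sudo iptables -F                                                     # flush all existing rules
sudo update-alternatives --set iptables /usr/sbin/iptables-legacy    # switch to the legacy backend
sudo update-alternatives --set ip6tables /usr/sbin/ip6tables-legacy  # same for IPv6
sudo reboot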
@virtualstaticvoid possibly related to #3117 (comment)
@coopstools I didn't have to use update-alternatives, to my memory... but the time between my solution and your problem is long enough to have multiple releases in between. I doubt my diagnosis still applies on modern Raspberry Pi OS installs. Did you find another solution?
This repository uses a bot to automatically label issues which have not had any activity (commit/comment/label) for 180 days. This helps us manage the community issues better. If the issue is still relevant, please add a comment to the issue so the bot can remove the label and we know it is still valid. If it is no longer relevant (or possibly fixed in the latest release), the bot will automatically close the issue in 14 days. Thank you for your contributions.
Version:
k3s version v1.17.4+k3s1 (3eee8ac) on a raspberry pi 4 running Raspbian 10.
K3s arguments:
curl -sfL https://get.k3s.io | sh -
Describe the bug
On a fresh install, no traffic is routed inside the cluster, even for core services. The resolution is to uninstall the default iptables v1.8.2 (nf_tables) and install nftables.
On a fresh install, no traffic is routed inside the cluster. Nodes cannot reach each other or coredns. Core services can't reach each other or the API. Ports are not opened on the physical host: the host is not listening on port 80. The Traefik LB reports that it is listening on port 80, but sudo netstat -tlp | grep 80 disagrees. External hosts cannot access created ingresses.
To Reproduce
kubectl run -it --rm --restart=Never dnsutils --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 sh
Then run wget -O- github.com or wget -O- kubernetes.default and observe "invalid name" errors. Try pinging any IP you please - the DNS server, external IPs - and observe failures.
Expected behavior
Traffic inside the cluster should be routed.
Actual behavior
No traffic is routed inside the cluster. Services (even kube-system) can't reach each other, nothing can reach the API server, etc.
First symptom I noticed was that ingresses failed to open port 80, and services couldn't reach their pods.
Additional context / logs
A fresh uninstall/reinstall on a raspbian host with IP 192.168.1.41 did not resolve it.
This all started with a power loss/reboot of my working pi cluster, after an apt update.
See my eventual resolution. Seems to me that k3s was applying rules to both nftables and iptables-legacy, and there was some conflict.
UPDATE: changed focus now that I know routing is completely borked.
UPDATE 2: rewrite title/description after I discovered/resolved the problem. Left open because I believe it will affect other Raspbian 10 users and could probably use a PR to improve iptables vs nftables behavior.