
K3S Breaks after a firewalld reload #7542

Closed
VirtualEvan opened this issue May 13, 2023 · 2 comments

@VirtualEvan

Environmental Info:
K3s Version:

k3s version v1.26.4+k3s1 (8d0255a)
go version go1.19.8

Node(s) CPU architecture, OS, and Version:

Linux HOSTNAME 4.18.0-489.el8.x86_64 #1 SMP Thu Apr 27 17:02:11 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux

Cluster Configuration:

Single node configuration, CIDR - 10.44.0.0/16

Describe the bug:

Opening the K3S ports with firewalld as explained in https://docs.k3s.io/advanced#red-hat-enterprise-linux--centos and then installing K3S works fine, and any app deployed in K3S is accessible.
However, reloading firewalld afterwards (for example, to set up a non-Kubernetes service) flushes all iptables rules. This breaks every deployment in K3S, making them inaccessible.

I tried restarting the K3S service without success; the K3S iptables rules are not added back.
I have found that rebooting the whole physical server gets everything working again, so there is a step in the boot sequence that I have not been able to reproduce manually.
Uninstalling and re-installing K3S works as well, although that is not very practical.
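
A quick way to confirm the flush is to count the K3S-managed chains around the reload (a rough check on my part; KUBE-* chains come from the embedded kube-proxy and FLANNEL-* from flannel):

# Count K3S-related chains and rules before and after the reload
iptables-save | grep -cE 'KUBE-|FLANNEL'   # non-zero while healthy
firewall-cmd --reload
iptables-save | grep -cE 'KUBE-|FLANNEL'   # drops to (near) zero after the flush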

Steps To Reproduce:

  1. Set up firewalld as instructed here: https://docs.k3s.io/advanced#red-hat-enterprise-linux--centos
    • Allowed 10.44.0.0 instead of 10.42.0.0 (10.42.0.0 is already in use)
  2. Install K3S
    • Beware that the provided script runs the K3S uninstallation script
    • --write-kubeconfig-mode 644
    • --cluster-cidr "10.44.0.0/16"
  3. Deploy whoami service for testing
  4. Check that the whoami service works
  5. Reload firewalld
  6. Check that the whoami service doesn't work anymore
  7. Compare the before_reload.txt and after_reload.txt files written by the script for differences in the iptables rules

Provided script to reproduce the steps above

#!/bin/bash

set -e
current_dir=$(dirname -- "$0")

# Clean up current K3S installation
if [ -f /usr/local/bin/k3s-uninstall.sh ]
then
    /usr/local/bin/k3s-uninstall.sh
fi

# Open firewall for K3S
# https://docs.k3s.io/advanced#red-hat-enterprise-linux--centos
echo "Setting up firewall"
firewall-cmd --permanent --add-port=6443/tcp # apiserver
firewall-cmd --permanent --add-port=10250/tcp # metrics
firewall-cmd --permanent --zone=trusted --add-source=10.44.0.0/16 # pods
firewall-cmd --permanent --zone=trusted --add-source=10.43.0.0/16 # services
firewall-cmd --reload

# Install K3S
# https://docs.k3s.io/quick-start
# https://docs.k3s.io/cli/server
curl -sfL https://get.k3s.io | sh -s - --write-kubeconfig-mode 644 --cluster-cidr "10.44.0.0/16"

# Wait for K3S to be ready
while
    PODS=$(/usr/local/bin/kubectl get pods --namespace kube-system 2>&1)
    [ "$PODS" = "No resources found in kube-system namespace." ]
do
    sleep 3
done
/usr/local/bin/kubectl wait jobs --namespace kube-system --all --for condition=complete --timeout 300s
/usr/local/bin/kubectl wait pods --namespace kube-system --selector='!job-name' --for condition=ready --timeout 300s
/usr/local/bin/kubectl get pods -A

# Test deployment
/usr/local/bin/kubectl apply -f "$current_dir/whoami.yml"

# Wait for deployment to finish
/usr/local/bin/kubectl wait pods --namespace whoami-example --all --for condition=ready --timeout 120s

# Test
curl --fail http://whoami.localhost

# Log iptables before reloading firewalld
iptables -L -v -n --line-numbers > "$current_dir/before_reload.txt"

# Reload firewall
firewall-cmd --reload
sleep 10

# Log iptables after reloading firewalld
iptables -L -v -n --line-numbers > "$current_dir/after_reload.txt"

sleep 30

# Log iptables 30 more seconds after reloading firewalld
iptables -L -v -n --line-numbers > "$current_dir/after_reload_and_30_secs.txt"

# Test
curl --fail http://whoami.localhost

whoami.yml (referenced by the script above):

apiVersion: v1
kind: Namespace
metadata:
  name: whoami-example
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: whoami-deploy
  namespace: whoami-example
  labels:
    app: whoami
spec:
  replicas: 2
  selector:
    matchLabels:
      app: whoami
  template:
    metadata:
      labels:
        app: whoami
    spec:
      containers:
        - name: whoami
          image: traefik/whoami:latest
          ports:
            - name: whoami
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: whoami-service
  namespace: whoami-example
  labels:
    service: whoami
spec:
  type: ClusterIP
  ports:
    - name: http
      port: 80
      protocol: TCP
  selector:
    app: whoami
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: whoami-ingress
  namespace: whoami-example
  annotations:
    traefik.ingress.kubernetes.io/router.entrypoints: web
spec:
  rules:
    - host: whoami.localhost
      http:
        paths:
          - path: /
            pathType: Exact
            backend:
              service:
                name: whoami-service
                port:
                  number: 80

Expected behavior:

K3S should add back all of its iptables rules after a firewalld reload. I am not sure whether there is an option for this that I have missed.

As mentioned above, everything goes back to normal after a system reboot.
Knowing the command to manually reload the K3S iptables rules would be a temporary workaround to run after every firewalld reload. Not ideal, but it would be helpful.
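
For illustration, the kind of stopgap I have in mind would snapshot the Kubernetes-related rules before the reload and replay them afterwards. This is an untested sketch and assumes the KUBE-/FLANNEL/CNI prefixes cover everything K3S manages:

#!/bin/bash
# Untested sketch: carry K3S-managed iptables rules across a firewalld reload.
# Keep the table markers (*filter, *nat, ...), the COMMIT lines, and every
# chain or rule mentioning a Kubernetes-related prefix so the dump stays
# restorable.
iptables-save | grep -E '^\*|^COMMIT|KUBE-|FLANNEL|CNI' > /tmp/k3s-rules.txt
firewall-cmd --reload
# --noflush replays the saved chains and rules without wiping what firewalld
# just reinstalled.
iptables-restore --noflush < /tmp/k3s-rules.txt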

Actual behavior:

All K3S-related iptables rules are flushed after firewalld reloads.
All K3S deployments are inaccessible.

PS: While writing this issue I noticed that ~30 seconds after reloading firewalld, a few rules are added back.
This means that K3S is indeed reactive to firewalld reloads, but for some reason not all of the rules are restored and the services remain inaccessible.

Additional context / logs:

It looks like other, similar technologies have had comparable issues, although I lack the expertise to know whether these specific cases are relevant for K3S:
weaveworks/weave#3586
https://github.com/moby/moby/pull/9397/files

iptables output:
1_iptables_before_firewalld_reload.txt
2_iptables_after_firewalld_reload.txt
3_iptables_after_firewalld_reload_and_30_secs.txt
4_iptables_after_system_reboot.txt
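
Diffing the snapshots shows exactly which rules were and were not restored, for example:

diff 1_iptables_before_firewalld_reload.txt 3_iptables_after_firewalld_reload_and_30_secs.txt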

@VirtualEvan (Author)

Bonus question:

Is there some way to make my system aware of the ports being used by K3S?
The devices I am working on are multi-user and not exclusive to K3S, so I am afraid that at some point somebody will end up trying to bind their own service to one of these ports.
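
One idea, as a sketch only (the service name "k3s" and the port list are my assumptions, not something from the K3S docs): register the ports as a named firewalld service so they show up in firewall-cmd queries, and use ss to see what K3S actually binds:

# See which ports the k3s processes are actually listening on
ss -tlnp | grep k3s

# Register the ports as a named firewalld service (hypothetical name "k3s")
firewall-cmd --permanent --new-service=k3s
firewall-cmd --permanent --service=k3s --add-port=6443/tcp   # apiserver
firewall-cmd --permanent --service=k3s --add-port=10250/tcp  # metrics
firewall-cmd --permanent --add-service=k3s
firewall-cmd --reload   # note: this reload itself triggers the bug described above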

@brandond (Member)

> However, reloading firewalld afterwards (for example, to set up a non-Kubernetes service) flushes all iptables rules. This breaks every deployment in K3S, making them inaccessible.

Don't do that. We don't recommend using self-managed iptables-based firewalls alongside k3s. If you must, make sure that they don't attempt to exclusively manage the iptables ruleset, and make sure that you've allowed access to/from the cluster CIDRs. See the examples at https://docs.k3s.io/advanced#red-hat-enterprise-linux--centos

@github-project-automation github-project-automation bot moved this from New to Done Issue in K3s Development May 15, 2023
@k3s-io k3s-io locked and limited conversation to collaborators May 15, 2023
@brandond brandond converted this issue into discussion #7555 May 15, 2023

This issue was moved to a discussion. You can continue the conversation in discussion #7555.