Networking broken on CentOS/OracleLinux 8.3 #7268
Comments
I think the problem is that iptables is enabled but no rules are configured. Remember that kubespray does not configure iptables for you, so you have to do it by hand or disable it. Try: |
I know that since 8.3, if you are using Mellanox cards, you need the very latest firmware, released in January; otherwise the kernel thinks the card supports IPIP offload even when it does not. Maybe there is a similar issue with VMware since 8.3.
You can also:
or VXLAN CrossSubnet
or VXLAN
|
Thank you for your suggestions, @antonio-guillen and @champtar! As for
I've also verified that
As for the suggestions given by @champtar, I will try them out a bit later today. |
@champtar, the network adapters in question are ESXi vmxnet3. The ethtool diff is as follows:
I will check whether disabling UDP tunnel TX segmentation helps and report back here. Update: after running the following commands I see the RX packet count increasing on
Update 2: I checked DNS, the Prometheus deployments, and the ingresses, and everything is now working flawlessly. |
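The RX packet check described in this comment can be repeated with a small helper. This is a minimal sketch, assuming the tunnel interface is tunl0 (the name used in the issue description) and that the kernel exposes per-interface counters under /sys/class/net. The function names and the SYSFS_NET override are hypothetical and only exist to make the sketch easy to try out.

```shell
# Print the received-packet counter of an interface (default: tunl0).
# SYSFS_NET is a hypothetical override so the helper can be pointed at a
# fake directory tree; on a real host it defaults to /sys/class/net.
rx_packets() {
    iface="${1:-tunl0}"
    cat "${SYSFS_NET:-/sys/class/net}/$iface/statistics/rx_packets"
}

# Succeed (exit 0) only if the counter grew during the wait interval.
# Exit status 2 means the counter could not be read at all.
rx_is_increasing() {
    iface="${1:-tunl0}"
    wait_s="${2:-5}"
    before=$(rx_packets "$iface") || return 2
    sleep "$wait_s"
    after=$(rx_packets "$iface") || return 2
    [ "$after" -gt "$before" ]
}
```

Calling `rx_is_increasing tunl0 10` while pod traffic is flowing between nodes should succeed on a healthy host; a counter that stays at 0 matches the symptom reported in this issue.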
Can you show
or, even better, test with CentOS 8 Stream and report a bug upstream? |
Full
|
ethtool -i (not -k) to see the driver name and version |
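To compare driver details across hosts, the `ethtool -i` output can be reduced to the two fields asked for here. The following is a sketch only: `driver_info` is a hypothetical helper name, and it assumes the usual `key: value` layout that `ethtool -i` prints.

```shell
# Reduce `ethtool -i` output (read from stdin) to "<driver> <version>".
# The anchored pattern /^version:/ deliberately skips lines such as
# "firmware-version:" and "expansion-rom-version:".
driver_info() {
    awk -F': ' '
        /^driver:/  { driver = $2 }
        /^version:/ { version = $2 }
        END         { print driver, version }
    '
}

# Usage on a real host (the interface name is a placeholder):
#   ethtool -i <interface> | driver_info
```

Running this on each OS release gives one line per host that can be compared directly.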
CentOS 8.2:
8.3:
8-stream:
|
Same issue on CentOS-Stream:
|
Can you actually test whether it's broken? It's likely, but just having |
Unfortunately it's still broken. The behavior is exactly the same as with 8.3: the kubespray deployment appears to be successful, but all
Disabling UDP tunnel TX segmentation resolves the issue on CentOS-Stream too. |
I have the same issue on CentOS 8.3. Disabling tx-udp_tnl-csum-segmentation and tx-udp_tnl-segmentation resolved the problem. |
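For reference, disabling the two features named above could look like the sketch below. It assumes `ethtool` is installed and that the command is run as root; the function name and the DRY_RUN switch are hypothetical additions for illustration, and the interface name has to be supplied by the caller. The setting does not survive a reboot.

```shell
# Turn off UDP tunnel TX segmentation offload on one interface.
# DRY_RUN=1 (a hypothetical switch, not an ethtool feature) prints the
# command instead of executing it.
disable_udp_tnl_offload() {
    iface="$1"
    if [ -z "$iface" ]; then
        echo "usage: disable_udp_tnl_offload <interface>" >&2
        return 1
    fi
    # Build the ethtool command line in the positional parameters.
    set -- ethtool -K "$iface" \
        tx-udp_tnl-segmentation off \
        tx-udp_tnl-csum-segmentation off
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"    # requires root privileges
    fi
}
```

`ethtool -k <interface>` lists the current state of both features, which is a quick way to confirm the change took effect.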
For anyone wondering how to make these settings persist across reboots on a network interface:
|
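One possible way to persist the offload settings, assuming the interface is managed by NetworkManager and the installed version supports `ethtool.feature-*` connection properties, is sketched below. This is not necessarily the method posted in the thread; the function name, the DRY_RUN switch, and the connection name argument are hypothetical.

```shell
# Store the offload settings in a NetworkManager connection profile so
# they are reapplied whenever the connection is activated.
# DRY_RUN=1 (hypothetical switch) prints the command instead of running it.
persist_udp_tnl_offload_off() {
    conn="$1"
    if [ -z "$conn" ]; then
        echo "usage: persist_udp_tnl_offload_off <connection-name>" >&2
        return 1
    fi
    set -- nmcli connection modify "$conn" \
        ethtool.feature-tx-udp_tnl-segmentation off \
        ethtool.feature-tx-udp_tnl-csum-segmentation off
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"    # requires root privileges
    fi
}
```

The profile change only takes effect once the connection is reactivated, for example with `nmcli connection up <connection-name>` or at the next reboot.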
Can you also show |
|
Is firewalld disabled? Is there no filtering between the hosts? |
@champtar indeed, Calico requires port 179/tcp to be open, which I missed. Although Calico reached the ready state after I opened that port, other pods (like coredns) reported that they could not reach the API server's container IP address (10.233.0.1 -> no route to host). Unfortunately I cannot say whether this is still related to the Calico compatibility issue discussed here, but to me it seems very likely. (However, I went with flannel as a temporary workaround, as it seems to work.) |
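If firewalld stays enabled, opening the BGP port mentioned above could be done roughly as follows. This is a sketch that assumes firewalld is the active firewall and that the default zone is the right one for the node-to-node interface; `run_or_print`, `open_bgp_port`, and the DRY_RUN switch are hypothetical names introduced for illustration.

```shell
# Run a command, or only print it when DRY_RUN=1 (hypothetical switch).
run_or_print() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Permanently allow 179/tcp (BGP, used by Calico) and reload firewalld
# so the permanent rule becomes active.
open_bgp_port() {
    run_or_print firewall-cmd --permanent --add-port=179/tcp &&
    run_or_print firewall-cmd --reload
}
```

This only covers the port named in the comment; other traffic between the hosts may still be filtered and would need its own rules.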
I was hitting exactly this issue in my environment: Confirmed that running the ethtool commands worked:
I am running RHEL 8.3 with K8s 1.17.6 deployed using kubespray. I had to set the Calico iptables backend to NFT in kubespray, as it defaults to "legacy" mode. The Calico tunneling mode used is IPIP. |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. /close |
@k8s-triage-robot: Closing this issue. |
Has the problem been solved? |
Environment:
ESXi VMs
OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
CentOS 8.2:
CentOS 8.3:
Version of Ansible (ansible --version):
Version of Python (python --version):
Kubespray version (commit) (git rev-parse --short HEAD): 1a91792
Network plugin used: Calico with NFT (calico_iptables_backend: "NFT")
Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):
Command used to invoke ansible: ansible-playbook cluster.yml -b -i inventory/devk8s1/inventory.yml
Output of ansible run:
All cluster.yml playbook tasks are successful on both CentOS 8.2 and 8.3; here's the output from the playbook run on 8.3:
Anything else do we need to know:
Kubespray 2.15 and master do not work with CentOS 8.3. Networking is completely broken after deploying on CentOS 8.3 (and OracleLinux 8.3) hosts. Using the same repository and inventory on CentOS 8.2 works just fine.
All deployments in the kube-system namespace are running fine and there is nothing notable in the log files, but DNS is broken and NodePorts are not reachable. What I've noticed is that the tunl0 interface has 0 RX packets, which is probably the cause of all the issues.
On 8.2:
8.3, after performing the following commands on 8.2 (without kubespray deployed prior to the OS update):