
Networking broken on CentOS/OracleLinux 8.3 #7268

Closed
karlism opened this issue Feb 8, 2021 · 23 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments


karlism commented Feb 8, 2021

Environment:
ESXi VMs

  • OS (printf "$(uname -srm)\n$(cat /etc/os-release)\n"):
    CentOS 8.2:
Linux 4.18.0-193.28.1.el8_2.x86_64 x86_64
NAME="CentOS Linux"
VERSION="8 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="8"

CentOS 8.3:

Linux 4.18.0-240.10.1.el8_3.x86_64 x86_64
NAME="CentOS Linux"
VERSION="8"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="8"
PLATFORM_ID="platform:el8"
PRETTY_NAME="CentOS Linux 8"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:8"
HOME_URL="https://centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"
CENTOS_MANTISBT_PROJECT="CentOS-8"
CENTOS_MANTISBT_PROJECT_VERSION="8"
  • Version of Ansible (ansible --version):
ansible 2.9.17
  config file = /home/username/.ansible.cfg
  configured module search path = ['/home/username/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
  ansible python module location = /usr/local/lib/python3.8/site-packages/ansible
  executable location = /usr/local/bin/ansible
  python version = 3.8.6 (default, Oct 13 2020, 09:04:17) [Clang 10.0.1 ]
  • Version of Python (python --version):
Python 3.8.6

Kubespray version (commit) (git rev-parse --short HEAD):
1a91792

Network plugin used:
Calico with NFT (calico_iptables_backend: "NFT")

Full inventory with variables (ansible -i inventory/sample/inventory.ini all -m debug -a "var=hostvars[inventory_hostname]"):

Command used to invoke ansible:
ansible-playbook cluster.yml -b -i inventory/devk8s1/inventory.yml

Output of ansible run:

All cluster.yml playbook tasks succeed on both CentOS 8.2 and 8.3; here's the output from a playbook run on 8.3:

PLAY RECAP ****************************************************************************
XXXdevketcd1               : ok=123  changed=30   unreachable=0    failed=0    skipped=228  rescued=0    ignored=0   
YYYdevketcd1               : ok=123  changed=30   unreachable=0    failed=0    skipped=228  rescued=0    ignored=0   
devkmaster1a            : ok=469  changed=91   unreachable=0    failed=0    skipped=929  rescued=0    ignored=1   
devkmaster1b            : ok=471  changed=92   unreachable=0    failed=0    skipped=927  rescued=0    ignored=1   
devknode1a              : ok=348  changed=65   unreachable=0    failed=0    skipped=591  rescued=0    ignored=1   
devknode1b              : ok=348  changed=65   unreachable=0    failed=0    skipped=591  rescued=0    ignored=1   
devknode1c              : ok=348  changed=65   unreachable=0    failed=0    skipped=591  rescued=0    ignored=1   
localhost                  : ok=1    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0   
ZZZdevketcd1               : ok=133  changed=33   unreachable=0    failed=0    skipped=240  rescued=0    ignored=0   

Monday 08 February 2021  16:59:46 +0100 (0:00:00.675)       0:35:42.779 ******* 
=============================================================================== 
Gen_certs | Write etcd master certs ---------------------------------------------------------------------------------------------- 124.79s
container-engine/containerd : ensure containerd packages are installed ---------------------------------------------------------------------------------------------- 75.03s
kubernetes/control-plane : Joining control plane node to the cluster. ---------------------------------------------------------------------------------------------- 51.19s
kubernetes/control-plane : kubeadm | Initialize first master ---------------------------------------------------------------------------------------------- 45.42s
download | Download files / images ---------------------------------------------------------------------------------------------- 27.83s
kubernetes/kubeadm : Join to cluster ---------------------------------------------------------------------------------------------- 20.96s
bootstrap-os : Install EPEL for Oracle Linux repo package ---------------------------------------------------------------------------------------------- 20.89s
Gather necessary facts ---------------------------------------------------------------------- 20.85s
container-engine/crictl : download_file | Download item ---------------------------------------------------------------------------------------------- 19.27s
kubernetes/preinstall : Install packages requirements ---------------------------------------------------------------------------------------------- 17.35s
container-engine/crictl : extract_file | Unpacking archive ---------------------------------------------------------------------------------------------- 17.10s
bootstrap-os : Install libselinux python package ---------------------------------------------------------------------------------------------- 17.02s
kubernetes-apps/ansible : Kubernetes Apps | Lay Down CoreDNS Template ---------------------------------------------------------------------------------------------- 15.99s
download_container | Download image if required ---------------------------------------------------------------------------------------------- 15.34s
download | Download files / images ---------------------------------------------------------------------------------------------- 15.17s
download_container | Download image if required ---------------------------------------------------------------------------------------------- 13.25s
download | Download files / images ---------------------------------------------------------------------------------------------- 12.69s
kubernetes/node-label : Set label to node ---------------------------------------------------------------------------------------------- 12.55s
download | Download files / images ---------------------------------------------------------------------------------------------- 12.14s
download | Download files / images ---------------------------------------------------------------------------------------------- 12.09s

Anything else do we need to know:

Kubespray 2.15 and master do not work with CentOS 8.3. Networking is completely broken after deploying on CentOS 8.3 (and Oracle Linux 8.3) hosts, while using the same repository and inventory on CentOS 8.2 works just fine.
All deployments in the kube-system namespace are running fine and there is nothing notable in the log files, but DNS is broken and NodePorts are not reachable. What I've noticed is that the tunl0 interface has 0 RX packets, which is probably the cause of all the issues.

On 8.2:

[username@devkmaster1a ~]$ lsmod 
Module                  Size  Used by
nft_chain_nat_ipv6     16384  4
nf_conntrack_ipv6      20480  1
nf_nat_ipv6            16384  1 nft_chain_nat_ipv6
xt_CT                  16384  8
nf_conntrack_netlink    49152  0
ipt_rpfilter           16384  1
xt_multiport           16384  7
ip_set_hash_ip         36864  1
ip_set_hash_net        36864  4
veth                   28672  0
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
xt_addrtype            16384  5
xt_set                 16384  13
ip_set_hash_ipportnet    40960  1
ip_set_hash_ipportip    36864  2
ip_set_hash_ipport     36864  9
ip_set_bitmap_port     16384  4
ip_set                 49152  7 ip_set_hash_ipportnet,ip_set_bitmap_port,ip_set_hash_ip,xt_set,ip_set_hash_net,ip_set_hash_ipport,ip_set_hash_ipportip
dummy                  16384  0
nft_chain_route_ipv4    16384  1
ipt_MASQUERADE         16384  4
xt_conntrack           16384  17
xt_comment             16384  235
nft_counter            16384  205
xt_mark                16384  85
nft_compat             20480  387
nft_chain_nat_ipv4     16384  4
nf_nat_ipv4            16384  2 ipt_MASQUERADE,nft_chain_nat_ipv4
nf_nat                 36864  2 nf_nat_ipv6,nf_nat_ipv4
nf_tables             151552  302 nft_chain_route_ipv4,nft_compat,nft_chain_nat_ipv6,nft_chain_nat_ipv4,nft_counter
nfnetlink              16384  4 nft_compat,nf_conntrack_netlink,nf_tables,ip_set
nf_conntrack_ipv4      16384  22
nf_defrag_ipv4         16384  1 nf_conntrack_ipv4
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  33
ip_vs                 172032  39 ip_vs_rr,ip_vs_sh,ip_vs_wrr
nf_defrag_ipv6         20480  2 nf_conntrack_ipv6,ip_vs
overlay               126976  21
vmw_vsock_vmci_transport    32768  1
vsock                  40960  2 vmw_vsock_vmci_transport
xfs                  1519616  1
intel_rapl_msr         16384  0
intel_rapl_common      24576  1 intel_rapl_msr
sb_edac                24576  0
crct10dif_pclmul       16384  1
crc32_pclmul           16384  0
ghash_clmulni_intel    16384  0
vmw_balloon            24576  0
intel_rapl_perf        20480  0
joydev                 24576  0
pcspkr                 16384  0
vmw_vmci               81920  2 vmw_balloon,vmw_vsock_vmci_transport
i2c_piix4              24576  0
nf_conntrack          155648  10 xt_conntrack,nf_conntrack_ipv6,nf_conntrack_ipv4,nf_nat,nf_nat_ipv6,ipt_MASQUERADE,nf_nat_ipv4,nf_conntrack_netlink,xt_CT,ip_vs
libcrc32c              16384  4 nf_conntrack,nf_nat,xfs,ip_vs
br_netfilter           24576  0
bridge                192512  1 br_netfilter
stp                    16384  1 bridge
llc                    16384  2 bridge,stp
ip_tables              28672  0
ext4                  749568  7
mbcache                16384  1 ext4
jbd2                  122880  1 ext4
ata_generic            16384  0
vmwgfx                352256  1
sd_mod                 53248  3
sg                     40960  0
drm_kms_helper        212992  1 vmwgfx
syscopyarea            16384  1 drm_kms_helper
crc32c_intel           24576  15
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
ttm                   114688  1 vmwgfx
ata_piix               36864  0
drm                   536576  4 vmwgfx,drm_kms_helper,ttm
serio_raw              16384  0
libata                270336  2 ata_piix,ata_generic
vmxnet3                61440  0
vmw_pvscsi             28672  2
dm_mirror              28672  0
dm_region_hash         20480  1 dm_mirror
dm_log                 20480  2 dm_region_hash,dm_mirror
dm_mod                151552  23 dm_log,dm_mirror
fuse                  131072  1

[username@devkmaster1a ~]$ ifconfig tunl0
tunl0: flags=193<UP,RUNNING,NOARP>  mtu 1440
        inet 10.233.103.0  netmask 255.255.255.255
        tunnel   txqueuelen 1000  (IPIP Tunnel)
        RX packets 30785  bytes 12418805 (11.8 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 32600  bytes 12507878 (11.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

NAME                 TYPE       CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE
netchecker-service   NodePort   10.233.49.116   <none>        8081:31081/TCP   146m

NAME                 ENDPOINTS           AGE
netchecker-service   10.233.93.10:8081   150m

[username@devkmaster1a ~]$ curl devkmaster1a:31081/api/v1/connectivity_check
{"Message":"All 20 pods successfully reported back to the server","Absent":null,"Outdated":null}

[username@devkmaster1a ~]$ kubectl run dnsutils-${RANDOM} --rm -i -t --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf 
search default.svc.dev-cluster.local svc.dev-cluster.local dev-cluster.local
nameserver 169.254.25.10
options ndots:5

/ # ping -c 2 169.254.25.10
PING 169.254.25.10 (169.254.25.10): 56 data bytes
64 bytes from 169.254.25.10: seq=0 ttl=64 time=0.113 ms
64 bytes from 169.254.25.10: seq=1 ttl=64 time=0.067 ms
--- 169.254.25.10 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.067/0.090/0.113 ms

/ # ping -c 2 google.com
PING google.com (74.125.140.101): 56 data bytes
64 bytes from 74.125.140.101: seq=0 ttl=109 time=7.708 ms
64 bytes from 74.125.140.101: seq=1 ttl=109 time=7.404 ms
--- google.com ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 7.404/7.556/7.708 ms

On 8.3, after running the following commands on the 8.2 hosts (Kubespray was deployed only after the OS update):

ansible -b -a "yum update -y" 'devkmaster1?,devknode1?'
ansible -b -a "yum autoremove -y" 'devkmaster1?,devknode1?'
ansible -b -m reboot 'devkmaster1?,devknode1?'
[username@devkmaster1a ~]$ time kubectl get nodes
NAME              STATUS   ROLES                  AGE   VERSION
devkmaster1a   Ready    control-plane,master   14m   v1.20.2
devkmaster1b   Ready    control-plane,master   14m   v1.20.2
devknode1a     Ready    <none>                 12m   v1.20.2
devknode1b     Ready    <none>                 12m   v1.20.2
devknode1c     Ready    <none>                 12m   v1.20.2

real	0m15.198s
user	0m0.123s
sys	0m0.029s

[username@devkmaster1a ~]$ sudo /usr/local/bin/calicoctl node status
Calico process is running.
IPv4 BGP status
+--------------+-------------------+-------+----------+-------------+
| PEER ADDRESS |     PEER TYPE     | STATE |  SINCE   |    INFO     |
+--------------+-------------------+-------+----------+-------------+
| 172.16.35.11 | node-to-node mesh | up    | 15:54:08 | Established |
| 172.16.35.12 | node-to-node mesh | up    | 15:54:08 | Established |
| 172.16.35.15 | node-to-node mesh | up    | 15:54:09 | Established |
| 172.16.35.16 | node-to-node mesh | up    | 15:54:09 | Established |
| 172.16.35.17 | node-to-node mesh | up    | 15:54:08 | Established |
+--------------+-------------------+-------+----------+-------------+
IPv6 BGP status
No IPv6 peers found.

[username@devkmaster1a ~]$ lsmod
Module                  Size  Used by
xt_CT                  16384  8
nf_conntrack_netlink    49152  0
xt_multiport           16384  3
ipt_rpfilter           16384  1
ip_set_hash_ip         36864  1
ip_set_hash_net        36864  3
veth                   28672  0
ipip                   16384  0
tunnel4                16384  1 ipip
ip_tunnel              28672  1 ipip
xt_addrtype            16384  5
xt_set                 16384  10
ip_set_hash_ipportnet    40960  1
ip_set_hash_ipportip    36864  2
ip_set_bitmap_port     16384  4
ip_set_hash_ipport     36864  8
ip_set                 49152  7 ip_set_hash_ipportnet,ip_set_bitmap_port,ip_set_hash_ip,xt_set,ip_set_hash_net,ip_set_hash_ipport,ip_set_hash_ipportip
dummy                  16384  0
ipt_MASQUERADE         16384  4
xt_conntrack           16384  9
xt_comment             16384  130
nft_counter            16384  127
xt_mark                16384  43
nft_compat             20480  225
nft_chain_nat          16384  4
nf_nat                 45056  2 ipt_MASQUERADE,nft_chain_nat
nf_tables             167936  204 nft_compat,nft_counter,nft_chain_nat
nfnetlink              16384  4 nft_compat,nf_conntrack_netlink,nf_tables,ip_set
ip_vs_sh               16384  0
ip_vs_wrr              16384  0
ip_vs_rr               16384  5
ip_vs                 172032  11 ip_vs_rr,ip_vs_sh,ip_vs_wrr
overlay               135168  14
vsock_loopback         16384  0
vmw_vsock_virtio_transport_common    32768  1 vsock_loopback
vmw_vsock_vmci_transport    32768  1
vsock                  45056  5 vmw_vsock_virtio_transport_common,vsock_loopback,vmw_vsock_vmci_transport
xfs                  1511424  1
intel_rapl_msr         16384  0
intel_rapl_common      24576  1 intel_rapl_msr
sb_edac                24576  0
crct10dif_pclmul       16384  1
crc32_pclmul           16384  0
ghash_clmulni_intel    16384  0
vmw_balloon            24576  0
intel_rapl_perf        20480  0
joydev                 24576  0
pcspkr                 16384  0
vmw_vmci               86016  2 vmw_balloon,vmw_vsock_vmci_transport
i2c_piix4              24576  0
ip_tables              28672  0
ext4                  761856  7
mbcache                16384  1 ext4
jbd2                  131072  1 ext4
ata_generic            16384  0
vmwgfx                364544  1
drm_kms_helper        217088  1 vmwgfx
sd_mod                 53248  3
syscopyarea            16384  1 drm_kms_helper
sysfillrect            16384  1 drm_kms_helper
sysimgblt              16384  1 drm_kms_helper
fb_sys_fops            16384  1 drm_kms_helper
sg                     40960  0
ttm                   110592  1 vmwgfx
drm                   557056  4 vmwgfx,drm_kms_helper,ttm
ata_piix               36864  0
libata                270336  2 ata_piix,ata_generic
serio_raw              16384  0
vmxnet3                65536  0
vmw_pvscsi             28672  2
dm_mirror              28672  0
dm_region_hash         20480  1 dm_mirror
dm_log                 20480  2 dm_region_hash,dm_mirror
dm_mod                151552  23 dm_log,dm_mirror
fuse                  131072  1
nf_conntrack          172032  6 xt_conntrack,nf_nat,ipt_MASQUERADE,nf_conntrack_netlink,xt_CT,ip_vs
nf_defrag_ipv6         20480  2 nf_conntrack,ip_vs
nf_defrag_ipv4         16384  1 nf_conntrack
libcrc32c              16384  4 nf_conntrack,nf_nat,xfs,ip_vs
crc32c_intel           24576  15
br_netfilter           24576  0
bridge                192512  1 br_netfilter
stp                    16384  1 bridge
llc                    16384  2 bridge,stp

[username@devkmaster1a ~]$ ifconfig tunl0
tunl0: flags=193<UP,RUNNING,NOARP>  mtu 1440
        inet 10.233.103.0  netmask 255.255.255.255
        tunnel   txqueuelen 1000  (IPIP Tunnel)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8592  bytes 515205 (503.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0


NAME                 TYPE       CLUSTER-IP     EXTERNAL-IP   PORT(S)          AGE
netchecker-service   NodePort   10.233.54.37   <none>        8081:31081/TCP   117s
NAME                 ENDPOINTS          AGE
netchecker-service   10.233.93.6:8081   2m30s
[username@devkmaster1a ~]$ curl devkmaster1a:31081/api/v1/connectivity_check 
curl: (7) Failed to connect to devkmaster1a port 31081: Connection timed out

[username@devkmaster1a ~]$ kubectl run dnsutils-${RANDOM} --rm -i -t --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # 
/ # cat /etc/resolv.conf 
search default.svc.dev-cluster.local svc.dev-cluster.local dev-cluster.local
nameserver 169.254.25.10
options ndots:5

/ # ping -c 2 169.254.25.10
PING 169.254.25.10 (169.254.25.10): 56 data bytes
64 bytes from 169.254.25.10: seq=0 ttl=64 time=0.094 ms
64 bytes from 169.254.25.10: seq=1 ttl=64 time=0.098 ms
--- 169.254.25.10 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.094/0.096/0.098 ms

/ # ping -c 2 google.com
ping: bad address 'google.com'
@karlism karlism added the kind/bug Categorizes issue or PR as related to a bug. label Feb 8, 2021
@antonio-guillen

I think your problem is that you have iptables enabled but no rules configured.

Please remember that Kubespray does not configure the firewall for you, so you have to do it by hand or just disable it.

Try:
systemctl stop firewalld


champtar commented Feb 9, 2021

I know that since 8.3, if you are using Mellanox cards, you need the very latest firmware (released in January), otherwise the kernel thinks the card supports IPIP offload even when it doesn't. Maybe there is a similar issue with VMware since 8.3.

  • tcpdump the IPIP traffic and have a look at the checksums
  • diff ethtool -k intf between 8.2 and 8.3

You can also:
Try switching to IPIP CrossSubnet

calicoctl.sh patch ipPool default-pool -p '{"spec":{"ipipMode": "CrossSubnet", "vxlanMode": "Never"}}'

or VXLAN CrossSubnet

calicoctl.sh patch ipPool default-pool -p '{"spec":{"ipipMode": "Never", "vxlanMode": "CrossSubnet"}}'

or VXLAN

calicoctl.sh patch ipPool default-pool -p '{"spec":{"ipipMode": "Never", "vxlanMode": "Always"}}'
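The first suggestion above (inspecting the IPIP traffic with tcpdump) could look like the following sketch; the interface name ens192 is taken from later comments in this thread, and the sample annotation is illustrative only. Note that packets captured on the *sending* host often show "incorrect" checksums simply because offload fills them in after the capture point, so the capture is most meaningful on the receiving host.

```shell
# Capture IPIP (IP protocol 4) traffic; with -v, tcpdump verifies
# checksums and annotates broken ones as "(incorrect -> 0x....)".
#   tcpdump -ni ens192 -v ip proto 4

# Small helper to count such annotations in a capture's text output:
count_bad_cksums() { grep -c 'incorrect'; }

# usage: tcpdump -ni ens192 -v ip proto 4 2>/dev/null | count_bad_cksums
```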


karlism commented Feb 9, 2021

Thank you for your suggestions, @antonio-guillen and @champtar!

As for firewalld, it is installed on the systems and disabled:

$ systemctl status firewalld
● firewalld.service - firewalld - dynamic firewall daemon
   Loaded: loaded (/usr/lib/systemd/system/firewalld.service; disabled; vendor preset: enabled)
   Active: inactive (dead)
     Docs: man:firewalld(1)

I've also verified, by running nft list ruleset, that kube-proxy and Calico are setting firewall rules properly, and the rulesets on 8.2 and 8.3 look very similar.

As for the suggestions from @champtar, I will try them out a bit later today.


karlism commented Feb 9, 2021

@champtar, the network adapters in question are ESXi vmxnet3. The ethtool diff is as follows:

--- /tmp/ens82	Tue Feb  9 14:12:05 2021
+++ /tmp/ens83	Tue Feb  9 14:12:21 2021
@@ -1,4 +1,4 @@
-[CentOS 8.2]# ethtool -k ens192
+[CentOS 8.3]# ethtool -k ens192
 Features for ens192:
 rx-checksumming: on
 tx-checksumming: on
@@ -33,8 +33,8 @@
 tx-gre-csum-segmentation: off [fixed]
 tx-ipxip4-segmentation: off [fixed]
 tx-ipxip6-segmentation: off [fixed]
-tx-udp_tnl-segmentation: off [fixed]
-tx-udp_tnl-csum-segmentation: off [fixed]
+tx-udp_tnl-segmentation: on
+tx-udp_tnl-csum-segmentation: on
 tx-gso-partial: off [fixed]
 tx-sctp-segmentation: off [fixed]
 tx-esp-segmentation: off [fixed]

I will check whether disabling UDP tunnel TX segmentation helps and report back here.

Update: after running the following commands, I see the RX packet count increasing on the tunl0 interfaces:

ansible -b -a 'ethtool -K ens192 tx-udp_tnl-csum-segmentation off' 'devknode1?,devkmaster1?'
ansible -b -a 'ethtool -K ens192 tx-udp_tnl-segmentation off' 'devknode1?,devkmaster1?'

Update 2: checked DNS, the Prometheus deployments and ingresses; everything is now working flawlessly.
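To confirm the workaround takes effect, one hedged check is to watch the tunl0 RX counter move. This helper just parses `ip -s link show` output, where the line after "RX:" holds bytes, then packets:

```shell
# Print the RX packet count from `ip -s link show <dev>` output on stdin.
rx_packets() {
  awk '/RX:/ { getline; print $2; exit }'
}

# usage (interface and sleep interval are examples):
#   before=$(ip -s link show tunl0 | rx_packets)
#   sleep 10
#   after=$(ip -s link show tunl0 | rx_packets)
#   [ "$after" -gt "$before" ] && echo "tunl0 is receiving again"
```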


champtar commented Feb 9, 2021

Can you show

ethtool -i ens192
uname -a

or, even better, test with CentOS 8 Stream and report a bug upstream?


karlism commented Feb 9, 2021

Full ethtool output below; sure, I can also test it with CentOS 8 Stream:

[CentOS 8.2]# ethtool -k ens192
Features for ens192:
rx-checksumming: on
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
	tx-tcp-ecn-segmentation: off [fixed]
	tx-tcp-mangleid-segmentation: off
	tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
tx-udp_tnl-csum-segmentation: off [fixed]
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tls-hw-rx-offload: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]
[CentOS 8.3]# ethtool -k ens192
Features for ens192:
rx-checksumming: on
tx-checksumming: on
	tx-checksum-ipv4: off [fixed]
	tx-checksum-ip-generic: on
	tx-checksum-ipv6: off [fixed]
	tx-checksum-fcoe-crc: off [fixed]
	tx-checksum-sctp: off [fixed]
scatter-gather: on
	tx-scatter-gather: on
	tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: on
	tx-tcp-segmentation: on
	tx-tcp-ecn-segmentation: off [fixed]
	tx-tcp-mangleid-segmentation: off
	tx-tcp6-segmentation: on
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off
rx-vlan-offload: on
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: on
highdma: on
rx-vlan-filter: on [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-gre-csum-segmentation: off [fixed]
tx-ipxip4-segmentation: off [fixed]
tx-ipxip6-segmentation: off [fixed]
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on
tx-gso-partial: off [fixed]
tx-sctp-segmentation: off [fixed]
tx-esp-segmentation: off [fixed]
tx-udp-segmentation: off [fixed]
tls-hw-rx-offload: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
l2-fwd-offload: off [fixed]
hw-tc-offload: off [fixed]
esp-hw-offload: off [fixed]
esp-tx-csum-hw-offload: off [fixed]
rx-udp_tunnel-port-offload: off [fixed]
tls-hw-tx-offload: off [fixed]
rx-gro-hw: off [fixed]
tls-hw-record: off [fixed]


champtar commented Feb 9, 2021

ethtool -i (not -k) to see the driver name and version


karlism commented Feb 9, 2021

CentOS 8.2:

$ ethtool -i ens192
driver: vmxnet3
version: 1.4.17.0-k-NAPI
firmware-version: 
expansion-rom-version: 
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

$ uname -a
Linux labkmaster1a 4.18.0-193.28.1.el8_2.x86_64 #1 SMP Thu Oct 22 00:20:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

8.3:

$ ethtool -i ens192
driver: vmxnet3
version: 1.5.0.0-k-NAPI
firmware-version: 
expansion-rom-version: 
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

$ uname -a
Linux devkmaster1a 4.18.0-240.10.1.el8_3.x86_64 #1 SMP Mon Jan 18 17:05:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux

8-stream:

$ ethtool -i ens192
driver: vmxnet3
version: 1.5.0.0-k-NAPI
firmware-version: 
expansion-rom-version: 
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

$  uname -a
Linux devkmaster1a 4.18.0-240.10.1.el8_3.x86_64 #1 SMP Mon Jan 18 17:05:51 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux


karlism commented Feb 9, 2021

Same issue on CentOS Stream:

$ cat /etc/centos-release 
CentOS Stream release 8

$ ethtool -k ens192 | grep tnl
tx-udp_tnl-segmentation: on
tx-udp_tnl-csum-segmentation: on


champtar commented Feb 9, 2021

Can you actually test whether it's broken? It's likely, but just having tx-udp_tnl-*: on is not enough to be sure it's broken ;)


karlism commented Feb 9, 2021

Unfortunately it's still broken; the behavior is exactly the same as on 8.3. The Kubespray deployment appears to be successful, but all kubectl commands are very slow and things like DNS are not working:

$ time kubectl get nodes
NAME              STATUS   ROLES                  AGE   VERSION
devkmaster1a   Ready    control-plane,master   56m   v1.20.2
devkmaster1b   Ready    control-plane,master   56m   v1.20.2
devknode1a     Ready    <none>                 54m   v1.20.2
devknode1b     Ready    <none>                 54m   v1.20.2
devknode1c     Ready    <none>                 54m   v1.20.2

real	0m15.189s
user	0m0.115s
sys	0m0.020s

$ kubectl run dnsutils-${RANDOM} --rm -i -t --image=gcr.io/kubernetes-e2e-test-images/dnsutils:1.3 -- /bin/sh
If you don't see a command prompt, try pressing enter.
/ # cat /etc/resolv.conf
search default.svc.dev-cluster.local svc.dev-cluster.local dev-cluster.local
nameserver 169.254.25.10
options ndots:5

/ # ping -c 2 169.254.25.10
PING 169.254.25.10 (169.254.25.10): 56 data bytes
64 bytes from 169.254.25.10: seq=0 ttl=64 time=0.137 ms
64 bytes from 169.254.25.10: seq=1 ttl=64 time=0.123 ms
--- 169.254.25.10 ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0.123/0.130/0.137 ms

/ # ping -c 2 google.com
ping: bad address 'google.com'

Disabling UDP tunnel TX segmentation resolves the issue on CentOS Stream too.

@turlen930

@champtar, network adapters in question are ESXi vmxnet 3. As for the ethtool diff, it is following:

ansible -b -a 'ethtool -K ens192 tx-udp_tnl-csum-segmentation off' 'devknode1?,devkmaster1?'
ansible -b -a 'ethtool -K ens192 tx-udp_tnl-segmentation off' 'devknode1?,devkmaster1?'

I have the same issue on CentOS 8.3. Disabling tx-udp_tnl-csum-segmentation and tx-udp_tnl-segmentation resolved the problem.


karlism commented Feb 10, 2021

For anyone wondering how to make these settings persist across reboots, add them to the interface config:

# cat /etc/sysconfig/network-scripts/ifcfg-ens192 | grep ETHTOOL
ETHTOOL_OPTS="-K ens192 tx-udp_tnl-csum-segmentation off; -K ens192 tx-udp_tnl-segmentation off"
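On NetworkManager-managed hosts, an alternative sketch (assuming the standard dispatcher mechanism; the script path and interface name are examples, not from this thread) is a dispatcher script that reapplies the settings whenever the link comes up, e.g. saved as /etc/NetworkManager/dispatcher.d/50-disable-udp-tnl-offload:

```shell
#!/bin/bash
# NetworkManager passes the interface name as $1 and the action as $2.
apply_offload_fix() {
  [ "$2" = "up" ] || return 0          # only act when the link comes up
  [ "$1" = "ens192" ] || return 0      # example interface name; adjust
  ethtool -K "$1" tx-udp_tnl-csum-segmentation off
  ethtool -K "$1" tx-udp_tnl-segmentation off
}
apply_offload_fix "$1" "$2"
```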

@champtar
Contributor

@champtar, network adapters in question are ESXi vmxnet 3. As for the ethtool diff, it is following:

ansible -b -a 'ethtool -K ens192 tx-udp_tnl-csum-segmentation off' 'devknode1?,devkmaster1?'
ansible -b -a 'ethtool -K ens192 tx-udp_tnl-segmentation off' 'devknode1?,devkmaster1?'

I have the same issue on CentOS 8.3. Disabling tx-udp_tnl-csum-segmentation and tx-udp_tnl-segmentation resolved the problem.

Can you also show ethtool -i intf ?


turlen930 commented Feb 10, 2021

@champtar, network adapters in question are ESXi vmxnet 3. As for the ethtool diff, it is following:

Can you also show ethtool -i intf ?

ethtool -i ens192
driver: vmxnet3
version: 1.5.0.0-k-NAPI
firmware-version: 
expansion-rom-version: 
bus-info: 0000:0b:00.0
supports-statistics: yes
supports-test: no
supports-eeprom-access: no
supports-register-dump: yes
supports-priv-flags: no

@champtar
Contributor

Am I overlooking something?

Is firewalld disabled? Is there no filtering between the hosts?


lz006 commented Feb 12, 2021

@champtar indeed, Calico requires port 179/tcp to be open, which I had missed. Although Calico reached the ready state after opening that port, other pods (like CoreDNS) still reported that they could not reach the API server's cluster IP (10.233.0.1 -> no route to host).

Unfortunately I cannot say whether this is still related to the Calico compatibility issue here, but to me it seems very likely...
Basically, opening all ports listed by "netstat -ntlpu" should be sufficient, right?

(However, I went with flannel as a temporary workaround, as it seems to work.)
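Regarding the "open every listening port" idea above, a hedged sketch that turns `netstat -ntlpu` output into firewall-cmd invocations (review the generated list before running it; the only port this thread specifically calls out is Calico's BGP port 179/tcp):

```shell
# Emit one firewall-cmd per listening socket found on stdin.
ports_to_firewall_cmds() {
  awk '$1 ~ /^(tcp|udp)/ {
    n = split($4, a, ":")                 # Local Address column, port is the last field
    proto = ($1 ~ /^tcp/) ? "tcp" : "udp"
    printf "firewall-cmd --permanent --add-port=%s/%s\n", a[n], proto
  }' | sort -u
}

# usage: netstat -ntlpu | ports_to_firewall_cmds   # then: firewall-cmd --reload
```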

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 13, 2021

mohansitaram commented Jun 11, 2021

I was hitting exactly this issue in my environment and confirmed that running the ethtool commands fixed it:

ethtool -K ens192 tx-udp_tnl-csum-segmentation off
ethtool -K ens192 tx-udp_tnl-segmentation off

I am running RHEL 8.3 with K8s 1.17.6 deployed using Kubespray. I had to set the Calico iptables backend to NFT in Kubespray, as it defaults to "legacy" mode. The Calico tunneling mode used is IPIP.

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-contributor-experience at kubernetes/community.
/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 11, 2021
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closing this issue.

In response to the triage comment above.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@turlen930

Has the problem been solved?
