v1.26.2 - kube-proxy - Failed to execute iptables-restore - unknown option "--xor-mark" #4295

Closed
MrFishFinger opened this issue Nov 13, 2024 · 9 comments
Labels: status/needs-triage (Pending triage or re-evaluation), type/bug (Something isn't working)


@MrFishFinger

MrFishFinger commented Nov 13, 2024

Image I'm using:
v1.26.2 (linux kernel 5.15.168)

What I expected to happen:
kube-proxy to operate without errors

What actually happened:
kube-proxy repeatedly throws the error:

I1113 16:15:07.908800 1 proxier.go:853] "Syncing iptables rules"
I1113 16:15:07.928773 1 proxier.go:1464] "Reloading service iptables data" numServices=0 numEndpoints=0 numFilterChains=4 numFilterRules=3 numNATChains=4 numNATRules=5
E1113 16:15:07.931291 1 proxier.go:1481] "Failed to execute iptables-restore" err=<
exit status 2: ip6tables-restore v1.8.4 (legacy): unknown option "--xor-mark"
Error occurred at line: 16
Try `ip6tables-restore -h' or 'ip6tables-restore --help' for more information.
>
I1113 16:15:07.931308 1 proxier.go:858] "Sync failed" retryingTime="30s"
I1113 16:15:07.931317 1 proxier.go:820] "SyncProxyRules complete" elapsed="22.67239ms"

How to reproduce the problem:

  1. add a "v1.26.2" bottlerocket node to an ipv4 EKS cluster running K8s 1.24
  2. check "kube-proxy" logs
  3. observe error

NOTE: rolling back to image "v1.26.1" (using linux kernel 5.15.167) fixes the issue.
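To check several nodes for the same symptom, the log signature above can be matched programmatically. This is a minimal sketch, assuming the kube-proxy logs have already been saved as text (for example via `kubectl logs`); the helper name `find_xor_mark_failures` is hypothetical, and the patterns are copied from the log excerpt in this report.

```python
import re

# Patterns copied from the kube-proxy log excerpt in this report.
RESTORE_FAILURE = re.compile(r'"Failed to execute iptables-restore"')
XOR_MARK = re.compile(r'unknown option "--xor-mark"')


def find_xor_mark_failures(log_text: str) -> list[str]:
    """Return log lines reporting the unknown "--xor-mark" option.

    Lines are returned only when the log also contains an
    iptables-restore failure, to avoid matching unrelated output.
    """
    lines = log_text.splitlines()
    if not any(RESTORE_FAILURE.search(line) for line in lines):
        return []
    return [line.strip() for line in lines if XOR_MARK.search(line)]


# Two lines taken from the log excerpt above.
SAMPLE = '''E1113 16:15:07.931291 1 proxier.go:1481] "Failed to execute iptables-restore" err=<
exit status 2: ip6tables-restore v1.8.4 (legacy): unknown option "--xor-mark"
'''

for hit in find_xor_mark_failures(SAMPLE):
    print(hit)
```

An empty result only means this signature was not found in the supplied text; it does not prove the node is healthy.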


details from 1.26.2 node with issue:

[ssm-user@control]$ cat /etc/*rel*
NAME=Bottlerocket
ID=bottlerocket
VERSION="1.26.2 (aws-k8s-1.24)"
...
bash-5.1# uname -a
Linux ip-x-x-x-x.eu-west-1.compute.internal 5.15.168 #1 SMP Fri Nov 1 22:54:32 UTC 2024 x86_64 GNU/Linux

info from 1.26.1 node without issue:

[ssm-user@control]$ cat /etc/*rel*
NAME=Bottlerocket
ID=bottlerocket
VERSION="1.26.1 (aws-k8s-1.24)"
...
bash-5.1# uname -a
Linux ip-x-x-x-x.eu-west-1.compute.internal 5.15.167 #1 SMP Thu Oct 24 18:28:21 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux

@MrFishFinger added the status/needs-triage and type/bug labels Nov 13, 2024
@Sparksssj
Contributor

Thanks for the detailed report. I'm working on reproducing the issue now. Will update once I have more information.

@Sparksssj
Contributor

Sparksssj commented Nov 15, 2024

Update: I have reproduced the error and confirmed the cause. There is an issue with kernel 5.15, which is used by the K8s 1.24 - 1.27 variants, that makes it incompatible with IPv6. We are currently working on a fix and appreciate your patience.

@jpculp
Member

jpculp commented Nov 15, 2024

The Bottlerocket team became aware of an issue impacting the K8s 1.24-1.27 AMIs from versions 1.26.2 and 1.27.0, which run kernel 5.15.*. The issue manifests as all nodes using IPv6 on these variants failing, because the ip6tables commands needed to configure the node are broken. Bottlerocket versions earlier than 1.26.2, as well as Bottlerocket variants for K8s 1.28 and above, are not impacted. The Bottlerocket team is working on releasing the fix. In the meantime, if you are using K8s 1.24-1.27 and need to use IPv6, please use Bottlerocket version 1.26.1.
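The affected combinations described above can be written down as a small lookup, which may help when auditing a fleet. This is only a sketch of the statement in this comment, not an official compatibility matrix; the names `AFFECTED_BOTTLEROCKET`, `AFFECTED_K8S_MINORS`, and `is_affected` are hypothetical.

```python
# Values taken from the comment above; all names here are hypothetical.
AFFECTED_BOTTLEROCKET = {(1, 26, 2), (1, 27, 0)}
AFFECTED_K8S_MINORS = range(24, 28)  # K8s 1.24 through 1.27


def parse_version(text: str) -> tuple[int, ...]:
    """Turn a dotted version such as "1.26.2" into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_affected(bottlerocket_version: str, k8s_version: str) -> bool:
    """True if the Bottlerocket release and K8s variant match the
    combinations named as impacted in this thread."""
    k8s = parse_version(k8s_version)
    return (
        parse_version(bottlerocket_version) in AFFECTED_BOTTLEROCKET
        and k8s[0] == 1
        and k8s[1] in AFFECTED_K8S_MINORS
    )


print(is_affected("1.26.2", "1.24"))  # the combination from the original report
print(is_affected("1.26.1", "1.24"))  # the suggested rollback version
```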

@yeazelm
Contributor

yeazelm commented Nov 15, 2024

I have a fix for this in PR bottlerocket-os/bottlerocket-core-kit#266. Thanks for the report @MrFishFinger!

@yeazelm
Contributor

yeazelm commented Nov 20, 2024

This should be fixed in Bottlerocket 1.27.1 which should be fully released shortly. Here is the tracking issue for 1.27.1: #4303

@yeazelm
Contributor

yeazelm commented Nov 22, 2024

Closing this issue since 1.27.1 is out and fixes this issue.

@yeazelm closed this as completed Nov 22, 2024
@yeazelm self-assigned this Nov 22, 2024
@ginglis13 unpinned this issue Nov 26, 2024
@yushoyamaguchi

yushoyamaguchi commented Jan 12, 2025

@Sparksssj @yeazelm

I'm sorry, I'm not a Bottlerocket user, but the same error occurs in my KinD environment, so please let me ask.
In my KinD environment with host kernel 5.15, the error occurs even when using a Kubernetes version newer than 1.27.

Node docker image : kindest/node:v1.32.0

$ kubectl version 
Client Version: v1.31.4
Kustomize Version: v5.4.2
Server Version: v1.32.0

Could you tell me how to confirm whether this problem is fixed in a non-Bottlerocket K8s environment?
Is there no way other than updating the host kernel version?

@arnaldo2792
Contributor

Hello @yushoyamaguchi, the culprit in our case was kernel 5.15.168, which we were using at the time. To fix this, we carried a patch that was later included in 5.15.170:

https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v5.15.170&id=90baa455aa7e099152898cfa5eb3928d6152da12

So if you are experiencing a similar problem, you have to either backport the patch (similar to what we did to address the problem), or just move to >= 5.15.170.
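As a rough way to check a host against the ">= 5.15.170" guidance, the kernel release string can be compared numerically. This is a minimal sketch with hypothetical names (`FIXED_IN`, `meets_fixed_release`). It only looks at the version number, so it cannot detect a backported patch on an older kernel, and it returns `None` for anything outside the 5.15 series because this thread says nothing about other series. Note also that 5.15.167 predates the fix but worked in the original report.

```python
import platform
import re
from typing import Optional

FIXED_IN = (5, 15, 170)  # release named in the comment above


def meets_fixed_release(release: str) -> Optional[bool]:
    """Compare a kernel release string such as "5.15.168" against 5.15.170.

    Returns True or False for 5.15.x kernels, and None when the string
    cannot be parsed or belongs to another series (unknown from this thread).
    """
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        return None
    version = tuple(int(part) for part in match.groups())
    if version[:2] != FIXED_IN[:2]:
        return None
    return version >= FIXED_IN


# KinD nodes are containers that share the host kernel, so run this on the host.
print(platform.release(), meets_fixed_release(platform.release()))
```

A `False` result on a distribution kernel is not conclusive, since vendors may carry the patch without changing the upstream version number.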

@yushoyamaguchi

@arnaldo2792
I will try.
Thank you.
