Calico in eBPF mode has bug and should be upgraded to 3.27.3 for Kops 1.28.5 and below #16589

Closed
zied-jt opened this issue May 24, 2024 · 5 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

zied-jt commented May 24, 2024

/kind bug

1. What kops version are you running? The command kops version, will display
this information.

Client version: 1.28.5 (git-v1.28.5)

2. What Kubernetes version are you running? kubectl version will print the
version if a cluster is running or provide the Kubernetes version specified as
a kops flag.

v1.28.10

3. What cloud provider are you using?
AWS

4. What commands did you run? What is the simplest way to reproduce this issue?

  1. Migrate from kube-proxy to Calico with eBPF

5. What happened after the commands executed?

On a fresh node, no traffic is routed to Services of type LoadBalancer (such as an ingress controller) with externalTrafficPolicy=Local.

By default, kops 1.28.5 ships Calico 3.25.2. This version has the bug described in projectcalico/calico#8112, which was fixed in projectcalico/calico#8313 and is available in Calico 3.27.3.

For our specific configuration, the Calico issue reports that:

The kube-proxy frontend that we use in our kubeproxy does not expect to be shut down in any way other than a hard stop of the process, whereas we "restart" the kubeproxy when the host IP changes, as it was an easy way to reconcile the NAT tables. However, the web servers that handle the health checks don't shut down, so we need to be more careful about how we handle that without control of the k8s part of the code.
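The failure mode described in that quote can be reproduced in miniature: while a listener that was never shut down still holds a port, a second attempt to bind the same port fails with the "address already in use" error seen in the calico-node logs. The following is only an illustrative Python sketch, not Calico or kube-proxy code. The function names are hypothetical, and an OS-assigned loopback port stands in for the health check node port.

```python
import errno
import socket


def start_listener(port: int = 0) -> socket.socket:
    """Bind and listen on 127.0.0.1:port (0 lets the OS pick a free port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        sock.bind(("127.0.0.1", port))
        sock.listen()
    except OSError:
        sock.close()
        raise
    return sock


def try_restart(port: int) -> str:
    """Simulate a 'restarted' proxy trying to bind the health check port again."""
    try:
        start_listener(port).close()
        return "ok"
    except OSError as exc:
        if exc.errno == errno.EADDRINUSE:
            return "address already in use"
        raise


# The "old" health check server is still running and holds the port.
old = start_listener()
port = old.getsockname()[1]
while_old_running = try_restart(port)

# Once the old server is actually shut down, the port can be bound again.
old.close()
after_old_closed = try_restart(port)

print(while_old_running, "/", after_old_closed)
```

The first attempt is expected to fail and the second to succeed, which mirrors why a restart that leaves the old health check servers running cannot serve the health check port again.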

Steps to reproduce (copied from the same Calico bug report):

1. Run a Kubernetes cluster with the Calico CNI and the eBPF dataplane.
2. Create a Kubernetes Service of type LoadBalancer with externalTrafficPolicy: Local.
3. Reboot the node where the endpoints of the Service are located.
4. Check the calico-node logs and curl the Service's health check port (healthCheckNodePort) on this node. The logs show an error like:

err="listen tcp :30904: bind: address already in use" node="i-xxxxxxxxxxxxxxxxxx" service="nginx-controllers/nginx-ingress-controller"
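Step 4 can also be scripted. The sketch below is a hypothetical helper (not part of kops or Calico) that probes a node's health check port over HTTP and reports the status code, or None when nothing answers. The `/healthz` path is an assumption based on the endpoint that kube-proxy's service health check server normally exposes, so it is left as a parameter.

```python
import urllib.error
import urllib.request
from typing import Optional


def probe_health_port(
    host: str, port: int, path: str = "/healthz", timeout: float = 2.0
) -> Optional[int]:
    """Return the HTTP status code from http://host:port/path, or None if unreachable."""
    url = f"http://{host}:{port}{path}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as exc:
        # A non-2xx answer (for example 503) still proves a server is listening.
        return exc.code
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or reset: nothing usable is listening.
        return None
```

For example, `probe_health_port("10.0.0.12", 30904)` would check the port from the log line above, where `10.0.0.12` is a placeholder node IP. A result of None on a node that hosts endpoints of the Service would be consistent with the behavior reported here.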

6. What did you expect to happen?
A Calico version that works correctly in eBPF mode.

7. Please provide your cluster manifest. Execute
kops get --name my.example.com -o yaml to display your cluster manifest.
You may want to remove your cluster name and other sensitive information.

networking:
  calico:
    bpfEnabled: true
    awsSrcDstCheck: Disable
    encapsulationMode: vxlan
kubeProxy:
  enabled: false

8. Please run the commands with most verbose logging by adding the -v 10 flag.
Paste the logs into this report, or in a gist and provide the gist link here.

err="listen tcp :30904: bind: address already in use" node="i-xxxxxxxxxxxxxxxxxx" service="nginx-controllers/nginx-ingress-controller"

9. Anything else do we need to know?
These kops PRs already contain the code for the Calico upgrade and could help fix the issue if backported to kops <= 1.28.5:

PS: This bug report was written with the help of @rasta-rocket, @rsicart, @sgendrot-jobteaser, @yelaissaoui

k8s-ci-robot added the kind/bug label May 24, 2024

rsicart commented Jun 12, 2024

Hi there!

Any news on this issue?

Thanks in advance!

hakman (Member) commented Jun 13, 2024

Resolved via #16613.
/close

k8s-ci-robot (Contributor) commented

@hakman: Closing this issue.

rsicart commented Jul 2, 2024

Hello!

Thanks a lot for your work and responsiveness!

Do you know if a release that includes the cherry-pick mentioned above is scheduled soon?

Thanks again, great job!

hakman (Member) commented Jul 5, 2024

@rsicart there is a release planned soon
