Calico node networking errors #1606
Comments
Hitting the same error here:
I think this is related to https://github.com/projectcalico/calico/issues/2191
Hello,
This problem seems to be present with Rancher 2.3.0 and Kubernetes 1.15.4.
I can confirm that it exists in Rancher 2.3.0 and Kubernetes 1.15.4 on RancherOS.
Also seeing the same with Rancher 2.3.0 and Kubernetes 1.15.4 on Ubuntu 16.04 with IPv6 disabled. Fresh install of OS and cluster.
This problem exists in Rancher 2.3.0 and Kubernetes 1.15.4 on Ubuntu 19.04:
2019-10-14 11:06:44.361 [INFO][9] startup.go 256: Early log level set to info
Please see rancher/rancher#23430 (comment) and let me know if it resolves the issue.
@superseb this resolved the health checks, but the int_dataplane errors are still present:
Hi @superseb, I am seeing the same errors in the logs. Applying the CRDs in the other thread fixed some errors, but I still see those pasted by @piwi91 above. I am having a problem with a …
Since upgrading to Rancher v2.3.4 and Kubernetes v1.17.0-rancher1-2 I'm getting Calico errors on some of my nodes—the ones that happen to be virtual machines (Hyper-V). Bare metal ones are fine. Pod:
I can confirm @rbq's error. I am experiencing pretty much the same thing.
Same errors on my cluster.
Any resolution to this? I'm seeing this in one of our test clusters we just upgraded to 1.15.5 using Rancher 2.2.9.
I had this issue as well. I generated an empty config, copied over the new container versions, and that seems to have resolved everything for me.
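A rough, assumption-heavy guess at what such a config refresh could look like with RKE (the commenter's exact steps are asked about below; the flags here come from RKE's CLI and should be verified with rke config --help):

# Print RKE's current default system images, or generate a fresh empty template
# (both flags are assumptions to verify against your RKE version):
rke config --system-images
rke config --empty --name cluster-fresh.yml
# Copy the updated image versions into the system_images section of the existing
# cluster.yml, then reconcile the cluster:
rke up --config cluster.yml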
@imle Could you please provide the exact steps you took?
I just upgraded my cluster from 1.15.5 to 1.15.10, which solved my immediate problems. Afterwards I upgraded Rancher to 2.3.5 and my cluster to 1.17.3. No issues so far.
I was having this issue, and it was due to a combination of Ubuntu, Linux kernel 5.3, and Secure Boot. The newer kernels have lockdown enabled, and it breaks BPF. There is a bug report here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1863234 If you're having this problem, you'll see the errors below in dmesg.
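If you want to check whether kernel lockdown and Secure Boot are active on a node, something like the following should show it (assuming the kernel exposes the lockdown LSM and mokutil is installed):

# Show the current lockdown mode (none / integrity / confidentiality), if the LSM is exposed:
cat /sys/kernel/security/lockdown
# Check whether Secure Boot is enabled:
mokutil --sb-state
# Look for lockdown-related BPF denials in the kernel log:
dmesg | grep -i lockdown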
My current workaround is to disable XDP until the problem @mcmcghee described is fixed:
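One way to turn XDP off in Felix (not necessarily the exact workaround used above) is the FELIX_XDPENABLED setting on the calico-node container. A sketch, assuming a stock RKE canal deployment where the DaemonSet is named canal and the Calico container is named calico-node:

# Disable Felix's XDP acceleration; the pods roll and come back with XDP off.
# DaemonSet and container names are assumptions -- adjust for your deployment:
kubectl -n kube-system set env daemonset/canal -c calico-node FELIX_XDPENABLED=false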
On a sandbox cluster that had this problem I was able to recover by doing the following (just fishing, as nothing else worked). I advise not trying this unless you are quite sure you can live with a failed cluster. But it worked for me.
This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
We are seeing similar issues with Rancher 2.5.0, Kubernetes 1.18.8, and rancher/calico-node:v3.13.4. I tried disabling IPv6 with
but that doesn't seem to help. We are using Fedora CoreOS. The only thing that seems to work after a node restart is to wait for everything to start on the node and then manually restart the canal pod. That seems to restore network connectivity.
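For reference, disabling IPv6 via sysctl and bouncing the canal pod on a single node typically look like the following (the k8s-app=canal label and the node name are assumptions for a stock RKE canal install, not taken from the comment above):

# Disable IPv6 on the node at the sysctl level:
sysctl -w net.ipv6.conf.all.disable_ipv6=1
sysctl -w net.ipv6.conf.default.disable_ipv6=1
# Restart the canal pod on one node after it has finished booting
# (label selector and node name are assumptions; adjust to your cluster):
kubectl -n kube-system delete pod -l k8s-app=canal --field-selector spec.nodeName=<node-name>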
This seems to be this issue: flannel-io/flannel#1321. Adding a file
E.g. with ignition:
For more context:
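Based on the linked flannel issue, the fix amounts to telling NetworkManager not to manage the CNI-created interfaces. A hedged sketch of what such a file and a Butane/FCC config (transpiled to Ignition with fcct/butane) might look like; the file path, interface globs, and spec version are assumptions, not taken from the original comment:

# /etc/NetworkManager/conf.d/flannel.conf -- keep NetworkManager away from CNI interfaces
[keyfile]
unmanaged-devices=interface-name:flannel*;interface-name:cali*

# Equivalent Fedora CoreOS Config that lays the file down via Ignition:
variant: fcos
version: 1.1.0
storage:
  files:
    - path: /etc/NetworkManager/conf.d/flannel.conf
      mode: 0644
      contents:
        inline: |
          [keyfile]
          unmanaged-devices=interface-name:flannel*;interface-name:cali*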
@olivierlemasle Thank you! This appears to solve our issues!
This issue/PR has been automatically marked as stale because it has not had activity (commit/comment/label) for 60 days. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
RKE version: v0.2.8
Docker version: (docker version, docker info preferred)
Operating system and kernel: (cat /etc/os-release, uname -r preferred) CentOS 7.6, kernel 3.10.0-957.1.3.el7.x86_64 and CentOS 7.6, kernel 3.10.0-957.27.2.el7.x86_64
Type/provider of hosts: (VirtualBox/Bare-metal/AWS/GCE/DO) OpenStack
cluster.yml file:
Steps to Reproduce:
Deploy an empty cluster with RKE
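For illustration only, a minimal hypothetical cluster.yml for such a deployment; the node address and SSH user are placeholders, not taken from the report:

# cluster.yml -- minimal single-node example with the default canal network plugin
nodes:
  - address: 203.0.113.10   # placeholder node IP
    user: centos            # placeholder SSH user
    role: [controlplane, etcd, worker]
network:
  plugin: canal

Running rke up --config cluster.yml against a file like this deploys the cluster.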
Results: