
Upgrade to v0.3.2 all pods crashed #172

Closed
leonanu opened this issue Apr 1, 2023 · 6 comments
leonanu commented Apr 1, 2023

OS: Debian 11.6
cri-dockerd version: 0.3.2
docker-ce version: 23.0.2-1debian.11bullseye
kubernetes version: v1.26.3

After upgrading to v0.3.2, all pods failed to start. After rolling back to v0.3.1, everything works again.
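For reference, a rollback like the one described above is typically done by swapping the binary and restarting the service. A minimal sketch, assuming the standard cri-dockerd release-tarball layout and systemd unit names (paths, unit names, and the release URL are assumptions, not taken from this report):

```shell
# Sketch: roll cri-dockerd back to v0.3.1 (assumed paths/URLs; adjust for your install)
systemctl stop cri-docker.service cri-docker.socket

# Fetch the v0.3.1 amd64 tarball from the project's releases page (URL assumed)
curl -fsSLO https://github.com/Mirantis/cri-dockerd/releases/download/v0.3.1/cri-dockerd-0.3.1.amd64.tgz
tar xzf cri-dockerd-0.3.1.amd64.tgz

# Replace the binary (install path assumed) and restart the service
install -m 0755 cri-dockerd/cri-dockerd /usr/local/bin/cri-dockerd
systemctl start cri-docker.socket cri-docker.service

cri-dockerd --version
```

After the restart, the kubelet should reconnect to the same CRI socket without further changes.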

leonanu (Author) commented Apr 1, 2023

"eth0" netns="/proc/207576/ns/net"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.803 [INFO][208122] k8s.go 583: Releasing IP address(es) ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.803 [INFO][208122] utils.go 195: Calico CNI releasing IP address ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.854 [INFO][208174] ipam_plugin.go 416: Releasing address using handleID ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" HandleID="k8s-pod-network.fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" Workload="k8s-k8s-metrics--server--6f6cdbf67d--9d8sf-eth0"
Apr 01 14:23:54 k8s cri-dockerd[1327]: time="2023-04-01T14:23:54+08:00" level=info msg="About to acquire host-wide IPAM lock." source="ipam_plugin.go:357"
Apr 01 14:23:54 k8s cri-dockerd[1327]: time="2023-04-01T14:23:54+08:00" level=info msg="Acquired host-wide IPAM lock." source="ipam_plugin.go:372"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.901 [INFO][208174] ipam_plugin.go 435: Released address using handleID ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" HandleID="k8s-pod-network.fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" Workload="k8s-k8s-metrics--server--6f6cdbf67d--9d8sf-eth0"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.901 [INFO][208174] ipam_plugin.go 444: Releasing address using workloadID ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" HandleID="k8s-pod-network.fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676" Workload="k8s-k8s-metrics--server--6f6cdbf67d--9d8sf-eth0"
Apr 01 14:23:54 k8s cri-dockerd[1327]: time="2023-04-01T14:23:54+08:00" level=info msg="Released host-wide IPAM lock." source="ipam_plugin.go:378"
Apr 01 14:23:54 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.916 [INFO][208122] k8s.go 589: Teardown processing complete. ContainerID="fccae8c47cc8a73fbe1dfa6724e627db674859ea466e5d047637265faea0d676"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.789 [INFO][208145] k8s.go 576: Cleaning up netns ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.790 [INFO][208145] dataplane_linux.go 524: Deleting workload's device in netns. ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" iface="eth0" netns="/proc/207356/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.790 [INFO][208145] dataplane_linux.go 535: Entered netns, deleting veth. ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" iface="eth0" netns="/proc/207356/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.827 [INFO][208145] dataplane_linux.go 569: Deleted device in netns. ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" after=37.326303ms iface="eth0" netns="/proc/207356/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.827 [INFO][208145] k8s.go 583: Releasing IP address(es) ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.827 [INFO][208145] utils.go 195: Calico CNI releasing IP address ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.933 [INFO][208212] ipam_plugin.go 416: Releasing address using handleID ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" HandleID="k8s-pod-network.da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" Workload="k8s-k8s-openvpn--7f6d97f685--mltks-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:54+08:00" level=info msg="About to acquire host-wide IPAM lock." source="ipam_plugin.go:357"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:54+08:00" level=info msg="Acquired host-wide IPAM lock." source="ipam_plugin.go:372"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.081 [INFO][208212] ipam_plugin.go 435: Released address using handleID ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" HandleID="k8s-pod-network.da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" Workload="k8s-k8s-openvpn--7f6d97f685--mltks-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.081 [INFO][208212] ipam_plugin.go 444: Releasing address using workloadID ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" HandleID="k8s-pod-network.da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d" Workload="k8s-k8s-openvpn--7f6d97f685--mltks-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:55+08:00" level=info msg="Released host-wide IPAM lock." source="ipam_plugin.go:378"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.086 [INFO][208145] k8s.go 589: Teardown processing complete. ContainerID="da14bf236ab0a356048a8c753769c0985dd9560cf8ed43dac59e5003aa59ef3d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.884 [INFO][208201] k8s.go 576: Cleaning up netns ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.884 [INFO][208201] dataplane_linux.go 524: Deleting workload's device in netns. ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" iface="eth0" netns="/proc/207343/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.884 [INFO][208201] dataplane_linux.go 535: Entered netns, deleting veth. ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" iface="eth0" netns="/proc/207343/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.915 [INFO][208201] dataplane_linux.go 569: Deleted device in netns. ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" after=30.884486ms iface="eth0" netns="/proc/207343/ns/net"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.915 [INFO][208201] k8s.go 583: Releasing IP address(es) ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:54.915 [INFO][208201] utils.go 195: Calico CNI releasing IP address ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.041 [INFO][208278] ipam_plugin.go 416: Releasing address using handleID ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" HandleID="k8s-pod-network.aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" Workload="k8s-k8s-calico--kube--controllers--5857bf8d58--b5tmh-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:55+08:00" level=info msg="About to acquire host-wide IPAM lock." source="ipam_plugin.go:357"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:55+08:00" level=info msg="Acquired host-wide IPAM lock." source="ipam_plugin.go:372"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.126 [INFO][208278] ipam_plugin.go 435: Released address using handleID ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" HandleID="k8s-pod-network.aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" Workload="k8s-k8s-calico--kube--controllers--5857bf8d58--b5tmh-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.126 [INFO][208278] ipam_plugin.go 444: Releasing address using workloadID ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" HandleID="k8s-pod-network.aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d" Workload="k8s-k8s-calico--kube--controllers--5857bf8d58--b5tmh-eth0"
Apr 01 14:23:55 k8s cri-dockerd[1327]: time="2023-04-01T14:23:55+08:00" level=info msg="Released host-wide IPAM lock." source="ipam_plugin.go:378"
Apr 01 14:23:55 k8s cri-dockerd[1327]: 2023-04-01 14:23:55.130 [INFO][208201] k8s.go 589: Teardown processing complete. ContainerID="aeaf236101c058117a9db06b2901e53b5f8d846dff034211ac18a942ccb5547d"

evol262 (Contributor) commented Apr 2, 2023

The release has been reverted, and this is likely related to the ready-state handling in #168. Can you please attach the full log?

leonanu (Author) commented Apr 2, 2023

Sorry, since the release was reverted I can't download v0.3.2 anymore. If I could, I'd be glad to provide full logs from dockerd, cri-dockerd, kubelet, etc. Thank you.

evol262 (Contributor) commented Apr 2, 2023

You don't have logs from earlier? Or is it possible to build from source?

Even a better or more complete description of the reproduction environment would help (which CNI? these logs look network-related; what workloads?).
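Building cri-dockerd at a given tag needs only git and a recent Go toolchain. A minimal sketch, assuming the project's standard Go module layout (the exact build target and Go version requirement are assumptions; check the repository's README):

```shell
# Sketch: build cri-dockerd v0.3.2 from source (requires git and a recent Go toolchain)
git clone https://github.com/Mirantis/cri-dockerd.git
cd cri-dockerd

# The tag under discussion; note it may have been removed along with the reverted release
git checkout v0.3.2

# Build a local binary from the module root (layout assumed)
go build -o cri-dockerd .

./cri-dockerd --version
```

If the v0.3.2 tag is gone, checking out the commit just before the revert on the main branch would be the closest substitute.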

leonanu (Author) commented Apr 3, 2023

Could you please provide the v0.3.2 amd64 binary tarball again? I'll upgrade and reproduce the error logs. I didn't build from source.

In my environment the CNI is Calico v3.25.1. Node description:

Name: k8s
Roles:
Labels: beta.kubernetes.io/arch=amd64
beta.kubernetes.io/os=linux
kubernetes.io/arch=amd64
kubernetes.io/hostname=k8s
kubernetes.io/os=linux
Annotations: node.alpha.kubernetes.io/ttl: 0
projectcalico.org/IPv4Address: 172.31.31.3/24
volumes.kubernetes.io/controller-managed-attach-detach: true
CreationTimestamp: Thu, 02 Dec 2021 17:44:39 +0800
Taints:
Unschedulable: false
Lease:
HolderIdentity: k8s
AcquireTime:
RenewTime: Mon, 03 Apr 2023 21:15:15 +0800
Conditions:
Type Status LastHeartbeatTime LastTransitionTime Reason Message
NetworkUnavailable False Mon, 03 Apr 2023 21:07:04 +0800 Mon, 03 Apr 2023 21:07:04 +0800 CalicoIsUp Calico is running on this node
MemoryPressure False Mon, 03 Apr 2023 21:15:14 +0800 Wed, 25 Jan 2023 01:50:50 +0800 KubeletHasSufficientMemory kubelet has sufficient memory available
DiskPressure False Mon, 03 Apr 2023 21:15:14 +0800 Wed, 25 Jan 2023 01:50:50 +0800 KubeletHasNoDiskPressure kubelet has no disk pressure
PIDPressure False Mon, 03 Apr 2023 21:15:14 +0800 Wed, 25 Jan 2023 01:50:50 +0800 KubeletHasSufficientPID kubelet has sufficient PID available
Ready True Mon, 03 Apr 2023 21:15:14 +0800 Mon, 03 Apr 2023 21:07:04 +0800 KubeletReady kubelet is posting ready status. AppArmor enabled
Addresses:
InternalIP: 172.31.31.3
Hostname: k8s
Capacity:
cpu: 8
ephemeral-storage: 149368536Ki
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16391712Ki
pods: 220
Allocatable:
cpu: 8
ephemeral-storage: 137658042550
hugepages-1Gi: 0
hugepages-2Mi: 0
memory: 16289312Ki
pods: 220
System Info:
Machine ID: cf1b57f3df084de4964589b3de3cc86e
System UUID: 7f3b4d56-233e-56bc-c1f0-4b48ad3802d4
Boot ID: f494703e-fa5f-4a94-bf9e-be441a9f34cd
Kernel Version: 5.10.0-21-amd64
OS Image: Debian GNU/Linux 11 (bullseye)
Operating System: linux
Architecture: amd64
Container Runtime Version: docker://23.0.2
Kubelet Version: v1.26.3
Kube-Proxy Version: v1.26.3
Non-terminated Pods: (10 in total)
Namespace Name CPU Requests CPU Limits Memory Requests Memory Limits Age
default nfs-client-provisioner-75f8cc789d-629zd 0 (0%) 0 (0%) 0 (0%) 0 (0%) 20d
default openvpn-7f6d97f685-mltks 0 (0%) 0 (0%) 0 (0%) 0 (0%) 78d
default wireguard-7fc7c689d7-d7gpx 0 (0%) 0 (0%) 0 (0%) 0 (0%) 72d
ingress-nginx ingress-nginx-controller-5q6lr 100m (1%) 0 (0%) 90Mi (0%) 0 (0%) 9d
kube-system calico-kube-controllers-5857bf8d58-b5tmh 0 (0%) 0 (0%) 0 (0%) 0 (0%) 3d10h
kube-system calico-node-zn7xm 250m (3%) 0 (0%) 0 (0%) 0 (0%) 3d10h
kube-system coredns-coredns-74d6c4dc4-rpwcj 100m (1%) 100m (1%) 128Mi (0%) 128Mi (0%) 3d9h
kube-system metrics-server-6f6cdbf67d-9d8sf 100m (1%) 0 (0%) 200Mi (1%) 0 (0%) 12d
kubernetes-dashboard dashboard-metrics-scraper-64bcc67c9c-dh85x 0 (0%) 0 (0%) 0 (0%) 0 (0%) 198d
kubernetes-dashboard kubernetes-dashboard-5d8c4c67b9-z4gc4 0 (0%) 0 (0%) 0 (0%) 0 (0%) 198d
Allocated resources:
(Total limits may be over 100 percent, i.e., overcommitted.)
Resource Requests Limits
cpu 550m (6%) 100m (1%)
memory 418Mi (2%) 128Mi (0%)
ephemeral-storage 0 (0%) 0 (0%)
hugepages-1Gi 0 (0%) 0 (0%)
hugepages-2Mi 0 (0%) 0 (0%)

nwneisen (Collaborator) commented Oct 9, 2023

Closing this issue as it was related to a reverted release. If the issue exists in the current release then we can take a look.

nwneisen closed this as completed Oct 9, 2023