[Cilium] Executing nerdctl run in k8 environment is stuck #3783

Open
wzxmt opened this issue Dec 20, 2024 · 8 comments

Comments

wzxmt commented Dec 20, 2024

Description

Running "nerdctl run" on a Kubernetes node hangs, although Kubernetes itself can create pods normally.

Steps to reproduce the issue

1. [root@m1 ~]# nerdctl ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f0571a9094ce quay.io/cilium/hubble-ui-backend@sha256:0e0eed917653441fded4e7cdb096b7be6a3bddded5a2dd10812a27b1fc6ed95b "/usr/bin/backend" 6 minutes ago Up k8s://kube-cilium/hubble-ui-77555d5dcf-pj77v/backend
046ba04231f7 docker.io/wangyanglinux/myapp:v1 "nginx -g daemon off;" 6 minutes ago Up k8s://default/test-z2gms/test
5c6c52541c37 docker.io/wzxmtlw/metrics-server:v0.6.3 "/metrics-server --c…" 6 minutes ago Up k8s://kube-system/metrics-server-5c7b6df7d8-md58r/metrics-server
fcb24a33d77a quay.io/cilium/hubble-relay@sha256:d352d3860707e8d734a0b185ff69e30b3ffd630a7ec06ba6a4402bed64b4456c "hubble-relay serve" 7 minutes ago Up k8s://kube-cilium/hubble-relay-7bc7544857-95dqm/hubble-relay
....

2. [root@m1 ~]# nerdctl run --name test --rm -it busybox:1.28 /bin/sh
The command above hangs.

3. nerdctl run can be executed outside the k8s environment.

Describe the results you received and expected

null

What version of nerdctl are you using?

[root@m1 ~]# nerdctl version
Client:
Version: v2.0.2
OS/Arch: linux/amd64
Git commit: 1220ce7
buildctl:
Version: v0.17.1
GitCommit: 8b1b83ef4947c03062cdcdb40c69989d8fe3fd04

Server:
containerd:
Version: v2.0.1
GitCommit: 88aa2f531d6c2922003cc7929e51daf1c14caa0a
runc:
Version: 1.2.2
GitCommit: v1.2.2-0-g7cb36325

Are you using a variant of nerdctl? (e.g., Rancher Desktop)

None

Host information

[root@m1 ~]# nerdctl info
Client:
Namespace: k8s.io
Debug Mode: false

Server:
Server Version: v2.0.1
Storage Driver: overlayfs
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Log: fluentd journald json-file none syslog
Storage: native overlayfs
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.14.0-427.13.1.el9_4.x86_64
Operating System: Rocky Linux 9.4 (Blue Onyx)
OSType: linux
Architecture: x86_64
CPUs: 4
Total Memory: 3.793GiB
Name: m1
ID: b26f2865-ca8a-49fa-a3a2-ec66adae9813

wzxmt added the kind/unconfirmed-bug-claim label on Dec 20, 2024

Author

wzxmt commented Dec 20, 2024

[root@m1 ~]# kubectl version
Client Version: v1.31.4
Kustomize Version: v5.4.2
Server Version: v1.31.4

Contributor

apostasie commented Dec 20, 2024

@wzxmt I am not sure how to reproduce your problem.

Against a kind cluster, things are working just fine / as expected.

I need more details about your specific deployment.

  • How can I reproduce it from scratch?
  • How did you create your kube cluster exactly?
  • What else is involved here?
  • What are your containerd details?
  • Please re-run the failing/stuck nerdctl command with --debug-full.

Author

wzxmt commented Dec 21, 2024

I deployed Kubernetes from binaries, and I tried again. Running "nerdctl run --name test --rm -it busybox:1.28 /bin/sh" works without hanging when Flannel is the CNI plugin, but it hangs with Cilium. This is how Cilium is deployed:

linux-amd64/helm template cilium cilium/cilium --version 1.15.11 \
  --namespace kube-cilium \
  --set operator.replicas=1 \
  --set k8sServiceHost=apiserver.cluster.local \
  --set k8sServicePort=8443 \
  --set ipv4NativeRoutingCIDR=172.16.0.0/16 \
  --set ipam.operator.clusterPoolIPv4PodCIDRList=172.16.0.0/16 \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.ui.service.type=NodePort \
  --set hubble.ui.service.nodePort=31235 \
  --set routing-mode=native \
  --set kubeProxyReplacement=strict \
  --set bpf.masquerade=true \
  --set bandwidthManager.enabled=true >>${HOST_PATH}/roles/components/templates/cilium.yaml

[root@m1 ~]# containerd -v
containerd github.com/containerd/containerd/v2 v2.0.1 88aa2f531d6c2922003cc7929e51daf1c14caa0a

[root@m1 ~]# nerdctl info
Client:
Namespace: k8s.io
Debug Mode: false

Server:
Server Version: v2.0.1
Storage Driver: overlayfs
Logging Driver: json-file
Cgroup Driver: systemd
Cgroup Version: 2
Plugins:
Log: fluentd journald json-file none syslog
Storage: native overlayfs
Security Options:
seccomp
Profile: builtin
cgroupns
Kernel Version: 5.14.0-427.13.1.el9_4.x86_64
Operating System: Rocky Linux 9.4 (Blue Onyx)
OSType: linux
Architecture: x86_64
CPUs: 6
Total Memory: 5.755GiB
Name: m1
ID: 97bb3274-41ea-4a43-a74b-7dc0b86e3fa9

[root@m1 ~]# nerdctl run --name test --rm -it --debug-full busybox:1.28 /bin/sh
DEBU[0000] verifying process skipped
DEBU[0000] generated log driver: binary:///apps/containerd/bin/nerdctl?_NERDCTL_INTERNAL_LOGGING=%2Fvar%2Flib%2Fnerdctl%2F1935db59

Contributor

apostasie commented Dec 21, 2024

Thanks @wzxmt

What happens with nerdctl network ls, or when starting your container with different networking options? (eg: --net host)

@AkihiroSuda is anyone around familiar with Kube + eBPF/Cilium who could help debug this?

Author

wzxmt commented Dec 22, 2024

nerdctl network ls

I later tried Calico and it worked fine. Running "nerdctl network ls" still hangs with Cilium, but it runs normally with the other CNI plugins.

flannel

[root@m2 ~]# nerdctl network ls
NETWORK ID      NAME      FILE
                cbr0      /etc/cni/net.d/10-flannel.conflist
17f29b073143    bridge    /etc/cni/net.d/nerdctl-bridge.conflist
                host
                none

calico

[root@m3 ~]# nerdctl network ls
NETWORK ID      NAME               FILE
                k8s-pod-network    /etc/cni/net.d/10-calico.conflist
17f29b073143    bridge             /etc/cni/net.d/nerdctl-bridge.conflist
                host
                none

Cilium (hangs)

[root@m1 ~]# nerdctl network ls

@apostasie
Contributor

Interesting.

Staying stuck is rather unusual.
My guess is that something else is locking the same directory.
I have been browsing the Cilium source code, and they do use filesystem locking, possibly on the same directory as us.

@wzxmt if you feel like it, the most helpful thing you could do is:

# clone nerdctl source code
git clone git@github.com:containerd/nerdctl.git
cd nerdctl

# Edit https://github.com/containerd/nerdctl/blob/main/pkg/netutil/netutil.go#L224
# Line 224, find this:
#	err = lockutil.WithDirLock(e.NetconfPath, fn)
# Replace it with:
#      fn()

# Compile a new nerdctl binary
make binaries

# The updated binary is under `_output`

# Now, try again
_output/nerdctl network ls

If it still does not help, you could pepper fmt.Println("debug message something") in this function (and the caller) to figure out where it is getting stuck.

I wish I could test Cilium but I am short on time right now.

Thanks @wzxmt

Author

wzxmt commented Dec 23, 2024

I edited https://github.com/containerd/nerdctl/blob/main/pkg/netutil/netutil.go#L224 and ran make binaries.
With the patched binary, "nerdctl network ls" can be executed. "nerdctl run --name test --rm -it --debug-full busybox:1.28 /bin/sh" can also be executed, but there is still a problem.
(Image attached in the original comment.)

@apostasie
Contributor

Thanks a lot @wzxmt

I think this confirms what the issue is: Cilium is very likely trying to lock the same directory as nerdctl (probably the CNI configuration directory).

This will not be trivial to solve.

We need to flock when accessing the CNI configuration: this is the only way to prevent racy/concurrent modifications.

What we could do, though, is move the lock to a different location that only nerdctl uses.

cc @AkihiroSuda

AkihiroSuda added the area/kubernetes label and removed the kind/unconfirmed-bug-claim label on Dec 23, 2024
AkihiroSuda changed the title from "Executing nerdctl run in k8 environment is stuck" to "[Cilium] Executing nerdctl run in k8 environment is stuck" on Dec 23, 2024