CI: add nerdctl #327

Merged 2 commits on Dec 3, 2024

Conversation

AkihiroSuda
Member

No description provided.

@AkihiroSuda
Member Author

AkihiroSuda commented Apr 10, 2024

https://github.com/rootless-containers/usernetes/actions/runs/8627684028/job/23648119883?pr=327

Run make kubeadm-init
nerdctl compose exec -e U7S_HOST_IP=10.1.0.133 -e U7S_NODE_NAME=u7s-fv-az569-243 -e U7S_NODE_SUBNET=10.100.227.0/24 -e U7S_NODE_IP=10.100.227.100 node sh -euc "envsubst </usernetes/kubeadm-config.yaml >/tmp/kubeadm-config.yaml"
panic: provided file is not a console

@AkihiroSuda
Member Author

https://github.com/rootless-containers/usernetes/actions/runs/8627684028/job/23648119731?pr=327

nerdctl compose up --build -d
time="2024-04-10T07:41:28Z" level=fatal msg="getwd: no such file or directory"
make: *** [Makefile:69: up] Error 1
make: Leaving directory '/home/runner/usernetes'

@apostasie

Mmmm... did this get autoclosed because of the fix in nerdctl?

@AkihiroSuda did you mean this to be closed, or shall we revive it as soon as we have a new nerdctl RC with the fix?

@AkihiroSuda AkihiroSuda reopened this Aug 13, 2024
@AkihiroSuda AkihiroSuda force-pushed the ci-nerdctl branch 3 times, most recently from f46b880 to a8301ef Compare August 18, 2024 13:53
@AkihiroSuda
Member Author

msg="getwd: no such file or directory" still remains

@apostasie

getwd looked into above ^

I am more baffled by the other failure, which seems to be that (docker) runc is not allowed to pivot root (because... apparmor?). Are we missing a step in the installation flow?

Do you have a clue what's going on with this?

https://github.com/rootless-containers/usernetes/actions/runs/10441177644/job/28911999825?pr=327#step:6:1203

@AkihiroSuda
Member Author

The CI is now almost green, except the last "Test data persistency" step
https://github.com/rootless-containers/usernetes/actions/runs/12125498597/job/33806091150?pr=327

@apostasie

So, Kube API is not responding after lima restart?

Maybe we can duplicate https://github.com/rootless-containers/usernetes/blob/master/.github/workflows/main.yaml#L109C1-L122C80 and add it after - name: "Test data persistency after restarting the node" so that we get better debugging info?

@AkihiroSuda AkihiroSuda force-pushed the ci-nerdctl branch 2 times, most recently from 8a5bb45 to 2e8f366 Compare December 2, 2024 22:40
@apostasie

Grumble grumble that was not super helpful.

Looks like the kube API indeed is not being restarted on reboot (or it fails starting for some reason).

Spitballing hypothesis:

  • DNS

We need more debug output in there: e.g. some compose logs to see what happened with make up?

If you do not get to this before me ^ I'll debug on a separate PR later on.

@AkihiroSuda
Member Author

https://github.com/rootless-containers/usernetes/actions/runs/12129252319/job/33817308667?pr=327

Dec 02 22:48:31 lima-host0 containerd-rootless.sh[979]: time="2024-12-02T22:48:31.259129111Z" level=error msg="apply change" error="failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running createRuntime hook #0: exit status 1, stdout: , stderr: time=\"2024-12-02T22:48:31Z\" level=fatal msg=\"bypass4netnsd not running? (Hint: run `containerd-rootless-setuptool.sh install-bypass4netnsd`): bypass4netns failed\""

bypass4netnsd does not seem to be working

@AkihiroSuda
Member Author

$ cat ~/.local/share/nerdctl/1935db59/containers/default/320813e981deea8eb35e422fc12ae2ce31897edf9528e20e92dd55f33f35906d/bypass4netns.log 
time="2024-12-03T05:22:16Z" level=info msg="LogFilePath: /home/suda.linux/.local/share/nerdctl/1935db59/containers/default/320813e981deea8eb35e422fc12ae2ce31897edf9528e20e92dd55f33f35906d/bypass4netns.log"
time="2024-12-03T05:22:16Z" level=fatal msg="Cannot write pid file: open /run/user/1001/bypass4netns/320813e981deea8.pid: no such file or directory"

bypass4netns refuses to work because the PID file in /run is missing after rebooting

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
@AkihiroSuda AkihiroSuda marked this pull request as ready for review December 3, 2024 05:27
@apostasie

So...
It is failing here: https://github.com/rootless-containers/bypass4netns/blob/c6757e6185dba0448b0b63fe9222bd8a5590275a/cmd/bypass4netnsd/main.go#L95-L97

because the containing directory does not exist, right?

mkdirall(parent(pidfile))?

@apostasie

apostasie commented Dec 3, 2024

We must have that problem outside of usernetes.

I think we can fix in nerdctl.

https://github.com/containerd/nerdctl/blob/main/pkg/bypass4netnsutil/bypass4netnsutil.go#L128-L136

@apostasie

Tentatively: containerd/nerdctl#3724

If you get a chance to try it with this patch, curious if that would fix it.

Presumably, it works in certain cases because CreateSocketDir would have been called before.

@AkihiroSuda
Member Author

Thanks @apostasie, will try your patch later 👍

@AkihiroSuda AkihiroSuda merged commit 71c9906 into rootless-containers:master Dec 3, 2024
8 checks passed
2 participants