Unable to start container with sysbox runtime after kernel update. #596

Open
netlore opened this issue Sep 22, 2022 · 28 comments

Comments

@netlore

netlore commented Sep 22, 2022

Running Ubuntu 22.04, and just received a kernel update from 5.15.0-47 to 5.15.0-48, matching this security advisory, and it seems that containers can no longer be started with the runtime:

https://ubuntu.com/security/notices/USN-5624-1

# docker run --runtime sysbox-runc -it nestybox/ubuntu-focal-docker:latest /bin/bash
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: process_linux.go:578: handleReqOp caused: rootfs_init_linux.go:366: failed to mkdirall /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher/rke2: mkdir /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher: value too large for defined data type caused: mkdir /var/lib/sysbox/shiftfs/002a816d-d852-4e38-ac0d-d6b37bbdd8ea/var/lib/rancher: value too large for defined data type: unknown.

Rolled back to 5.15.0-47 and it seems to be working again.

Installed from "sysbox-ce_0.5.2-0.linux_amd64.deb" - Any thoughts would be appreciated.

@ctalledo
Member

Hi @netlore , thanks for using Sysbox.

The error is for sure caused by incompatibility between shiftfs and the kernel (nothing in sysbox per-se).

Just yesterday someone else reported this issue too: #595

There must be something in the 5.15.0-48 kernel that is causing the incompatibility with the shiftfs module. Speculating a bit, maybe the kernel is missing an Ubuntu patch required for overlayfs to work with shiftfs, or maybe the shiftfs module needs updating to work with this kernel.

We would need to dig deep into the commits of 5.15.0-48 to see what's going on.

If rolling back to the prior kernel is not an option for you, as a workaround you can try using a newer kernel (maybe 5.18?) or configuring Sysbox to not use shiftfs (it will instead use an alternative mechanism called ID-mapped-mounts in the kernel). To do the latter, modify the sysbox systemd service for the sysbox-mgr and pass the --disable-shiftfs flag to it.

Disabling shiftfs is not ideal, but things should still work without it.

@netlore
Author

netlore commented Sep 23, 2022

Oh wow, I've been looking into the situation with shiftfs, and it seems there could be some serious confusion going on. The upstream kernel is going for ID-mapped mounts, but they are not yet supported for ZFS or CephFS, and shiftfs has never been officially upstreamed, although Canonical are carrying patches to include it in their kernels through 22.04. I'll need to review the diffs tomorrow, but I believe there were changes to shiftfs between -47 and -48, perhaps because of the 11 CVEs that -48 addressed.

Can you clarify why you favour shiftfs, as you said that disabling shiftfs (using ID-Mapped mounts) is not ideal... I'd like to understand what's not ideal about it (other than the current lack of support for ZFS/CephFS).

I can of course update you with whatever details I find regarding changes to shiftfs in -48 (if you're interested)... in the morning.

@netlore
Author

netlore commented Sep 23, 2022

I noticed in the changelog for Canonical's kernel that 5.15.0-48 includes a resync with upstream. I wonder if they lost their patch that allows shiftfs to work with overlayfs; I feel like that would break things in the way shown above. Here's the patch for that:

https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1846272

Will check to see if it's actually the case that this is missing, and/or if it was accidental or.....

@netlore
Author

netlore commented Sep 23, 2022

This seems to show a related check in overlayfs, being reverted to mainline, could this be the smoking gun?

diff -ur linux-5.15.0-47/fs/overlayfs/super.c linux-5.15.0-48/fs/overlayfs/super.c
--- linux-5.15.0-47/fs/overlayfs/super.c	2022-09-22 15:52:28.806462177 +0100
+++ linux-5.15.0-48/fs/overlayfs/super.c	2022-09-22 15:53:35.962351879 +0100
@@ -873,7 +873,7 @@
 		pr_err("filesystem on '%s' not supported\n", name);
 		goto out_put;
 	}
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		pr_err("idmapped layers are currently not supported\n");
 		goto out_put;
 	}

@ctalledo
Member

This seems to show a related check in overlayfs, being reverted to mainline, could this be the smoking gun?

diff -ur linux-5.15.0-47/fs/overlayfs/super.c linux-5.15.0-48/fs/overlayfs/super.c
--- linux-5.15.0-47/fs/overlayfs/super.c	2022-09-22 15:52:28.806462177 +0100
+++ linux-5.15.0-48/fs/overlayfs/super.c	2022-09-22 15:53:35.962351879 +0100
@@ -873,7 +873,7 @@
 		pr_err("filesystem on '%s' not supported\n", name);
 		goto out_put;
 	}
-	if (mnt_user_ns(path->mnt) != &init_user_ns) {
+	if (is_idmapped_mnt(path->mnt)) {
 		pr_err("idmapped layers are currently not supported\n");
 		goto out_put;
 	}

Hi @netlore, apologies for the late reply. I don't think that's the culprit because it's related to ID-mapped-mounts rather than shiftfs itself.

Below is the list of patches to overlayfs that are required to make it work with shiftfs. I didn't check if the 5.15.0-48 kernel is missing any of these.

07648d68cea786d2ff599b51139013044ec59a8a   (05/16/22 - UBUNTU: SAUCE: overlayfs: prevent dereferencing struct file in ovl_vm_prfile_set())
b07bc17b8363190be1328fe162768f7fdcb8fcaa   (04/14/22 - UBUNTU: SAUCE: overlayfs: fix incorrect mnt_id of files opened from map_files)
730264093da28294476d5c41b055a271facdd998   (10/02/19 - UBUNTU: SAUCE: overlayfs: allow with shiftfs as underlay)
3fb38c98e060b327cb58373775dcc95ed52d1f22   (01/19/16 - UBUNTU: SAUCE: overlayfs: Skip permission checking for trusted.overlayfs.* xattrs)
796fe8290349ef4cd8719a68966893c1c1b5a677   (01/12/22 - UBUNTU: SAUCE: vfs: test that one given mount param is not larger than PAGE_SIZE)

@ctalledo
Member

FYI: kernel 5.19 seems to work: #595 (comment)

@fuomag9

fuomag9 commented Oct 2, 2022

Downgrading to 5.15.0-47 fixed the issue for me as well

@drakes00

drakes00 commented Oct 3, 2022

What are the consequences of temporarily disabling shiftfs?
Do we need to recreate volumes or mounted directories?
Will it affect file ownership?

Thanks for the help

@ctalledo
Member

ctalledo commented Oct 3, 2022

Hi @drakes00,

What are the consequences of temporarily disabling shiftfs?

There should be minor negative consequences (see below), but there is no need to recreate volumes, mounted dirs, etc., and it won't affect file ownership on the host machine either.

The only thing is that without shiftfs your kernel must support ID-mapped-mounts, which works in a lot of cases but not all (it's improving fast though).

One area where ID-mapped-mounts did not work until recently is compatibility with overlayfs (which is important since Docker sets up the container's rootfs with overlayfs). Due to this incompatibility we added a work-around in sysbox: if shiftfs is not present, it chowns the container's rootfs when the container starts. That's not ideal but it works.

I believe kernel 5.19 added ID-mapped-mount support for overlayfs (need to double-check). If true, then we will adjust sysbox to use ID-mapped-mounts for the container's rootfs too, and at that point ID-mapped-mounts would essentially replace shiftfs for all practical purposes.

Hope that helps.

A bit more info on this in the sysbox user guide doc.

@sfph

sfph commented Oct 25, 2022

Hey there - any update on this issue? Running into the same on a recently updated 22.04.1 LTS ubuntu system

selina@cirl-mrt-1:~$ uname -a
Linux cirl-mrt-1 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
docker run -d -p 1880:1880 --runtime=sysbox-runc -v /home/selina/node_red_data:/data --name mynodered localnodered
712be8c1aa6567b56b529bd48c1bd5a0cef2e3d1866d0c6c67b9ed885101c3ac
docker: Error response from daemon: OCI runtime create failed: container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: process_linux.go:578: handleReqOp caused: rootfs_init_linux.go:366: failed to mkdirall /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type caused: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type: unknown.

@drakes00

drakes00 commented Oct 25, 2022

Hey there - any update on this issue? Running into the same on a recently updated 22.04.1 LTS ubuntu system

selina@cirl-mrt-1:~$ uname -a
Linux cirl-mrt-1 5.15.0-52-generic #58-Ubuntu SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
docker run -d -p 1880:1880 --runtime=sysbox-runc -v /home/selina/node_red_data:/data --name mynodered localnodered
712be8c1aa6567b56b529bd48c1bd5a0cef2e3d1866d0c6c67b9ed885101c3ac
docker: Error response from daemon: OCI runtime create failed: container_linux.go:425: starting container process caused: process_linux.go:607: container init caused: process_linux.go:578: handleReqOp caused: rootfs_init_linux.go:366: failed to mkdirall /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd/io.containerd.snapshotter.v1.overlayfs: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type caused: mkdir /var/lib/sysbox/shiftfs/b31752a4-1f50-49b6-8fd0-eb0c983a2a78/var/lib/containerd: value too large for defined data type: unknown.

Hi, kernel 6.0.0 solved the issue on my end.
Cheers

% uname -a                                                                                                                            
Linux Ma1X-Os-X-n3zu 6.0.0-060000-generic #202210022231 SMP PREEMPT_DYNAMIC Sun Oct 2 22:35:09 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
% sysbox-runc -v                                                                                                                      
sysbox-runc
        edition:        Community Edition (CE)
        version:        0.5.2
        commit:         d91c42c2125fd7aaf46f66307eb5c2a025f30289
        built at:       Wed May 18 19:49:04 UTC 2022
        built by:       Rodny Molina
        oci-specs:      1.0.2-dev

@ctalledo
Member

ctalledo commented Oct 25, 2022

Hi @sfph,

Hey there - any update on this issue? Running into the same on a recently updated 22.04.1 LTS ubuntu system
selina@cirl-mrt-1:~$ uname -a
Linux cirl-mrt-1 5.15.0-52-generic #58 SMP Thu Oct 13 08:03:55 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux

Unfortunately there isn't much we can do with Ubuntu kernels 5.15.0-48 and later in the 5.15 series, as they are apparently missing an Ubuntu patch on overlayfs, and without it the interaction with shiftfs breaks.

If you can, please upgrade to newer kernels (e.g., 5.19, 6.0, etc.).

If you must use kernel 5.15, try using 5.15.0-47 or earlier.

If you must use kernel 5.15.0-48 or later, you can work around the problem by either:

  1. Removing the shiftfs module from the kernel (e.g., rmmod) or

  2. Configuring Sysbox to not use shiftfs. You do this by configuring the systemd service unit for sysbox-mgr, and passing the --disable-shiftfs flag to Sysbox. See here for more.

Hope that helps!
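
To tell quickly whether a host falls in the range discussed here, a small shell check on the kernel release string can help. The sketch below is illustrative only: the function name `is_affected_kernel` is made up for this example, the lower bound (5.15.0-48) comes from the reports in this thread, and no upper bound is applied because the thread does not identify a fixed 5.15 build.

```shell
# Return 0 if the kernel release string looks like an Ubuntu 5.15.0-N kernel
# with N >= 48, the range reported as broken in this thread.
# Assumption: no upper bound, since no fixed 5.15 build is named here.
is_affected_kernel() {
    release=${1:-$(uname -r)}
    case "$release" in
        5.15.0-[0-9]*) ;;
        *) return 1 ;;
    esac
    abi=${release#5.15.0-}   # e.g. "52-generic"
    abi=${abi%%-*}           # e.g. "52"
    abi=${abi%%[!0-9]*}      # drop any trailing non-digits
    [ "$abi" -ge 48 ]
}

if is_affected_kernel; then
    echo "kernel $(uname -r) is in the range reported broken with shiftfs"
fi
```

A match only means the kernel version is in the reported range; whether shiftfs-on-overlayfs actually fails still depends on the specific build.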

@sfph

sfph commented Oct 25, 2022

This does help; thanks!

@felipecrs
Contributor

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

@felipecrs
Contributor

felipecrs commented Oct 31, 2022

Here's a one-liner to disable shiftfs:

sudo mkdir -p /etc/systemd/system/sysbox-mgr.service.d && printf '%s\n' '[Service]' 'ExecStart=' 'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs' | sudo tee /etc/systemd/system/sysbox-mgr.service.d/override.conf && sudo systemctl daemon-reload && sudo systemctl restart sysbox
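
For readers who prefer to see the steps separately, the same drop-in can be written by a small function. This is a sketch rather than an official Sysbox procedure: the function name `write_shiftfs_override` and its optional directory argument are hypothetical, added so the result can be inspected in a scratch directory before touching /etc.

```shell
# Write a systemd drop-in that starts sysbox-mgr with --disable-shiftfs.
# $1 (hypothetical parameter) is the drop-in directory; it defaults to the
# path used by the one-liner above.
write_shiftfs_override() {
    dropin_dir=${1:-/etc/systemd/system/sysbox-mgr.service.d}
    mkdir -p "$dropin_dir" || return 1
    # The empty ExecStart= clears the command inherited from the packaged
    # unit; the second ExecStart= sets the replacement command.
    printf '%s\n' \
        '[Service]' \
        'ExecStart=' \
        'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs' \
        > "$dropin_dir/override.conf"
}
```

Run as root with no argument, this writes the same file as the one-liner; `systemctl daemon-reload` and `systemctl restart sysbox` are still needed afterwards, and `systemctl cat sysbox-mgr` shows the merged unit. Because the drop-in replaces the whole command line, any other flags in the packaged ExecStart would have to be repeated.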

@pmb-nolwenture

pmb-nolwenture commented Nov 2, 2022

I believe kernel 5.19 added ID-mapped-mount support for overlayfs (need to double-check). If true, then we will adjust sysbox to use ID-mapped-mounts for the container's rootfs too, and at that point ID-mapped-mounts would essentially replace shiftfs for all practical purposes.

Is there somewhere I can track changes in regards to sysbox/idmapped/kernel 5.19 or is there any roadmap for this to be included natively in docker now nestybox has been acquired?

@ctalledo
Member

ctalledo commented Nov 2, 2022

Hi @philipzgithub, assuming that overlayfs does in fact support ID-mapped-mounts, this will be included in the ~v0.7 release of Sysbox. Not sure on the timeline yet, likely ~Feb 2023.

In any case, overlayfs support for ID-mapped-mounts is a "nice-to-have", but not a "must-have" as mentioned in my comment above.

@pmb-nolwenture

pmb-nolwenture commented Nov 3, 2022

Thank you @ctalledo for your response, you have saved me a lot of time and effort.

@pmb-nolwenture

the regression has been filed as a bug here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1990849

@ctalledo
Member

ctalledo commented Nov 4, 2022

the regression has been filed as a bug here: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1990849

That's great, thanks for digging that up @philipzgithub.

@ScottG489

Here's a one-liner to disable shiftfs:

printf '%s\n' '[Service]' 'ExecStart=' 'ExecStart=/usr/bin/sysbox-mgr --disable-shiftfs' | sudo tee /etc/systemd/system/sysbox-mgr.service.d/override.conf && sudo systemctl daemon-reload && sudo systemctl restart sysbox

FYI, you might need to create /etc/systemd/system/sysbox-mgr.service.d first. Otherwise, this worked for me, thanks!

@felipecrs
Contributor

Oh yeah, I edited it. Thanks!

@rodnymolina
Member

@felipecrs

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

This is a good idea, especially with so many kernel changes going on these days in this area. During sysbox-mgr's initialization we could attempt to mount a shiftfs resource and decide to enable/disable shiftfs based on this.

@ctalledo
Member

I wonder if it's worthwhile to add a check in the sysbox-mgr to automatically disable using shiftfs in known broken kernels.

It's not trivial to implement though, because testing whether shiftfs-on-overlayfs works requires mounting shiftfs, and that in turn requires the process to enter a new user-namespace, and that requires UID mappings, and so on ...
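
A much cheaper pre-check, which does not replace the mount test described above, is to ask whether the kernel has shiftfs registered at all. The sketch below assumes shiftfs shows up in /proc/filesystems once the module is loaded; the function name `shiftfs_registered` and its optional file argument are hypothetical, added so the check can be tried on a sample file. The broken 5.15.0-48+ kernels in this thread still have shiftfs present, so a positive result says nothing about whether shiftfs-on-overlayfs works.

```shell
# Return 0 if "shiftfs" is listed as a filesystem type in the given file
# (default: /proc/filesystems). Each line there is "[nodev]<TAB>name", so
# the filesystem name is always the last field.
shiftfs_registered() {
    fs_list=${1:-/proc/filesystems}
    awk '$NF == "shiftfs" { found = 1 } END { exit !found }' "$fs_list" 2>/dev/null
}

if shiftfs_registered; then
    echo "shiftfs is registered (this alone does not prove it works)"
else
    echo "shiftfs is not registered"
fi
```

A negative result is the useful one here: if shiftfs is not registered, there is nothing to probe and the alternative mechanism is the only option.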

@felipecrs
Contributor

My initial thought was to simply check if the kernel is Ubuntu >=5.15.0-58...

@ctalledo
Member

My initial thought was to simply check if the kernel is Ubuntu >=5.15.0-58...

It's been broken since 5.15.0-48 I believe, and I believe in 5.17 and possibly 5.19 too; we don't know when the fix is coming so it's hard to tie it to a kernel version.

@bokenator

Updating to 5.19.0-28 fixed the problem for me :)

@ctalledo
Member

ctalledo commented Feb 21, 2023

FYI: commit with the fix for shiftfs in Ubuntu: https://git.launchpad.net/~ubuntu-kernel/ubuntu/+source/linux/+git/lunar/commit/fs/shiftfs.c?h=master-next&id=cfe3544e11cc53e0038410a2199ee6afeea3687f

Should be present in the upcoming Ubuntu 23.04 release (Lunar Lobster), due April 2023.

NOTE: the upcoming release of Sysbox (v0.6.0) will automatically check if shiftfs works on the host or not, and adjust accordingly. In platforms where it works, it will use it as needed. In platforms where it does not work, it will use an alternative mechanism. The new Sysbox release will also automatically check if the kernel supports ID-mapped mounts (kernel 5.12+) and overlayfs on ID-mapped mounted lower dirs (kernel 5.19+), and use both of these features. The latter one really makes shiftfs unnecessary going forward.
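
The two version thresholds mentioned in the note (5.12 and 5.19) can be compared against a host's kernel with a few lines of shell. This is only a rough sketch and not how Sysbox itself decides; the helper name `kernel_at_least` is made up for illustration. As this thread shows, distro kernels carry their own patches, so a version number is a weak signal compared with probing the feature directly.

```shell
# Return 0 if the kernel release in $3 (default: uname -r) is at least
# version $1.$2. Only the major and minor numbers are compared.
kernel_at_least() {
    want_major=$1
    want_minor=$2
    release=${3:-$(uname -r)}
    major=${release%%.*}          # "5" from "5.15.0-52-generic"
    rest=${release#*.}            # "15.0-52-generic"
    minor=${rest%%.*}             # "15"
    minor=${minor%%[!0-9]*}       # drop suffixes such as "-rc1"
    [ "$major" -gt "$want_major" ] ||
        { [ "$major" -eq "$want_major" ] && [ "$minor" -ge "$want_minor" ]; }
}

if kernel_at_least 5 12; then
    echo "kernel version is new enough for ID-mapped mounts"
fi
if kernel_at_least 5 19; then
    echo "kernel version is new enough for overlayfs on ID-mapped lower dirs"
fi
```

On the 5.15.0-52 kernel reported earlier in the thread, only the first check would pass, which matches the advice to either disable shiftfs or move to 5.19 or newer.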
