"./etc/group doesn't have a proper root mount" while checkpointing singularity #841
Comments
There have been other attempts to checkpoint/restore singularity containers. Maybe something there helps you:
As you are trying to run rootless containers with Singularity, it would be good to know how Singularity creates its containers. As far as I know, there are two versions of Singularity around: the old version, in order to work on older kernels, made little use of namespaces and was mainly based on setuid and chroot(); the newer version, as far as I know, actually uses namespaces like many other container engines. Which Singularity version are you using?
Not sure how useful the following suggestion is: Podman can also be used to run containers rootless in HPC environments (https://podman.io/blogs/2019/09/26/podman-in-hpc.html). Podman's checkpoint/restore support would not directly help you, as it requires running the containers as root, but we know that Podman can checkpoint/restore its root containers. Maybe that makes it easier for rootless containers as well. Not sure.
Thank you for your reply! I'm using Singularity 2.5.2, released on July 3rd, 2018. Singularity has already been deployed in our cluster for a while, so it might be hard to switch container tools, but thanks for the advice!
/etc/group has to be restored as an external mount. You can look at test/zdtm/static/mnt_ext_manual.desc as an example of using the --external option for mounts.
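For reference, a minimal sketch of what that option pair can look like on the command line, following CRIU's documented external bind-mount syntax; the cookie name etc_group and the $PID/$IMG placeholders are illustrative, not from this thread:

# Dump: declare the mount at /etc/group external and tag it with a cookie.
criu dump --tree "$PID" --images-dir "$IMG" --external "mnt[/etc/group]:etc_group"
# Restore: map the cookie back to the host path to bind-mount in its place.
criu restore --images-dir "$IMG" --external "mnt[etc_group]:/etc/group"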
@avagin Thank you! After adding the proper "--external" option for /etc/group according to the error message, the dump proceeded further, but this time there is another error message:
I've checked all the related issues about "The root task has another root than mntns", but they don't actually help in my case. Below is the full log. Again, thank you so much!
If I am not mistaken, this means that the root task changed its root by calling chroot(). CRIU doesn't support this case. All modern container runtimes use pivot_root() to change the root file system (see the sketch below).
Cc: @Snorch
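For illustration, here is a rough sketch of the pivot_root sequence a runtime typically performs inside a fresh mount namespace, using the pivot_root(8) utility from util-linux. This is not what Singularity itself does; /newroot is a placeholder for a prepared root filesystem that contains an empty old_root directory and its own /bin/sh.

# Rough sketch (run as root); /newroot is a placeholder rootfs that
# must contain an old_root directory and a working /bin/sh.
sudo unshare --mount /bin/sh <<'EOF'
mount --bind /newroot /newroot   # pivot_root needs new_root to be a mount point
cd /newroot
pivot_root . old_root            # swap this mount namespace's root
umount -l /old_root              # lazily detach the old root
exec chroot . /bin/sh            # continue with the new root as /
EOF

The relevant difference is that pivot_root() replaces the root of the whole mount namespace, which CRIU can reconstruct from the mount tree, whereas a bare chroot() only changes one task's root directory.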
That is the reason I was asking for the Singularity version. From what I have seen, the 2.x versions of Singularity are based on chroot and setuid.
Now the problem reduces to the same one as #600.
The error is now a completely different problem, so I posted it as another issue, #855. Thank you guys so much for the help!
I'm running a simple HPC program using Singularity, and my goal is to be able to checkpoint/restart/migrate the whole container using CRIU. Note that I need to run the Singularity container without sudo privileges.
Since Singularity hasn't integrated CRIU, I guess I'd just have to dump it manually (sorry for my stupidity), but I don't really know the proper way to do it. Below is my command to dump it:
sudo criu dump -v4 --tree 16076 --images-dir /home/CRIU/exit_dir --external mnt[]:m --leave-stopped --shell-job
And then criu spits out the following to me:
The full test log is here:
test.log
What did I do wrong here? Appreciate your time & efforts!