Skip to content

Commit

Permalink
runc run: fix mount leak
Browse files Browse the repository at this point in the history
When preparing to mount container root, we need to make its parent mount
private (i.e. disable propagation), otherwise the new in-container
mounts are leaked to the host.

To find a parent mount, we use to read mountinfo and find the longest
entry which can be a parent of the container root directory.

Unfortunately, due to kernel bug in all Linux kernels older than v5.8
(see [1], [2]), sometimes mountinfo can't be read in its entirety. In
this case, getParentMount may occasionally return a wrong parent mount.

As a result, we do not change the mount propagation to private, and
container mounts are leaked.

Alas, we can not fix the kernel, and reading mountinfo a few times to
ensure its consistency (like it's done in, say, Kubernetes) does not
look like a good solution for performance reasons.

One way of fixing this is try mount on each directory until it
succeeds (or return an error other than EINVAL, which in this case means
"not a mount point").

Fixes issue 2404.

[1]: https://github.com/kolyshkin/procfs-test
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=9f6c61f96f2d97cbb5f
Signed-off-by: Kir Kolyshkin <kolyshkin@gmail.com>
  • Loading branch information
kolyshkin committed Oct 2, 2024
1 parent 57798c6 commit dc0ca3a
Showing 1 changed file with 21 additions and 47 deletions.
68 changes: 21 additions & 47 deletions libcontainer/rootfs_linux.go
Original file line number Diff line number Diff line change
Expand Up @@ -982,54 +982,31 @@ func mknodDevice(dest string, node *devices.Device) error {
return os.Chown(dest, int(node.Uid), int(node.Gid))
}

// Get the parent mount point of directory passed in as argument. Also return
// optional fields.
func getParentMount(rootfs string) (string, string, error) {
mi, err := mountinfo.GetMounts(mountinfo.ParentsFilter(rootfs))
if err != nil {
return "", "", err
}
if len(mi) < 1 {
return "", "", fmt.Errorf("could not find parent mount of %s", rootfs)
}

// find the longest mount point
var idx, maxlen int
for i := range mi {
if len(mi[i].Mountpoint) > maxlen {
maxlen = len(mi[i].Mountpoint)
idx = i
}
}
return mi[idx].Mountpoint, mi[idx].Optional, nil
}

// Make parent mount private if it was shared
func rootfsParentMountPrivate(rootfs string) error {
sharedMount := false

parentMount, optionalOpts, err := getParentMount(rootfs)
if err != nil {
return err
}

optsSplit := strings.Split(optionalOpts, " ")
for _, opt := range optsSplit {
if strings.HasPrefix(opt, "shared:") {
sharedMount = true
// rootfsParentMountPrivate ensures rootfs parent mount is private.
// This is needed for two reasons:
// - pivot_root() will fail if parent mount is shared;
// - when we bind mount rootfs, if its parent is not private, the new mount
// will propagate (leak!) to parent namespace and we don't want that.
func rootfsParentMountPrivate(path string) error {
var err error
// Assuming path is absolute and clean (this is checked in libcontainer/validate).
for path != "/" {
err = unix.Mount("", path, "", unix.MS_PRIVATE, "")
if err != unix.EINVAL { //nolint:errorlint // unix errors are bare
break
}
// EINVAL means path is not a mount point yet, try its parent.
path = filepath.Dir(path)
}

// Make parent mount PRIVATE if it was shared. It is needed for two
// reasons. First of all pivot_root() will fail if parent mount is
// shared. Secondly when we bind mount rootfs it will propagate to
// parent namespace and we don't want that to happen.
if sharedMount {
return mount("", parentMount, "", unix.MS_PRIVATE, "")
if err == nil {
return nil
}
return &mountError{
op: "remount-private",
target: path,
flags: unix.MS_PRIVATE,
err: err,
}

return nil
}

func prepareRoot(config *configs.Config) error {
Expand All @@ -1041,9 +1018,6 @@ func prepareRoot(config *configs.Config) error {
return err
}

// Make parent mount private to make sure following bind mount does
// not propagate in other namespaces. Also it will help with kernel
// check pass in pivot_root. (IS_SHARED(new_mnt->mnt_parent))
if err := rootfsParentMountPrivate(config.Rootfs); err != nil {
return err
}
Expand Down

0 comments on commit dc0ca3a

Please sign in to comment.