procfs: add safe procfs API and harden proc operations #42

Merged 10 commits on Jul 29, 2024

Commits on Jul 28, 2024

  1. error: add (currently-test-only) ErrorKind

    This will make matching against Error types in our tests much simpler.
    Maybe we will want to export this at some point.
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    1a266c4
  2. procfs: move internal handle to a new ProcfsHandle API

    At the moment this interface is not entirely safe, so it's not exported
    yet. It will gain hardening in future commits.
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    e5fc3c4
  3. procfs: add stx_mnt_id hardening to detect /proc bind-mounts

    To protect against basic bind-mount attacks, we can use statx to fetch
    the mount id of the path we land on. If it is a bind-mount (even a
    procfs bind-mount) we'll detect it.
    
    This doesn't protect against:
    
     1. Path components that are bind-mounts (either of tmpfs or of symlinks)
        that jump to a different place in the original procfs. Fixing this
        requires adding a specialised resolver that uses openat2 or a
        restricted subset of resolvers::opath.
    
     2. Mounts occurring on the host that race with open_follow.
        Unfortunately, there is no way to re-open a magic-link and so there
        is a race-window where a mount could be applied on top of the
        magic-link after we check the stx_mnt_id of the link but before we
        resolve it. The only real solution is to create a private procfs
        (with fsopen(2) or open_tree(2)) which cannot have racing mounts.
    
    Both issues are fixed in future patches.
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    7332e98
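The mount-id check described in commit 3 can be sketched roughly as follows. This is a minimal illustration, not the libpathrs implementation: `get_mnt_id` and `on_same_mount` are hypothetical helper names, and the sketch assumes a kernel and libc new enough to expose `STATX_MNT_ID` (Linux 5.8 or later). It compares the mount ID of a directory handle with the mount ID of a path beneath it; a mismatch suggests a bind-mount or over-mount on that path.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for statx(2) and AT_EMPTY_PATH */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical helper: fetch the mount ID of `path` (relative to `dirfd`)
 * without following a trailing symlink. An empty path refers to `dirfd`
 * itself. Returns 0 on success, or -1 if statx failed or the kernel did
 * not report a mount ID. */
static int get_mnt_id(int dirfd, const char *path, uint64_t *mnt_id)
{
    struct statx stx;

    if (statx(dirfd, path, AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH,
              STATX_MNT_ID, &stx) < 0)
        return -1;
    if (!(stx.stx_mask & STATX_MNT_ID))
        return -1; /* kernel too old to report stx_mnt_id */
    *mnt_id = stx.stx_mnt_id;
    return 0;
}

/* Hypothetical helper: returns 1 if `subpath` lives on the same mount as
 * `rootfd`, 0 if the mount IDs differ (a bind-mount or over-mount), and
 * -1 if the mount IDs could not be determined. */
static int on_same_mount(int rootfd, const char *subpath)
{
    uint64_t root_id, sub_id;

    if (get_mnt_id(rootfd, "", &root_id) < 0 ||
        get_mnt_id(rootfd, subpath, &sub_id) < 0)
        return -1;
    return root_id == sub_id;
}
```

As the commit message notes, a check like this is still racy on a shared procfs mount: a mount can be added after the check but before the path is used.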
  4. procfs: add pre-3.17 fallback for /proc/thread-self

    This logic is taken from ProcThreadSelf in runc and filepath-securejoin.
    Some old RHEL systems have pre-3.17 kernels which require this
    workaround.
    
    Unfortunately, runc has to do /proc/... operations after joining a PID
    namespace but while still using the host /proc, which results in errors
    if we just try /proc/$pid/task/$tid (the TIDs are from our PID namespace
    but the TIDs used by the procfs are based on the PID namespace used when
    it was mounted, which means the host PID namespace in most cases). In
    this case, the only other option is to just use /proc/self and hope
    nothing breaks. We could panic, but users would probably then want to
    use an API that does this fallback anyway (and there's no real way to
    conclusively verify that /proc/self is "okay" to use).
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    b87d656
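The fallback order from commit 4 might look something like the sketch below. It only illustrates the order of the three candidates; `proc_thread_self_base` is a hypothetical name, and the sketch probes plain paths with `access(2)`, which is not hardened in the way the real code (which resolves inside a verified procfs handle) would be. As a simplification it uses /proc/self/task/$tid rather than /proc/$pid/task/$tid.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall(2) */
#endif
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: write the best available per-thread procfs base
 * directory into `buf` and return it. */
static const char *proc_thread_self_base(char *buf, size_t size)
{
    /* 1. Preferred: /proc/thread-self (Linux 3.17 and later). */
    if (access("/proc/thread-self", F_OK) == 0) {
        snprintf(buf, size, "/proc/thread-self");
        return buf;
    }

    /* 2. Pre-3.17 fallback: build the task path from our TID. This fails
     *    if our TID comes from a different PID namespace than the one the
     *    procfs was mounted in. */
    snprintf(buf, size, "/proc/self/task/%ld", (long)syscall(SYS_gettid));
    if (access(buf, F_OK) == 0)
        return buf;

    /* 3. Last resort: /proc/self, which refers to the thread-group leader
     *    rather than the current thread. */
    snprintf(buf, size, "/proc/self");
    return buf;
}
```

A caller would append the wanted subpath (for example `attr/exec`) to the returned base.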
  5. procfs: use a restricted procfs resolver for lookups

    This is loosely based on the opath resolver, with support for using
    openat2 if possible. The new procfs_beneath resolver currently only
    supports doing O_PATH (and O_DIRECTORY) lookups. This is enough for our
    needs within libpathrs but when we expose ProcfsHandle as a public API
    we will need to support more modes. Unfortunately, we can't just do a
    /proc/self/fd/... re-open because the resolver is used within the
    ProcfsHandle implementation to do reopening.
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    18486d8
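For the openat2 path of the restricted resolver in commit 5, a lookup could plausibly be issued as in the sketch below. The wrapper name `open_beneath_opath` is hypothetical, and the exact set of `RESOLVE_*` flags used by libpathrs is an assumption here; the sketch only shows how an O_PATH (optionally O_DIRECTORY) lookup can be confined to a root handle. It assumes kernel headers that ship `<linux/openat2.h>` (Linux 5.6 or later).

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for O_PATH and syscall(2) */
#endif
#include <assert.h>
#include <fcntl.h>
#include <linux/openat2.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: resolve `path` strictly beneath `rootfd` and return
 * an O_PATH file descriptor, or -1 with errno set. The RESOLVE_* flags
 * below are an assumed choice: stay beneath the root, refuse to cross
 * mount points, and refuse to follow magic-links. */
static int open_beneath_opath(int rootfd, const char *path, int want_dir)
{
    struct open_how how;

    memset(&how, 0, sizeof(how));
    how.flags = (unsigned int)(O_PATH | O_CLOEXEC |
                               (want_dir ? O_DIRECTORY : 0));
    how.resolve = RESOLVE_BENEATH | RESOLVE_NO_XDEV | RESOLVE_NO_MAGICLINKS;

    /* Invoked through syscall(2) so the sketch does not depend on a libc
     * wrapper for openat2. */
    return (int)syscall(SYS_openat2, rootfd, path, &how, sizeof(how));
}
```

On kernels without openat2 the call fails with ENOSYS, which is where a userspace resolver along the lines of the opath resolver would take over.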
  6. procfs: add fsopen(2) and open_tree(2) support

    (This is adapted from filepath-securejoin.)
    
    Because we depend on procfs to be correct when operating on other
    filesystems, having a safe procfs handle is vital. Unfortunately, if we
    are an administrative program running inside a container that can modify
    its mount configuration, our current checks are not sufficient to avoid
    being tricked into thinking a path is real.
    
    Luckily, with fsopen() and open_tree() it is possible to create a
    completely private procfs instance that an attacker cannot modify. Note
    that while they are both safe, they are safe for different reasons:
    
     1. fsopen() is safe because the created mount is completely separate to
        any other procfs mount, so any changes to mounts on the host are
        irrelevant. fsopen() can fail if we trip the mnt_too_revealing()
        check, so we may have to fall back to open_tree() in some cases.
    
     2. open_tree() creates a clone of a snapshot of the mount tree (or just
        the top mount if we can avoid using AT_RECURSIVE, but
        mnt_too_revealing() may force us to use AT_RECURSIVE). While the
        tree we clone might have been messed with by an attacker, after
        cloning there is no way for the attacker to affect our clone (even
        mount propagation won't propagate into a clone[1]).
    
        The only risk is whether there are over-mounts with AT_RECURSIVE.
    
        Because anonymous mounts don't show up in mountinfo, the best we can
        do is check the mount id through statx to see whether a target has
        an overmount (this is identical to the logic we already had, but at
        least it is now safe against races). openat2 is sufficient for
        non-magic-links but for magic-links we need to check this
        explicitly.
    
        listmounts(2) would let us detect any overmounts at creation time,
        but it's not clear whether you want to error out if there is a mount
        over a path you never use (lxcfs has overmounts of /proc/cpuinfo and
        /proc/meminfo, which we never use).
    
    Unfortunately, we can only use these when running as a privileged user.
    In theory we could create a user namespace to gain the necessary
    privileges to create these handles, but this would require spawning a
    proper subprocess (CLONE_NEWUSER must be done in a single-threaded
    program) which won't always work when libpathrs is used as a library.
    It's also far too complicated to deal with in practice.
    
    [1]: This is true since at least Linux 5.12. See commit ee2e3f50629f
         ("mount: fix mounting of detached mounts onto targets that reside
         on shared mounts").
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    6750757
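The two ways of getting a private procfs handle described in commit 6 can be sketched with raw syscalls as below. This is an illustrative sketch under stated assumptions, not the libpathrs code: the function names are hypothetical, the mount attributes passed to fsmount are an assumed choice, and the open_tree variant omits the AT_RECURSIVE fallback and the over-mount checks discussed in the commit message. Both calls need privileges and kernel headers from Linux 5.2 or later; unprivileged callers will simply get -1.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for syscall(2) */
#endif
#include <assert.h>
#include <fcntl.h>
#include <linux/mount.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: create a brand-new procfs instance that is not
 * attached anywhere in the mount tree. Returns a mount fd or -1. */
static int procfs_via_fsopen(void)
{
    int sfd, mfd;

    sfd = (int)syscall(SYS_fsopen, "proc", FSOPEN_CLOEXEC);
    if (sfd < 0)
        return -1;
    if (syscall(SYS_fsconfig, sfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0) < 0) {
        close(sfd);
        return -1;
    }
    /* The mount attributes here are an assumption for this sketch. */
    mfd = (int)syscall(SYS_fsmount, sfd, FSMOUNT_CLOEXEC,
                       MOUNT_ATTR_NOSUID | MOUNT_ATTR_NODEV |
                       MOUNT_ATTR_NOEXEC);
    close(sfd);
    return mfd;
}

/* Hypothetical helper: clone the existing /proc mount into a detached
 * copy (top mount only, no AT_RECURSIVE). Returns a mount fd or -1. */
static int procfs_via_open_tree(void)
{
    return (int)syscall(SYS_open_tree, AT_FDCWD, "/proc",
                        OPEN_TREE_CLONE | OPEN_TREE_CLOEXEC);
}

/* Try fsopen() first and fall back to open_tree(). */
static int private_procfs(void)
{
    int fd = procfs_via_fsopen();

    if (fd < 0)
        fd = procfs_via_open_tree();
    return fd;
}
```

The returned fd can be used as the `dirfd` for subsequent `openat(2)`-style lookups; if both calls fail, the caller would fall back to the regular /proc with the mount-id checks.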
  7. capi: add pathrs_proc_* helpers

    Provide some nice APIs to do procfs operations in a safe way. In theory
    we could just provide pathrs_proc_open(), but so many people will want
    to do pathrs_proc_readlink() that it makes sense to just provide it (to
    ensure users don't misuse readlinkat() and accidentally make the
    operation unsafe).
    
    Here are some toy examples, based on some of the use cases we have for
    these APIs in runc.
    
    1. In order to get a reference to /proc/self/exe we can use for doing a
       re-exec or doing memfd binary cloning, we can use pathrs_proc_open():
    
         /* Safely get an fd to /proc/self/exe. */
         int get_self_exe()
         {
             /* This follows the trailing magic-link! */
             int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH);
             if (fd < 0) {
                 pathrs_error_t *error = pathrs_errorinfo(fd);
                 /* ... print the error ... */
                 pathrs_errorinfo_free(error);
                 return -1;
             }
             return fd;
         }
    
    2. When writing to AppArmor labels, we want to be absolutely sure there
       are no bind-mounts that would trick us into not actually writing the
       labels. The key risk is something like a bind-mount of
       /proc/self/sched (a procfs file that treats all writes as no-ops)
       on top of the file we intend to write to.
    
         int write_apparmor_label(char *label)
         {
             int fd;
             ssize_t len = strlen(label), written;
    
             fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec",
                                   O_WRONLY|O_NOFOLLOW);
             if (fd < 0) {
                 pathrs_error_t *error = pathrs_errorinfo(fd);
                 /* ... print the error ... */
                 pathrs_errorinfo_free(error);
                 return -1;
             }
    
             /* Treat a failed or short write as an error. */
             written = write(fd, label, len);
             close(fd);
             return (written == len) ? 0 : -1;
         }
    
    3. Sometimes we need to get the "real" path of a given file descriptor.
       This is fundamentally racy, and you would only really want to do this
       for debugging information, but it is something you need to do
       sometimes:
    
         char *get_unsafe_path(int fd)
         {
             char *fdpath, *linkbuf = NULL;
    
             if (asprintf(&fdpath, "fd/%d", fd) < 0)
                 return NULL;
    
             int linkbuf_size = 128;
             linkbuf = malloc(linkbuf_size);
             if (!linkbuf)
                 goto err;
             for (;;) {
                 int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF,
                                                fdpath, linkbuf, linkbuf_size);
                 if (len < 0) {
                     /* The error ID is the negative return value. */
                     pathrs_error_t *error = pathrs_errorinfo(len);
                     /* ... print the error ... */
                     pathrs_errorinfo_free(error);
                     goto err;
                 }
    
                 if (len <= linkbuf_size)
                     break;
    
                 /* Buffer too small: grow it without leaking on failure. */
                 linkbuf_size = len;
                 char *newbuf = realloc(linkbuf, linkbuf_size);
                 if (!newbuf)
                     goto err;
                 linkbuf = newbuf;
             }
    
             free(fdpath);
             return linkbuf;
    
         err:
             free(fdpath);
             free(linkbuf);
             return NULL;
         }
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    83a180a
  8. go bindings: add pathrs_proc_* wrappers

    The main Go-specific thing we need to keep in mind when opening files
    from /proc/thread-self is that we have to call runtime.LockOSThread to
    avoid hitting cases where we are migrated to a different thread and the
    original thread dies while we're using the handle (for most procfs files
    this results in spurious errors or blank files).
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    e3fcd6e
  9. python bindings: add pathrs_proc_* wrappers

    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    83aff8c
  10. tests: add ProcfsHandle tests

    These tests are based on the filepath-securejoin tests, and notably
    include tests of the key cases when operating on procfs.
    
    Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
    cyphar committed Jul 28, 2024
    59fdb76