Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

procfs: add safe procfs API and harden proc operations #42

Merged
merged 10 commits into from
Jul 29, 2024

Conversation

cyphar
Copy link
Member

@cyphar cyphar commented Jul 24, 2024

A toy example of using this API to do a few operations runc needs to do internally:

int get_self_exe()
{
	/* This follows the trailing magic-link! */
	int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH);
	if (fd < 0) {
		pathrs_error_t *error = pathrs_errorinfo(fd);
		/* ... print the error ... */
		pathrs_errorinfo_free(error);
		return -1;
	}
	return fd;
}

int write_apparmor_label(char *label)
{
	int fd, err, saved_errno = 0;

	fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec", O_WRONLY|O_NOFOLLOW);
	if (fd < 0) {
		pathrs_error_t *error = pathrs_errorinfo(fd);
		/* ... print the error ... */
		pathrs_errorinfo_free(error);
		return -1;
	}

	err = write(fd, label, strlen(label));
	close(fd);
	return err;
}

char *get_unsafe_path(int fd)
{
	char *fdpath;

	if (asprintf(&fdpath, "fd/%d", fd) < 0)
		return NULL;

	int linkbuf_size = 128;
	char *linkbuf = malloc(size);
	if (!linkbuf)
		goto err;
	for (;;) {
		int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF, fdpath, linkbuf, linkbuf_size);
		if (len < 0) {
			pathrs_error_t *error = pathrs_errorinfo(fd);
			/* ... print the error ... */
			pathrs_errorinfo_free(error);
			goto err;
		}

		if (len <= linkbuf_size)
			break;

		linkbuf_size = len;
		linkbuf = realloc(linkbuf, linkbuf_size);
		if (!linkbuf)
			goto err;
	}

	free(fdpath);
	return name_buf;

err:
	free(fdpath);
	free(name_buf);
	return NULL;
}

Fixes #7
Fixes #15
Signed-off-by: Aleksa Sarai cyphar@cyphar.com

@cyphar cyphar force-pushed the procfs-hardening branch 3 times, most recently from bb0c038 to 250bca5 Compare July 24, 2024 18:11
@cyphar cyphar force-pushed the procfs-hardening branch 10 times, most recently from 30f4811 to c3b8c39 Compare July 28, 2024 10:56
@cyphar cyphar marked this pull request as ready for review July 28, 2024 11:01
@cyphar
Copy link
Member Author

cyphar commented Jul 28, 2024

The new O_PATH resolver has extensive tests, as does ProcfsHandle (based on the filepath-securejoin test suite but more extensive).

image

We're only really missing tests for the helper Root::... methods. The C API is not included because it isn't called from Rust. In theory we could add test cases using the C API from Rust, but we can look at whether that makes sense once we have tests for everything.

cyphar added 10 commits July 29, 2024 02:18
This will make matching against Error types in our tests much simpler.
Maybe we will want to export this at some point.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
At the moment this interface is not entirely safe, so it's not exported
yet. It will gain hardening in future commits.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
To protect against basic bind-mount attacks, we can use statx to fetch
the mount id of the path we land on. If it is a bind-mount (even a
procfs bind-mount) we'll detect it.

This doesn't protect against:

 1. Path components that are bind-mounts (either tmpfs or of symlinks)
    that jump to a different place in the original procfs. Fixing this
    requires adding a specialised resolver that uses openat2 or a
    restricted subset of resolvers::opath.

 2. Mounts occurring on the host that race with open_follow.
    Unfortunately, there is no way to re-open a magic-link and so there
    is a race-window where a mount could be applied on top of the
    magic-link after we check the stx_mnt_id of the link but before we
    resolve it. The only real solution is to create a private procfs
    (with fsopen(2) or open_tree(2)) which cannot have racing mounts.

Both issues are fixed in future patches.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This logic is taken from ProcThreadSelf in runc and filepath-securejoin.
Some old RHEL systems have pre-3.17 kernels which require this
workaround.

Unfortunately, runc has to do /proc/... operations after joining a PID
namespace but while still using the host /proc, which results in errors
if we just try /proc/$pid/task/$tid (the TIDs are from our PID namespace
but the TIDs used by the procfs are based on the PID namespace used when
it was mounted, which means that host PID namespace in most cases). In
this case, the only other option is to just use /proc/self and hope
nothing breaks. We could panic, but users would probably then want to
use an API that does this fallback anyway (and there's no real way to
conclusively verify that /proc/self is "okay" to use).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
This is loosely based on the opath resolver, with support for using
openat2 if possible. The new procfs_beneath resolver currently only
supports doing O_PATH (and O_DIRECTORY) lookups. This is enough for our
needs within libpathrs but when we expose ProcfsHandle as a public API
we will need to support more modes. Unfortunately, we can't just do a
/proc/self/fd/... re-open because the resolver is used within the
ProcfsHandle implementation to do reopening.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
(This is adapted from filepath-securejoin.)

Because we depend on procfs to be correct when operating on other
filesystems, having a safe procfs handle is vital. Unfortunately, if we
are an administrative program running inside a container that can modify
its mount configuration, our current checks are not sufficient to avoid
being tricked into thinking a path is real.

Luckily, with fsopen() and open_tree() it is possible to create a
completely private procfs instance that an attacker cannot modify. Note
that while they are both safe, they are safe for different reasons:

 1. fsopen() is safe because the created mount is completely separate to
    any other procfs mount, so any changes to mounts on the host are
    irrelevant. fsopen() can fail if we trip the mnt_too_revealing()
    check, so we may have to fall back to open_tree() in some cases.

 2. open_tree() creates a clone of a snapshot of the mount tree (or just
    the top mount if can avoid using AT_RECURSIVE, but
    mnt_too_revealing() may force us to use AT_RECURSIVE). While the
    tree we clone might have been messed with by an attacker, after
    cloning there is no way for the attacker to affect our clone (even
    mount propagation won't propagate into a clone[1]).

    The only risk is whether there are over-mounts with AT_RECURSIVE.

    Because anonymous mounts don't show up in mountinfo, the best we can
    do is check the mount id through statx to see whether a target has
    an overmount (this is identical to the logic we already had, but at
    least it is now safe against races). openat2 is sufficient for
    non-magic-links but for magic-links we need to check this
    explicitly.

    listmounts(2) would let us detect any overmounts at creation time,
    but it's not clear whether you want to error out if there is a mount
    over a path you never use (lxcfs has overmounts of /proc/cpuinfo and
    /proc/meminfo, which we never use).

Unfortunately, we can only use these when running as a privileged user.
In theory we could create a user namespace to gain the necessary
privileges to create these handles, but this would require spawning a
proper subprocess (CLONE_NEWUSER must be done in a single-threaded
program) which won't always work when libpathrs is used as a library.
It's also far too complicated to deal with in practice.

[1]: This is true since at least Linux 5.12. See commit ee2e3f50629f
     ("mount: fix mounting of detached mounts onto targets that reside
     on shared mounts").

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Provide some nice APIs to do procfs operations in a safe way. In theory
we could just provide pathrs_proc_open(), but so many people will want
to do pathrs_proc_readlink() that it makes sense to just provide it (to
ensure users don't misuse readlinkat() and accidentally make the
operation unsafe).

Here are some toy examples, based on some of the usecases we have for
these APIs in runc.

1. In order to get a reference to /proc/self/exe we can use for doing a
   re-exec or doing memfd binary cloning, we can use pathrs_proc_open():

     /* Safely get an fd to /proc/self/exe. */
     int get_self_exe()
     {
         /* This follows the trailing magic-link! */
         int fd = pathrs_proc_open(PATHRS_PROC_SELF, "exe", O_PATH);
         if (fd < 0) {
             pathrs_error_t *error = pathrs_errorinfo(fd);
             /* ... print the error ... */
             pathrs_errorinfo_free(error);
             return -1;
         }
         return fd;
     }

2. When writing to AppArmor labels, we want to be absolutely sure there
   are no bind-mounts that would trick us into not actually writing the
   labels. The key risk is something like a bind-mount of
   /proc/self/sched (which is a procfs file that is no-op for all
   writes) or a bind-mount.

     int write_apparmor_label(char *label)
     {
         int fd, err, saved_errno = 0;

         fd = pathrs_proc_open(PATHRS_PROC_SELF, "attr/exec",
                               O_WRONLY|O_NOFOLLOW);
         if (fd < 0) {
             pathrs_error_t *error = pathrs_errorinfo(fd);
             /* ... print the error ... */
             pathrs_errorinfo_free(error);
             return -1;
         }

         err = write(fd, label, strlen(label));
         close(fd);
         return err;
     }

3. Sometimes we need to get the "real" path of a given file descriptor.
   This is fundamentally racy, and you would only really want to do this
   for debugging information, but it is something you need to do
   sometimes:

     char *get_unsafe_path(int fd)
     {
         char *fdpath;

         if (asprintf(&fdpath, "fd/%d", fd) < 0)
             return NULL;

         int linkbuf_size = 128;
         char *linkbuf = malloc(size);
         if (!linkbuf)
             goto err;
         for (;;) {
             int len = pathrs_proc_readlink(PATHRS_PROC_THREAD_SELF,
                                            fdpath, linkbuf, linkbuf_size);
             if (len < 0) {
                 pathrs_error_t *error = pathrs_errorinfo(fd);
                 /* ... print the error ... */
                 pathrs_errorinfo_free(error);
                 goto err;
             }

             if (len <= linkbuf_size)
                 break;

             linkbuf_size = len;
             linkbuf = realloc(linkbuf, linkbuf_size);
             if (!linkbuf)
                 goto err;
         }

         free(fdpath);
         return name_buf;

     err:
         free(fdpath);
         free(name_buf);
         return NULL;
     }

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
The main Go-specific thing we need to keep in mind when opening files
from /proc/thread-self is that we have to runtime.LockOSThread to avoid
hitting cases where we are migrated to a different thread and the
original thread dies while we're using the handle (for most procfs files
this results in spurious errors or blank files).

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
These tests are based on the filepath-securejoin tests, and notably
include tests of the key cases when operating on procfs.

Signed-off-by: Aleksa Sarai <cyphar@cyphar.com>
@cyphar cyphar merged commit 5d37a90 into openSUSE:main Jul 29, 2024
6 checks passed
@cyphar cyphar deleted the procfs-hardening branch July 29, 2024 01:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

safe procfs API /proc hardening
1 participant