Deny CLONE_NEWUSER (restrict namespaces) #4939

Closed
rusty-snake opened this issue Feb 13, 2022 · 11 comments · Fixed by #5259
Labels
enhancement New feature request

Comments

@rusty-snake
Collaborator

Is your feature request related to a problem? Please describe.

N/A

Describe the solution you'd like

A command (e.g. nonewuser) that blocks calls to clone (and related syscalls such as unshare) when CLONE_NEWUSER is set.

Describe alternatives you've considered

N/A

Additional context

Flatpak does this, for example.

@rusty-snake rusty-snake added the enhancement New feature request label Feb 13, 2022
@rusty-snake
Collaborator Author

If anyone wants to play (on x86-64 systems!):

  • Download deny-clone-newuser.bpf.txt and strip the .txt extension (it is only there because it has to be added to files uploaded to GitHub).
  • Run bwrap --seccomp 4 --dev-bind / / /bin/bash 4<~/Downloads/deny-clone-newuser.bpf
  • Then try to run unshare --user in that shell.
source

Cargo.toml:

[package]
name = "deny_clone_newuser_test"
version = "0.1.0"
edition = "2021"

[dependencies]
libc = "0.2"
libseccomp = "0.2.2"

src/main.rs:

use libseccomp::{get_syscall_from_name, scmp_cmp, ScmpAction, ScmpArgCompare, ScmpFilterContext};
use std::io;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    type SyscallBlocklist = &'static [(&'static str, i32, &'static [ScmpArgCompare])];

    const EPERM: i32 = libc::EPERM;
    const ENOSYS: i32 = libc::ENOSYS;
    const CLONE_NEWUSER: u64 = libc::CLONE_NEWUSER as u64;

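    // NOTE: x86-64 only. `clone` takes its flags in arg0 there; the argument order
    // differs on some other architectures.
    // `clone3` passes its flags in a struct behind a pointer, which seccomp cannot
    // inspect, so it is denied with ENOSYS to make callers fall back to `clone`.
    #[rustfmt::skip]
    const DENY_CLONE_NEWUSER: SyscallBlocklist = &[
        ("clone", EPERM, &[scmp_cmp!($arg0 & CLONE_NEWUSER == CLONE_NEWUSER)]),
        ("clone3", ENOSYS, &[]),
        ("unshare", EPERM, &[scmp_cmp!($arg0 & CLONE_NEWUSER == CLONE_NEWUSER)]),
    ];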

    let mut ctx = ScmpFilterContext::new_filter(ScmpAction::Allow)?;

    for &(syscall, errno, comparators) in DENY_CLONE_NEWUSER {
        let syscall_nr = get_syscall_from_name(syscall, None)?;
        let action = ScmpAction::Errno(errno);

        ctx.add_rule_conditional(action, syscall_nr, comparators)?;
    }

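    // Uncomment to dump the filter as human-readable pseudo filter code (PFC) instead.
    //ctx.export_pfc(&mut io::stdout())?;
    // Write the compiled BPF program to stdout.
    ctx.export_bpf(&mut io::stdout())?;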

    Ok(())
}
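
The comparison used by the `clone` and `unshare` rules above can be modeled in plain Rust without libseccomp. This is only an illustrative sketch: `would_deny` is a hypothetical helper, not a libseccomp function, and the flag values are copied from the Linux UAPI header `<linux/sched.h>`.

```rust
// Flag values as defined in the Linux UAPI header <linux/sched.h>.
const CLONE_NEWNS: u64 = 0x0002_0000;
const CLONE_NEWUSER: u64 = 0x1000_0000;
const CLONE_NEWPID: u64 = 0x2000_0000;

/// Hypothetical helper mirroring `scmp_cmp!($arg0 & mask == mask)`:
/// the rule matches when every bit of `mask` is set in `arg0`,
/// no matter which other flag bits are present.
fn would_deny(arg0: u64, mask: u64) -> bool {
    (arg0 & mask) == mask
}
```

With this model, `would_deny(CLONE_NEWUSER | CLONE_NEWPID, CLONE_NEWUSER)` is true, while `would_deny(CLONE_NEWNS | CLONE_NEWPID, CLONE_NEWUSER)` is false, so combining CLONE_NEWUSER with other flags does not slip past the rule.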

@topimiettinen
Collaborator

Good idea. I'd suggest a more generic command like systemd's RestrictNamespaces= directive, which can block multiple namespaces (cgroup, ipc, net, mnt, pid, user and uts).
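
A generic command of that kind would first have to turn a list of namespace names into a flag mask. The following is a minimal sketch of that step, assuming a comma-separated list such as `user,net`; `parse_namespaces` and the accepted syntax are hypothetical and are neither Firejail's nor systemd's actual parser.

```rust
// Flag values as defined in the Linux UAPI header <linux/sched.h>.
const CLONE_NEWNS: u64 = 0x0002_0000;
const CLONE_NEWCGROUP: u64 = 0x0200_0000;
const CLONE_NEWUTS: u64 = 0x0400_0000;
const CLONE_NEWIPC: u64 = 0x0800_0000;
const CLONE_NEWUSER: u64 = 0x1000_0000;
const CLONE_NEWPID: u64 = 0x2000_0000;
const CLONE_NEWNET: u64 = 0x4000_0000;

/// Hypothetical parser: turns a list like "user,net" into a CLONE_NEW* bitmask.
/// Empty entries are ignored; an unknown name is returned as the error.
fn parse_namespaces(list: &str) -> Result<u64, String> {
    let mut mask = 0;
    for name in list.split(',').map(str::trim).filter(|n| !n.is_empty()) {
        mask |= match name {
            "cgroup" => CLONE_NEWCGROUP,
            "ipc" => CLONE_NEWIPC,
            "net" => CLONE_NEWNET,
            "mnt" => CLONE_NEWNS,
            "pid" => CLONE_NEWPID,
            "user" => CLONE_NEWUSER,
            "uts" => CLONE_NEWUTS,
            other => return Err(other.to_string()),
        };
    }
    Ok(mask)
}
```

Rejecting unknown names instead of silently skipping them is a deliberate choice in this sketch: a typo in a security option should fail loudly rather than leave a namespace unrestricted.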

@rusty-snake
Collaborator Author

Looking at https://github.com/systemd/systemd/blob/ee6fd6a50922d2b27c97084e1c3f9872d495c273/src/shared/seccomp-util.c#L1206 this boils down to:

if restrict_namespaces == ALL:
    # Block setns unconditionally because it is useless if all namespaces are disallowed.
    setns -> EPERM
else:
    # Otherwise block `arg1 == 0`, which has the special meaning 'setns to any namespace type'
    # and would allow bypassing this restriction.
    setns(_, 0) -> EPERM

for restricted_namespace in restricted_namespaces:
    # Block unshare and setns calls which try to unshare/setns a restricted namespace.
    unshare(restricted_namespace) -> EPERM
    setns(_, restricted_namespace) -> EPERM
    # Block clone calls which try to unshare a restricted namespace.
    # NOTE: The interface of `clone` is different on different architectures.
    clone(restricted_namespace, ...) -> EPERM

# Not in systemd's `seccomp_restrict_namespaces`, but it should be blocked too; see
# https://github.com/flatpak/flatpak/security/advisories/GHSA-67h7-w3jq-vh4q
# CVE-2021-41133
# https://github.com/flatpak/flatpak/commit/a10f52a7565c549612c92b8e736a6698a53db330
clone3 -> ENOSYS
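
The pseudocode above can be transcribed into a small rule generator. This is a sketch under stated assumptions, not systemd's or Firejail's code: `Cond`, `Rule`, and `restrict_namespaces_rules` are hypothetical names, the rules are only described as data (nothing is loaded into the kernel), and the `clone` argument index assumes x86-64.

```rust
// errno values on Linux (x86-64 and the generic ABI).
const EPERM: i32 = 1;
const ENOSYS: i32 = 38;

// Flag values as defined in the Linux UAPI header <linux/sched.h>.
const CLONE_NEWNS: u64 = 0x0002_0000;
const CLONE_NEWCGROUP: u64 = 0x0200_0000;
const CLONE_NEWUTS: u64 = 0x0400_0000;
const CLONE_NEWIPC: u64 = 0x0800_0000;
const CLONE_NEWUSER: u64 = 0x1000_0000;
const CLONE_NEWPID: u64 = 0x2000_0000;
const CLONE_NEWNET: u64 = 0x4000_0000;

const NAMESPACE_FLAGS: [u64; 7] = [
    CLONE_NEWCGROUP, CLONE_NEWIPC, CLONE_NEWNET, CLONE_NEWNS,
    CLONE_NEWPID, CLONE_NEWUSER, CLONE_NEWUTS,
];
const ALL_NAMESPACES: u64 = CLONE_NEWCGROUP | CLONE_NEWIPC | CLONE_NEWNET
    | CLONE_NEWNS | CLONE_NEWPID | CLONE_NEWUSER | CLONE_NEWUTS;

/// When a rule applies (hypothetical model, not a libseccomp type).
#[derive(Debug, Clone, Copy, PartialEq)]
enum Cond {
    Always,
    ArgEquals { arg: u32, value: u64 },
    ArgHasBits { arg: u32, mask: u64 },
}

/// One "deny this syscall with this errno" entry.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rule {
    syscall: &'static str,
    cond: Cond,
    errno: i32,
}

/// Builds the rule list for the namespaces whose bits are set in `restricted`.
fn restrict_namespaces_rules(restricted: u64) -> Vec<Rule> {
    let mut rules = Vec::new();

    // setns: deny outright if everything is restricted, else deny nstype == 0.
    let setns_cond = if (restricted & ALL_NAMESPACES) == ALL_NAMESPACES {
        Cond::Always
    } else {
        Cond::ArgEquals { arg: 1, value: 0 }
    };
    rules.push(Rule { syscall: "setns", cond: setns_cond, errno: EPERM });

    for &flag in NAMESPACE_FLAGS.iter().filter(|&&f| restricted & f != 0) {
        rules.push(Rule { syscall: "unshare", cond: Cond::ArgHasBits { arg: 0, mask: flag }, errno: EPERM });
        rules.push(Rule { syscall: "setns", cond: Cond::ArgHasBits { arg: 1, mask: flag }, errno: EPERM });
        // Assumption: x86-64, where the clone flags are in arg0.
        rules.push(Rule { syscall: "clone", cond: Cond::ArgHasBits { arg: 0, mask: flag }, errno: EPERM });
    }

    // clone3 cannot be filtered by flag, so it is denied as a whole.
    rules.push(Rule { syscall: "clone3", cond: Cond::Always, errno: ENOSYS });
    rules
}
```

Calling `restrict_namespaces_rules(CLONE_NEWUSER)` yields five entries: the `setns(_, 0)` guard, one rule each for `unshare`, `setns` and `clone`, and the unconditional `clone3` rule.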

@rusty-snake
Collaborator Author

So after CVE-2022-0185, here's the next one: CVE-2022-25636.

@rusty-snake
Collaborator Author

And the list continues with CVE-2022-1015.

@rusty-snake
Collaborator Author

CVE-2022-32250

Every month, the same thing. And I don't even track all of them.

@glitsj16
Collaborator

glitsj16 commented Jul 9, 2022

Just posting this here because it might be of interest:
https://blog.cloudflare.com/live-patch-security-vulnerabilities-with-ebpf-lsm/

@smitsohu
Collaborator

smitsohu commented Jul 14, 2022

Is anyone working on this one, or planning to?

If not I would be interested in taking it.

Maybe we can also set /proc/sys/user/max_{cgroup,ipc,mnt,net,pid,time,user,uts}_namespaces to zero if there is a noroot option...
These sysctls are namespaced and cannot be raised again inside the sandbox, because Firejail doesn't map root in the new user namespace, and also because /proc/sys is read-only. As checks happen in a different place in the kernel, I think it would increase the overall robustness.
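
The sysctl idea can be sketched as follows. This is only an illustration under assumptions: `namespace_limit_paths` and `set_namespace_limits` are hypothetical helpers, the directory is passed in as a parameter so the logic can be tried outside of /proc, and a real implementation would have to run at the point where the sandbox still holds the privileges needed to write these files.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// The namespace kinds from the brace expansion in the comment above.
const NAMESPACE_KINDS: [&str; 8] =
    ["cgroup", "ipc", "mnt", "net", "pid", "time", "user", "uts"];

/// Expands `max_{cgroup,...,uts}_namespaces` below `sysctl_dir`
/// (normally /proc/sys/user).
fn namespace_limit_paths(sysctl_dir: &Path) -> Vec<PathBuf> {
    NAMESPACE_KINDS
        .iter()
        .map(|kind| sysctl_dir.join(format!("max_{}_namespaces", kind)))
        .collect()
}

/// Writes `limit` to every limit file and stops at the first error.
fn set_namespace_limits(sysctl_dir: &Path, limit: u32) -> io::Result<()> {
    for path in namespace_limit_paths(sysctl_dir) {
        fs::write(&path, limit.to_string())?;
    }
    Ok(())
}
```

In a sandbox this would be called as `set_namespace_limits(Path::new("/proc/sys/user"), 0)`. Stopping at the first failed write is a choice made for this sketch, so that a partially applied limit is reported instead of going unnoticed.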

@smitsohu
Collaborator

> Maybe we can also set /proc/sys/user/max_{cgroup,ipc,mnt,net,pid,time,user,uts}_namespaces to zero if there is a noroot option...
> These sysctls are namespaced and cannot be raised again inside the sandbox, because Firejail doesn't map root in the new user namespace, and also because /proc/sys is read-only. As checks happen in a different place in the kernel, I think it would increase the overall robustness.

Or even better, unshare two user namespaces: The first user namespace only exists to impose limits on future namespace creation, by doing the equivalent of echo 1 > /proc/sys/user/max_user_namespaces. Then unshare a second time, and build the sandbox in that second user namespace.

This requires a non-privileged version of Firejail though, so we need the seccomp filter as well.

kmk3 added a commit that referenced this issue Aug 18, 2022
And fix a typo of "implemented".

Relates to #4939 #5259.
@kmk3 kmk3 changed the title Deny CLONE_NEWUSER Deny CLONE_NEWUSER (restrict namespaces) Aug 18, 2022
kmk3 added a commit that referenced this issue Aug 20, 2022
kmk3 added a commit that referenced this issue Dec 20, 2022
@rusty-snake
Collaborator Author

And more CVEs mitigated by this feature: CVE-2023-1281, CVE-2023-1829

Projects
Status: Done (on RELNOTES)

4 participants