[ecs/lambda/fargate] [request]: Allow unprivileged containers to create new user namespaces with clone(2) and unshare(2) #2102

Open
esamattis opened this issue Aug 3, 2023 · 2 comments
Labels
Proposed Community submitted issue

Comments


esamattis commented Aug 3, 2023

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Tell us about your request

I'd like to create a new unprivileged user namespace so that I can use clone(2) to create sandboxed processes, the way nsjail, bubblewrap, isolate, and even Chromium do.

Since Linux 3.8 it should be possible to create them without any extra permissions. From the CLONE_NEWUSER section of the clone(2) man page:

Before Linux 3.8, use of CLONE_NEWUSER required that the caller have three capabilities: CAP_SYS_ADMIN, CAP_SETUID, and CAP_SETGID. Starting with Linux 3.8, no privileges are needed to create a user namespace.

Linux 3.8 was released in 2013 so I think it is pretty safe to assume that AWS is running newer kernels ;)

But when I try to create a new user namespace with clone(2), it fails with EPERM. I tried this in an unprivileged ECS container and in a Lambda container. The same code ran fine on a local Linux installation when executed as non-root.
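
For reference, a minimal reproducer sketch of the failure described above. It is not the original code from this report: it calls unshare(2) rather than clone(2) (both are named in the issue title, and both should hit the same permission check), it goes through Python's ctypes instead of C, and the function name `try_new_user_namespace` is made up for illustration. It assumes a Linux host.

```python
import ctypes
import errno
import os

CLONE_NEWUSER = 0x10000000  # value from <linux/sched.h>


def try_new_user_namespace() -> int:
    """Return 0 if unshare(CLONE_NEWUSER) works in a child, else the errno.

    Returns -1 if the child died before reporting (for example, if a
    seccomp filter killed it).
    """
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = libc.unshare  # resolve the symbol before forking
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: do the unshare here so the parent's namespaces stay untouched.
        try:
            os.close(read_fd)
            rc = unshare(CLONE_NEWUSER)
            err = 0 if rc == 0 else ctypes.get_errno()
            os.write(write_fd, str(err).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        data = reader.read()
    os.waitpid(pid, 0)
    return int(data) if data else -1


if __name__ == "__main__":
    err = try_new_user_namespace()
    if err == 0:
        print("created a new user namespace")
    elif err > 0:
        name = errno.errorcode.get(err, str(err))
        print(f"unshare(CLONE_NEWUSER) failed: {name} ({os.strerror(err)})")
    else:
        print("child exited without reporting a result")
```

Run as a non-root user, this would be expected to print the success line on a typical local Linux installation and an EPERM failure in the environments described above.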

Which service(s) is this request for?

All container services: Lambda Containers, unprivileged ECS, Fargate etc.

Tell us about the problem you're trying to solve. What are you trying to do, and why is it hard?

To run sandboxed processes that have no network access, cannot see other processes' PIDs, and have limited filesystem visibility.
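
As a rough sketch, each of these goals corresponds to one namespace flag that would be combined with CLONE_NEWUSER in a clone(2) or unshare(2) call. The flag values are the ones from `<linux/sched.h>`; the `SANDBOX_GOALS` table and `sandbox_flags` helper are hypothetical names used only for illustration.

```python
# Namespace flags, values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000    # mount namespace
CLONE_NEWUSER = 0x10000000  # user namespace
CLONE_NEWPID = 0x20000000   # PID namespace
CLONE_NEWNET = 0x40000000   # network namespace

# Hypothetical mapping from the sandboxing goals above to namespace flags.
SANDBOX_GOALS = {
    # A fresh network namespace starts with only a loopback interface.
    "no network access": CLONE_NEWNET,
    # Processes in a new PID namespace cannot see PIDs outside it. With
    # unshare(2) only children of the caller enter the new PID namespace.
    "cannot see other PIDs": CLONE_NEWPID,
    # A new mount namespace starts as a copy of the parent's mounts; the
    # view still has to be narrowed with bind mounts or pivot_root(2).
    "limited filesystem visibility": CLONE_NEWNS,
}


def sandbox_flags() -> int:
    """Combine CLONE_NEWUSER with one flag per sandboxing goal."""
    flags = CLONE_NEWUSER
    for flag in SANDBOX_GOALS.values():
        flags |= flag
    return flags


if __name__ == "__main__":
    print(hex(sandbox_flags()))  # prints 0x70020000
```

CLONE_NEWUSER is what lets an unprivileged process request the other three; without it, each of them requires CAP_SYS_ADMIN.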

Are you currently working around this issue?

I think I need to use privileged ECS containers. Have not tried them yet.

Additional context

Normally, once a new user namespace has been created with CLONE_NEWUSER, it is possible to create bind mounts, use pivot_root(2), etc. without any extra permissions.
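
A sketch of what that looks like in practice, assuming a Linux host where unprivileged user namespaces are permitted: a child process unshares a user namespace together with a mount namespace, maps its own uid and gid to 0 inside it, and then bind-mounts one directory onto another. The helper names (`_attempt`, `bind_mount_in_new_userns`) are hypothetical, and mount(2) is reached through ctypes.

```python
import ctypes
import os
import tempfile

CLONE_NEWNS = 0x00020000    # new mount namespace
CLONE_NEWUSER = 0x10000000  # new user namespace
MS_BIND = 4096              # values from <sys/mount.h>
MS_REC = 16384

libc = ctypes.CDLL(None, use_errno=True)
libc.mount.argtypes = [ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p,
                       ctypes.c_ulong, ctypes.c_void_p]


def _attempt(src: str, dst: str) -> str:
    """Runs in the child: new user + mount namespace, id maps, bind mount."""
    uid, gid = os.getuid(), os.getgid()
    if libc.unshare(CLONE_NEWUSER | CLONE_NEWNS) != 0:
        return "unshare failed: " + os.strerror(ctypes.get_errno())
    try:
        # Map the caller to uid/gid 0 inside the namespace. An unprivileged
        # process has to write "deny" to setgroups before writing gid_map.
        for name, content in (("setgroups", "deny"),
                              ("uid_map", f"0 {uid} 1"),
                              ("gid_map", f"0 {gid} 1")):
            with open(f"/proc/self/{name}", "w") as f:
                f.write(content)
    except OSError as exc:
        return f"writing id maps failed: {exc}"
    if libc.mount(src.encode(), dst.encode(), None, MS_BIND | MS_REC, None) != 0:
        return "bind mount failed: " + os.strerror(ctypes.get_errno())
    return "ok"


def bind_mount_in_new_userns(src: str, dst: str) -> str:
    """Fork, run _attempt in the child, and return its status message."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        try:
            os.close(read_fd)
            os.write(write_fd, _attempt(src, dst).encode())
        finally:
            os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd) as reader:
        message = reader.read()
    os.waitpid(pid, 0)
    return message or "child exited without reporting"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as a, tempfile.TemporaryDirectory() as b:
        print(bind_mount_in_new_userns(a, b))
```

The bind mount exists only inside the child's mount namespace and disappears when the child exits. In an environment that blocks user namespace creation, the function would be expected to return the "unshare failed" message instead of "ok".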

This could also allow running rootless Docker or Podman without privileged containers.

There is a great article series on Linux namespaces on lwn.net: https://lwn.net/Articles/531114/

esamattis added the Proposed Community submitted issue label Aug 3, 2023

esamattis commented Aug 9, 2023

I think I need to use privileged ECS containers. Have not tried them yet.

Update: Yes, with privileged containers it is possible to create new user namespaces as a non-root (non-zero uid) user. But it's kinda unfortunate that, to add extra sandboxing, you first need to grant more permissions.


heri16 commented May 18, 2024

+1
