Version 3: OpenShift compatibility? #427

Closed
narcoticfresh opened this issue Mar 18, 2022 · 7 comments

Comments

@narcoticfresh

What is the problem

So I'm forced to use OpenShift 4 and we used s6-overlay quite successfully on there in versions 2.* (there would be some startup warnings also, but it worked perfectly) - but after upgrading to 3.*, we get this startup error:

s6-chown: fatal: unable to chown /run: Operation not permitted
s6-overlay-suexec: fatal: child failed with exit code 111

The container then exits, nothing can be done, the battle to provide a home to the workload is lost..

So, obviously, I searched the issues here and found issue #309 - and the OpenShift use case is not covered there..

As OpenShift is quite a successful market offering, one should not really consider this an "edge case" - especially since version 2.x worked with it. So this is kinda also a BC break and/or regression.

What is different on OpenShift?

Well, on OpenShift (which is an "enterprise" Kubernetes distribution), containers are spun up with a random UID which is a member of the root group. Typically, those UIDs would be in a high range like this:

$ id
uid=1000780000(1000780000) gid=0(root) groups=0(root),1000780000

These users are not in /etc/passwd.

There is no possibility to execute anything under another UID (so no setuid possible).. I guess that's why s6-chown fails..
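
To see whether a running container matches this description, a minimal sketch along these lines can help. The function names `uid_in_passwd` and `in_root_group` are made up for illustration and are not part of s6-overlay or OpenShift; the sketch only assumes a POSIX shell with `awk` and `id`.

```shell
# Succeed if a numeric UID has an entry in a passwd-format file.
# Arguments: uid, path to the passwd file (defaults to /etc/passwd).
uid_in_passwd() {
    awk -F: -v uid="$1" '$3 == uid { found = 1 } END { exit !found }' "${2:-/etc/passwd}"
}

# Succeed if a space-separated GID list (as printed by `id -G`) contains GID 0.
in_root_group() {
    for gid in $1; do
        [ "$gid" = "0" ] && return 0
    done
    return 1
}

if uid_in_passwd "$(id -u)"; then
    echo "uid $(id -u) has a passwd entry"
else
    echo "uid $(id -u) has no passwd entry"
fi

if in_root_group "$(id -G)"; then
    echo "member of gid 0"
else
    echo "not a member of gid 0"
fi
```

On a pod like the one described above, one would expect the "no passwd entry" and "member of gid 0" branches to be taken.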

What is the issue?

So my primary question is this:

  • Why did version 2.x work - and why does version 3.x not work? I tried with version v3.1.0.1.

Normal remedies

So admittedly, these OpenShift constraints are annoying to say the least...

In order to deal with them, one would normally (at build time) chgrp root and chmod g+rw the relevant directories that need to be writable.. that worked on all containers I encountered so far (even picky ones like nginx)..
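
As a rough sketch of that build-time remedy: the helper name `make_group_writable` and the path `/var/lib/myapp` are hypothetical, and `g+rwX` is used instead of plain `g+rw` so that directories also stay traversable for the group, which is an addition to what is described above.

```shell
# Make a directory tree group-owned and group-writable, as one would do at
# image build time. Arguments: group (name or GID), directory.
make_group_writable() {
    chgrp -R "$1" "$2" && chmod -R g+rwX "$2"
}

# Build-time usage (needs root, e.g. inside a Dockerfile RUN step):
#   make_group_writable 0 /var/lib/myapp
```

Because the random UID is a member of GID 0, group permissions are the only ones that can be prepared in advance when the UID itself is unknown.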

But I had no luck just chowning /run to root - s6-chown will still try to do its thing and then crash - which crashes the whole container..

What can be done?

Now as I read the discussion in #309, this is quite a fundamental topic.. I'm not looking for any big changes here.. I'm just looking for a workaround or a flag or whatever to leave permissions to build time, not enforcing anything regarding chowns and chmods at runtime..

Thanks

Thanks for any help/suggestions.. and thanks for this project!

@skarnet
Contributor

skarnet commented Mar 18, 2022

To answer your primary question: since v3 is a complete rewrite from v2, with a different approach (in order to solve problems that could never be solved with the v2 approach), it is bound to encounter different issues.

But regarding uids, everything should work transparently here; so if it doesn't, it means OpenShift is doing something different that's not taken into account. And given the previous reports we've gotten about USER containers and --userns host containers and one or two other ways to run containers that all compete for the Most Idiotic Security Theater Feature award, I can assure you that I've made uid management as generic as possible; so if OpenShift manages to get itself into a situation that breaks it, it must be doing something especially, aggressively, hatefully stupid.

(That, or there's a bug in s6-overlay-suexec, which we will quickly determine.)

Running with a random UID is indeed pretty stupid (it doesn't improve the security/inconvenience ratio over a fixed nonzero UID), but not the level of stupid that would break s6-overlay. To get the results you have gotten, you need much more.

You're saying there's no possibility of running anything under a different UID. Does it mean that the suid feature is disabled in the kernel? With s bits still showing up as such in the filesystem? Because doing that, i.e. changing the kernel semantics without telling the userspace, would be stupid enough. You know, like, running programs on a computer that does not follow the specifications it pretends to have.

If it is the case, you can work around that by disabling the suid bit on s6-overlay-suexec. Add to your container installation file, after unpacking all the tarballs: (I'm using the Dockerfile syntax here, convert to the OpenShift way of doing things)
RUN chmod u-s /package/admin/s6-overlay-helpers/command/s6-overlay-suexec
and see if it fixes the problem.
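
Before and after that change, it can be useful to check two things from inside the container: whether the binary still carries the suid bit, and whether the process runs with the kernel's `no_new_privs` flag, under which suid bits are not honored on exec. The sketch below is only a diagnostic aid. The function names are made up, the `NoNewPrivs` field needs Linux 4.10 or later, and whether OpenShift actually sets this flag is an assumption that nothing in this thread confirms.

```shell
# Succeed if the file has its set-user-ID bit set.
has_suid_bit() {
    [ -u "$1" ]
}

# Succeed if a /proc/<pid>/status-format file reports "NoNewPrivs: 1".
# Argument: path to the status file (defaults to /proc/self/status).
no_new_privs_set() {
    awk '$1 == "NoNewPrivs:" { v = $2 } END { exit !(v == 1) }' "${1:-/proc/self/status}"
}

suexec=/package/admin/s6-overlay-helpers/command/s6-overlay-suexec
if has_suid_bit "$suexec"; then
    echo "suid bit is set on $suexec"
else
    echo "suid bit is not set on $suexec"
fi

if no_new_privs_set; then
    echo "no_new_privs is on: suid binaries will not gain privileges"
fi
```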

@narcoticfresh
Author

@skarnet

...to run containers that all compete for the Most Idiotic Security Theater Feature award

Running with a random UID is indeed pretty stupid (it doesn't improve the security/inconvenience ratio over a fixed nonzero UID), but not the level of stupid that would break s6-overlay.

I couldn't agree more with those sentiments.. What's more, I'm unable to reproduce the issue with docker --user <> --group-add=0 - there it indeed all works as intended.. but as those "enterprise" solutions are blackbox-y (which is the drawback of so-called "enterprise" cr*p), it's hard to say what really happens..

Back to the issue at hand..

So I removed the suid bit (chmod u-s) from that binary, and now I get

[user@host DEV ~]$ oc logs cont-775cd78bcf-dbzvn
s6-overlay-suexec: warning: unable to gain root privileges (is the suid bit set?)
s6-linux-init: fatal: unable to copy /run/s6/basedir/run-image to /run: Operation not permitted

which is strange, as /run is writable to everybody (I was desperate so I just tried this):

I have no name!@99f109bfd75c:/var/www$ ls -hals /   
total 400K
   0 drwxr-xr-x.   1 root     root  224 Mar 21 07:10 .
   0 drwxr-xr-x.   1 root     root  224 Mar 21 07:10 ..
   0 -rwxr-xr-x.   1 root     root    0 Mar 21 07:10 .dockerenv
[...]
   0 drwxrwxrwx.   1 www-data root   16 Mar 16 01:00 run

Is the error output maybe not complete, is it not referring to the root /run? Does some path prefix apply?
Maybe I missed the documentation, is there any description what is copied to where?

@skarnet
Contributor

skarnet commented Mar 21, 2022

So, /run should definitely not be writable by everybody - but it should be owned by the user the container is running as.

  • Normally this is ensured by the container manager pre-mounting /run; most container managers know that /run should be owned by the user doing administration stuff. (And that it is traditionally a tmpfs, but that's likely irrelevant here.)
  • For cases where /run is not pre-mounted or correctly chowned, s6-overlay does the /run creation and chowning itself. (Which is why you got a s6-chown error earlier.)

The goal is to always have a working /run with the correct permissions after s6-overlay's preinit script is done. (And then the real init can proceed.) But this is why s6-overlay-suexec needs a suid bit: in order to chown /run, the preinit script needs root privileges.

Failing that, /run will remain as it was given to s6-overlay. This is okay if /run has been pre-mounted with the correct uid, which can be done automatically (in USER Docker containers, for instance) or manually at container configuration time when you have a fixed uid.

But when the container is run with a random uid, the correct uid for /run will change from one invocation to the next, and without root privileges, s6-overlay will be unable to do anything about it. And that's why you get that "operation not permitted" message.

So your only solution here is to tell OpenShift to always pre-mount /run, with mode 0755, owned by the uid of the container. Because OpenShift is the only entity that both knows what the correct uid is and can act on it.
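
Whether a given container already meets that requirement can be checked with a small sketch like the following. The function name `run_dir_ok` is made up, and the sketch relies on `stat -c`, which GNU coreutils and BusyBox provide.

```shell
# Succeed if the directory is owned by the given uid and has mode 755.
# Arguments: directory, expected uid.
run_dir_ok() {
    [ "$(stat -c '%u %a' "$1" 2>/dev/null)" = "$2 755" ]
}

if run_dir_ok /run "$(id -u)"; then
    echo "/run is owned by uid $(id -u) with mode 0755"
else
    echo "/run is not set up for uid $(id -u)"
fi
```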

@narcoticfresh
Author

narcoticfresh commented Mar 21, 2022

@skarnet

So, /run should definitely not be writable by everybody

ok - but it still should work as intended then, right? strange that the copy fails..?

so i've checked another running pod (one with no s6-overlay); the perms are like this

1000780000@pod-d95bd7cd9-rbdr2:/$ ls -hals /
total 53M
[...]
   0 drwxr-xr-x.    1 root     root   21 Mar 21 08:12 run

So: /run is 0:0 and no w for g, so not writable by the container user.. it's also not mounted (just used for the kubernetes secrets)

1000780000@pod-d95bd7cd9-rbdr2:/$ mount | grep run
tmpfs on /run/secrets type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /run/secrets/kubernetes.io/serviceaccount type tmpfs (ro,relatime,seclabel)
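
The same "is /run itself a mount point?" question can be answered without eyeballing the `mount` output. This is a minimal sketch with a made-up function name; it assumes the usual layout of `/proc/self/mounts`, where the second field is the mount point, and it does not handle paths containing escaped characters.

```shell
# Succeed if the path appears as a mount point in a mounts-format file.
# Arguments: path, mounts file (defaults to /proc/self/mounts).
is_mount_point() {
    awk -v p="$1" '$2 == p { found = 1 } END { exit !found }' "${2:-/proc/self/mounts}"
}

if is_mount_point /run; then
    echo "/run is a mount point"
else
    echo "/run is not a mount point"
fi
```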

Comparison with v2

With s6-overlay 2.* I got these startup warnings:

[s6-init] making user provided files available at /var/run/s6/etc...exited 0.
[s6-init] ensuring user provided files have correct perms...s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/00-wait: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/10-php-fpm-vars: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/90-backend-init: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/cont-init.d/91-cron-jobs: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/cont-init.d/00-wait: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/cont-init.d/10-php-fpm-vars: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/cont-init.d/91-cron-jobs: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/cont-init.d/90-backend-init: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/cron/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/php-fpm/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/php-fpm-exporter/run: Operation not permitted
s6-chown: fatal: unable to chown /var/run/s6/etc/services.d/php-fpm-log/run: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/services.d/cron/run: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/services.d/php-fpm/run: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/services.d/php-fpm-log/run: Operation not permitted
s6-chmod: fatal: unable to change mode of /var/run/s6/etc/services.d/php-fpm-exporter/run: Operation not permitted

The funny thing is that /var/run is a symlink to /run - and indeed with version 2, the folder /run/s6 is chowned to the container user:

1000780000@pod-768dd5bc7f-lb4nj:/var$ ls -l /run/
total 0
drwxrwxrwt. 2 root       root   6 Mar 16 01:00 lock
drwxr-xr-x. 8 1000780000 root 116 Mar 21 11:27 s6
drwxr-xr-x. 4 root       root  80 Mar 21 11:27 secrets

I cannot really say what changed in that process that it is no longer possible with v3 ;-/

@skarnet
Contributor

skarnet commented Mar 21, 2022

ok - but it still should work as intended then, right? strange that the copy fails..?

s6-linux-init wants /run to be an exact copy of its run-image data, including permissions (that's important). When it's not the owner of /run, it can write the data into it if it's world-writable (which it should not be), but it can't chmod it, that's why it fails.
There is no workaround to ensuring /run has the proper owner, really.

I cannot really say what changed in that process that it is no longer possible with v3 ;-/

It didn't work with v2 either, it was just more subtly broken and you didn't notice it. /run must not be world-writable, even with a t bit set; if it is, then normal users are able to DoS the system, and you don't want that.
v3 is more demanding with its setup, but it's also more resilient and secure.

OpenShift should allow you to pre-mount /run (as a tmpfs or not) owned by the uid it's running the container as. Failing that, it should enable suid binaries. Failing that as well, I'm afraid it's just unusable.

@narcoticfresh
Author

@skarnet
I'm closing this as I have no more input to provide.. may this issue serve other users as a reference in the future through the power of big tech indexing ad-serving solutions (aka search engines)

@skarnet
Contributor

skarnet commented Mar 26, 2024

I may finally have a solution for this. I need to implement it, then test it. Please be even more patient than you have been.

3 participants
@narcoticfresh @skarnet and others