From 3817948664388601790184a9541e7d5bcf079fb0 Mon Sep 17 00:00:00 2001 From: Sergio Oller Date: Tue, 25 Apr 2023 13:27:40 +0200 Subject: [PATCH 1/3] Add use rootless page --- _quarto.yml | 1 + use/rootless.md | 348 ++++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 349 insertions(+) create mode 100644 use/rootless.md diff --git a/_quarto.yml b/_quarto.yml index 6c0db0b..e1f4df2 100644 --- a/_quarto.yml +++ b/_quarto.yml @@ -74,6 +74,7 @@ website: - use/editor.md - use/networking.md - use/shared_volumes.md + - use/rootless.md - use/reproducibility.md - use/singularity.md diff --git a/use/rootless.md b/use/rootless.md new file mode 100644 index 0000000..bae4c69 --- /dev/null +++ b/use/rootless.md @@ -0,0 +1,348 @@ +--- +title: "Running Rootless" +description: Rootless containers and rocker +--- + +## Rootless containers, security and root + +Docker traditionally ran as the `root` user. Users who wanted to run docker +containers needed to be given `sudo` access and use `sudo docker`, or be added +to the `docker` group, so they could run docker without typing `sudo` first. In +both cases, they were running docker with root privileges. + +This is considered a bad security practice because it effectively grants root +host privileges to all docker users. However, namespaces and control groups were +not as mature when docker started as they are now, and no better alternative +was available. But we have an alternative now. Docker offers the possibility to +run in [rootless](https://docs.docker.com/engine/security/rootless/) mode and +[podman](https://podman.io/) runs rootless by design. + +Running a container rootless does not mean that the container does not have +any root-like capabilities; it means that the container engine does not run +as root. + +**For most rocker-related projects, running rootless is a security advantage.** + + +### Who are we?
+ +At the host: + +``` +$ whoami +sergio +``` + +In the container: + + +``` +$ podman run --rm docker.io/rocker/rstudio whoami +root +``` + +### Using apt-get inside a rootless container + +It is perfectly possible to run `apt-get` commands on a +rootless container, because it just modifies files inside the container. + +At the host: + +``` +$ apt-get update +Reading package lists... Done +E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied) +``` + +In the container: + +``` +$ podman run --rm docker.io/rocker/rstudio apt-get update +Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB] +... +Fetched 26.8 MB in 6s (4,750 kB/s) +Reading package lists... +``` + +### Modifying files + +You can bind mount the `/etc/` directory (e.g. using `-v /etc:/hostetc`) but you won't +be able to modify most of its files, since you are not allowed to do that +when you are outside the container. + +At the host: + +``` +$ touch /etc/try-creating-a-file +touch: cannot touch '/etc/try-creating-a-file': Permission denied +``` + +In the container: *Rootless means no additional host permissions* + +``` +$ podman run --rm -v /etc/:/hostetc docker.io/rocker/rstudio \ + touch /hostetc/try-creating-a-file +touch: cannot touch '/hostetc/try-creating-a-file': Permission denied +``` + +However, you can modify the files *within* the container: + +``` +$ podman run --rm docker.io/rocker/rstudio touch /etc/try-creating-a-file +``` + +### Port binding + +You can't bind your container to host ports lower than 1024, +since those are reserved to root (or to be precise reserved to processes with +`CAP_NET_BIND_SERVICE` capability set). 
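As a rough sketch of that rule, the check below compares a port against the kernel's unprivileged-port threshold. It assumes a Linux host that exposes `net.ipv4.ip_unprivileged_port_start` under `/proc/sys`; the helper name `can_bind_unprivileged` is hypothetical and is not part of podman or docker.

```sh
# Hypothetical helper (not a podman command): succeed if PORT can be
# bound by a process without root or CAP_NET_BIND_SERVICE.
# The optional second argument overrides the threshold; by default it is
# read from the kernel, falling back to 1024 if the file is unavailable.
can_bind_unprivileged() {
  port="$1"
  start="${2:-$(cat /proc/sys/net/ipv4/ip_unprivileged_port_start 2>/dev/null || echo 1024)}"
  [ "$port" -ge "$start" ]
}

# With the usual threshold of 1024:
can_bind_unprivileged 8787 1024 && echo "8787 can be published rootless"
can_bind_unprivileged 80 1024 || echo "80 needs extra privileges"
```

Calling the helper with a single argument uses whatever threshold the host is currently configured with, which may differ from 1024.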
+ + +``` +$ podman run --rm -p 80:8787 docker.io/rocker/rstudio +Error: rootlessport cannot expose privileged port 80, you can add +'net.ipv4.ip_unprivileged_port_start=80' to /etc/sysctl.conf (currently 1024), +or choose a larger port number (>= 1024): +listen tcp 0.0.0.0:80: bind: permission denied +``` + +However larger port numbers work perfectly fine: + +``` +$ podman run --rm -p 8787:8787 docker.io/rocker/rstudio +``` + +## Rootless containers and file permissions + +If you have a bit of experience with containers you have probably suffered +from "permission issues". + +The typical issue with permissions is that you mount a directory into the +container, and the processes in the container write files in that directory +with a user id different than yours (usually root). Once you are out of the +container you can't access or modify those files. + +### How users work in rootless containers + +With rootless containers, even if you are only one user, your container +has to behave (read and write files...) as if there were many users. There is +no way to magically do this, so the host operating system actually gives you +many "subordinate user ids" and "subordinate group ids" for you to use as you +wish. *How many?* Usually around 65k user ids and 65k group ids. When you use +a rootless container you may be impersonating up to 65k users! Since it would +be a very bad idea to impersonate other users on your computer (impersonating +root would be the most dangerous) the system administrator gives you unassigned +user ids that do not overlap with anyone else. The list of subordinate user and +group ids assigned to each user is stored in `/etc/subuid` and `/etc/subgid` +files. + +``` +$ cat /etc/subuid +sergio:100000:65536 +ana:165536:65536 +``` + +This file is read as follows: + +- The user `sergio` is assigned 65536 additional subordinate user IDs starting at 100000. + This spans the range 100000-165535.
+- The user `ana` is assigned 65536 additional subordinate user IDs starting at 165536. + This spans the range 165536-231071. + +When you start a container, the user and group ids used by the image should be +mapped to the host. The default user mapping in podman maps the 0 container uid +(corresponding to the container root user) to your real user id in the host, +and all your subordinate user IDs are mapped to user ids `1:n` in the container. +The same applies to group id mapping. + +### Working alone + +In the container, you can use user ids without issues (e.g. you can be root). + +If you bind mount a directory that you own: + +- If you create a file as the root user in the container, outside of it the + file will be owned by you. +- If you create a file as the container UID 1000, outside of the container it will + appear to be owned by one of your subordinate IDs (e.g. 100999) + +What about mounting directories that you DO NOT own? + +- The files and directories that you do not own belong to host UIDs that are + not mapped into the container, so when the container asks for their UID + the operating system returns the "overflow user id", which is 65534 by + default, and they are usually listed as owned by `nobody` or `nogroup`. + + +### Sharing data with others + +If you usually work with a directory shared with other users, it is possible +that the shared directory belongs to a group you all belong to. + +There are several possible solutions. Here we describe two of them that we can +use in `rocker`. + +#### Set groups in the running process `--group-add keep-groups` + +:::{.callout-important} + +Adding `--group-add keep-groups` to `podman run` works when running an R session +or an R script, but not when logging in from the rstudio server website. + +See below for an alternative + +::: + + +By belonging to a group you may have permissions to do things (e.g. write to +your shared directory).
The ones who actually *do* things are processes that +you start and you own. Your processes usually inherit your user id and your +groups, and based on those groups they are authorized to do things. + +When `podman run` starts the initial process in the container the process running +there will typically have the root uid and the root gid inside the container, +which map to your own UID and GID. There are reasons for not inheriting +all your extra group ids: + +- The other group IDs are not mapped inside the container, so they are of little + use there. +- The other group IDs may give permissions to do things in the host that the + container should never be able to do (e.g. access some particular device). + +However, `podman run` accepts `--group-add keep-groups`. When that option is +enabled, `podman` starts the initial process in the container. That process will +have your GID (mapped to the root GID in the container) and all your other extra +groups, unmapped in the container. + +On the host: + +``` +$id +uid=1000(sergio) gid=1000(sergio) groups=1000(sergio),4(adm),27(sudo),109(lpadmin),124(sambashare) +``` + +On the container: + +``` +$podman run --rm rocker/rstudio id +uid=0(root) gid=0(root) groups=0(root) +``` + +Keeping groups: + +``` +$ podman run --rm --group-add keep-groups rocker/rstudio id +uid=0(root) gid=0(root) groups=0(root),65534(nogroup) +``` + +Note how when keeping groups all the unmapped groups are grouped into 65534 (nogroup). + + +Even if the container process can't see those groups, when the process tries to read or +write a file it has the groups set, so it actually has the permissions to work. + +On web applications such as RStudio, where the user logs in through the web browser, +the process with the R session is not started directly by podman, but instead it is +started by RStudio server when the user logs in. 
+ +In this scenario, the started process does not inherit the groups from the host, +and can't write files into your shared directories. + +:::{.callout-tip} + + +To run R code or an R script using rocker accessing a shared directory, you +can use `--group-add keep-groups`. + +``` +podman run -ti --rm -v /shared_dir:/shared_dir \ + --group-add keep-groups rocker/rstudio R +``` + +However you won't be able to access that directory if you try to login from the web browser. + +::: + + +#### Ask the system administrator to subordinate the group + +:::{.callout-important} + +This solution is complicated. There is a discussion open at +[podman#18333](https://github.com/containers/podman/issues/18333) to attempt to simplify it. + +::: + +Let's assume here that the shared directory belongs to the GID 2000. + +Your system administrator can subordinate to you and your colleagues that GID, so you can use it: + +``` +$ cat /etc/subgid +sergio:100000:65536 +ana:165536:65536 + +sergio:2000:1 +ana:2000:1 +``` + +Now `sergio` and `ana` can use the GID 2000 (note the /etc/sub**g**id). + +You will have to map your group host ID into the container so the container +can access it. There are two caveats: + +- When providing a custom mapping you need to provide a complete `uidmap` + and a complete `gidmap`. + +- When providing either of those two mappings in rootless `podman`, instead of + mapping to the container from the host, we map to the container from podman's + intermediate mapping. + + +The first caveat will require us to provide some default identity mappings. +The second caveat will require us to find out what's the intermediate podman +mapping, so we know the intermediate group ID of our host 2000 gid. + +The intermediate group mapping is found with the following command: + +``` +$ podman unshare cat /proc/self/gid_map + 0 1000 1 + 1 2000 1 + 2 100000 65536 +``` + +The table shows that gid 2000 in the host (middle column) is mapped to +intermediate gid 1 (left column). 
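The lookup described above can be sketched in shell. This is a minimal illustration, assuming the three-column format shown in the output (intermediate ID, host ID, range length); `intermediate_id` is a hypothetical helper name, not a podman command.

```sh
# Hypothetical helper: read a gid_map (or uid_map) table on stdin and
# print the intermediate ID that corresponds to the given host ID.
# Exits with status 1 if the host ID is not covered by any range.
intermediate_id() {
  awk -v host="$1" '
    host + 0 >= $2 + 0 && host + 0 < $2 + $3 {
      print $1 + (host - $2)   # offset inside the matching range
      found = 1
      exit
    }
    END { if (!found) exit 1 }
  '
}

# Using the sample table from this page instead of a live system:
printf '0 1000 1\n1 2000 1\n2 100000 65536\n' | intermediate_id 2000
# prints 1
```

On a real host the same helper could be fed from `podman unshare cat /proc/self/gid_map | intermediate_id 2000`, which should agree with reading the table by eye.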
+ +We will map + +| Type | Container ID | Intermediate ID | Reason | +| ----- | ------------ | --------------- | ------------------------------------------------------------------------------------ | +| User | 0 - 65534 | 0 - 65534 | Identity mapping (no change needed in user mapping besides the default one) | +| Group | 0 | 0 | Identity mapping (our main host GID 1000, was mapped to intermediate 0 by default) | +| Group | 1 - 65534 | 2 - 65535 | We skip intermediate GID 1, so 1->2, 2->3... | +| Group | 100000 | 1 | We map container GID 100000 to intermediate GID 1, that we saw matches host GID 2000 | + + +The `--uidmap` and `--gidmap` options in rootless podman map those intermediate +uids/gids to container ids: + +``` +$ podman run \ + --rm \ + -v /shared_dir:/shared_dir \ + --uidmap "0:0:65535" \ + --gidmap "0:0:1" \ + --gidmap "1:2:65535" \ + --gidmap "100000:1:1" \ + --group-add keep-groups \ + rocker/rstudio +``` + +With all that set: + +- Our rocker image will be able to obtain a container group id for the host gid 2000 +- It will add the root user to that group in the container's `/etc/group` file +- **When you log in from the rstudio website, you will have access to the shared directory.** + From c3e8dcab0c7c53e36c095883b7c0ed56ce4b735b Mon Sep 17 00:00:00 2001 From: Sergio Oller Date: Tue, 25 Apr 2023 14:22:40 +0200 Subject: [PATCH 2/3] Fix code linting --- use/rootless.md | 138 ++++++++++++++++++++++++------------------------ 1 file changed, 69 insertions(+), 69 deletions(-) diff --git a/use/rootless.md b/use/rootless.md index bae4c69..decfcde 100644 --- a/use/rootless.md +++ b/use/rootless.md @@ -28,17 +28,17 @@ as root.
At the host: -``` -$ whoami -sergio +```{.sh} +whoami +# sergio ``` In the container: -``` -$ podman run --rm docker.io/rocker/rstudio whoami -root +```{.sh} +podman run --rm docker.io/rocker/rstudio whoami +# root ``` ### Using apt-get inside a rootless container @@ -48,20 +48,20 @@ rootless container, because it just modifies files inside the container. At the host: -``` -$ apt-get update -Reading package lists... Done -E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied) +```{.sh} +apt-get update +# Reading package lists... Done +# E: Could not open lock file /var/lib/apt/lists/lock - open (13: Permission denied) ``` In the container: -``` -$ podman run --rm docker.io/rocker/rstudio apt-get update -Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB] -... -Fetched 26.8 MB in 6s (4,750 kB/s) -Reading package lists... +```{.sh} +podman run --rm docker.io/rocker/rstudio apt-get update +# Get:1 http://security.ubuntu.com/ubuntu jammy-security InRelease [110 kB] +# ... +# Fetched 26.8 MB in 6s (4,750 kB/s) +# Reading package lists... ``` ### Modifying files @@ -72,23 +72,23 @@ when you are outside the container. 
At the host: -``` -$ touch /etc/try-creating-a-file -touch: cannot touch '/etc/try-creating-a-file': Permission denied +```{.sh} +touch /etc/try-creating-a-file +# touch: cannot touch '/etc/try-creating-a-file': Permission denied ``` In the container: *Rootless means no additional host permissions* -``` -$ podman run --rm -v /etc/:/hostetc docker.io/rocker/rstudio \ - touch /hostetc/try-creating-a-file -touch: cannot touch '/hostetc/try-creating-a-file': Permission denied +```{.sh} +podman run --rm -v /etc/:/hostetc docker.io/rocker/rstudio \ + touch /hostetc/try-creating-a-file +# touch: cannot touch '/hostetc/try-creating-a-file': Permission denied ``` However, you can modify the files *within* the container: -``` -$ podman run --rm docker.io/rocker/rstudio touch /etc/try-creating-a-file +```{.sh} +podman run --rm docker.io/rocker/rstudio touch /etc/try-creating-a-file ``` ### Port binding @@ -98,18 +98,18 @@ since those are reserved to root (or to be precise reserved to processes with `CAP_NET_BIND_SERVICE` capability set). -``` -$ podman run --rm -p 80:8787 docker.io/rocker/rstudio -Error: rootlessport cannot expose privileged port 80, you can add -'net.ipv4.ip_unprivileged_port_start=80' to /etc/sysctl.conf (currently 1024), -or choose a larger port number (>= 1024): -listen tcp 0.0.0.0:80: bind: permission denied +```{.sh} +podman run --rm -p 80:8787 docker.io/rocker/rstudio +# Error: rootlessport cannot expose privileged port 80, you can add +# 'net.ipv4.ip_unprivileged_port_start=80' to /etc/sysctl.conf (currently 1024), +# or choose a larger port number (>= 1024): +# listen tcp 0.0.0.0:80: bind: permission denied ``` However larger port numbers work perfectly fine: -``` -$ podman run --rm -p 8787:8787 docker.io/rocker/rstudio +```{.sh} +podman run --rm -p 8787:8787 docker.io/rocker/rstudio ``` ## Rootless containers and file permissions @@ -136,10 +136,10 @@ user ids that do not overlap with anyone else. 
The list of subordinate user and group ids assigned to each user is stored in `/etc/subuid` and `/etc/subgid` files. -``` -$ cat /etc/subuid -sergio:100000:65536 -ana:165536:65536 +```{.sh} +cat /etc/subuid +# sergio:100000:65536 +# ana:165536:65536 ``` This file is read as follows: @@ -216,23 +216,23 @@ groups, unmapped in the container. On the host: -``` -$id -uid=1000(sergio) gid=1000(sergio) groups=1000(sergio),4(adm),27(sudo),109(lpadmin),124(sambashare) +```{.sh} +id +# uid=1000(sergio) gid=1000(sergio) groups=1000(sergio),4(adm),27(sudo),109(lpadmin),124(sambashare) ``` On the container: -``` -$podman run --rm rocker/rstudio id -uid=0(root) gid=0(root) groups=0(root) +```{.sh} +podman run --rm rocker/rstudio id +# uid=0(root) gid=0(root) groups=0(root) ``` Keeping groups: -``` -$ podman run --rm --group-add keep-groups rocker/rstudio id -uid=0(root) gid=0(root) groups=0(root),65534(nogroup) +```{.sh} +podman run --rm --group-add keep-groups rocker/rstudio id +# uid=0(root) gid=0(root) groups=0(root),65534(nogroup) ``` Note how when keeping groups all the unmapped groups are grouped into 65534 (nogroup). @@ -254,7 +254,7 @@ and can't write files into your shared directories. To run R code or an R script using rocker accessing a shared directory, you can use `--group-add keep-groups`. -``` +```{.sh} podman run -ti --rm -v /shared_dir:/shared_dir \ --group-add keep-groups rocker/rstudio R ``` @@ -277,13 +277,13 @@ Let's assume here that the shared directory belongs to the GID 2000. Your system administrator can subordinate to you and your colleagues that GID, so you can use it: -``` -$ cat /etc/subgid -sergio:100000:65536 -ana:165536:65536 - -sergio:2000:1 -ana:2000:1 +```{.sh} +cat /etc/subgid +# sergio:100000:65536 +# ana:165536:65536 +# +# sergio:2000:1 +# ana:2000:1 ``` Now `sergio` and `ana` can use the GID 2000 (note the /etc/sub**g**id). @@ -305,11 +305,11 @@ mapping, so we know the intermediate group ID of our host 2000 gid. 
The intermediate group mapping is found with the following command: -``` -$ podman unshare cat /proc/self/gid_map - 0 1000 1 - 1 2000 1 - 2 100000 65536 +```{.sh} +podman unshare cat /proc/self/gid_map +# 0 1000 1 +# 1 2000 1 +# 2 100000 65536 ``` The table shows that gid 2000 in the host (middle column) is mapped to @@ -328,16 +328,16 @@ We will map The `--uidmap` and `--gidmap` options in rootless podman map those intermediate uids/gids to container ids: -``` -$ podman run \ - --rm \ - -v /shared_dir:/shared_dir \ - --uidmap "0:0:65535" \ - --gidmap "0:0:1" \ - --gidmap "1:2:65535" \ - --gidmap "100000:1:1" \ - --group-add keep-groups \ - rocker/rstudio +```{.sh} +podman run \ + --rm \ + -v /shared_dir:/shared_dir \ + --uidmap "0:0:65535" \ + --gidmap "0:0:1" \ + --gidmap "1:2:65535" \ + --gidmap "100000:1:1" \ + --group-add keep-groups \ + rocker/rstudio ``` With all that set: From d165f0791bcb225d06d955f28e436e3d8fa6be9f Mon Sep 17 00:00:00 2001 From: Sergio Oller Date: Tue, 25 Apr 2023 15:48:43 +0200 Subject: [PATCH 3/3] Remove trailing whitespace Co-authored-by: eitsupi <50911393+eitsupi@users.noreply.github.com> --- use/rootless.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/use/rootless.md b/use/rootless.md index decfcde..9aac79a 100644 --- a/use/rootless.md +++ b/use/rootless.md @@ -68,7 +68,7 @@ podman run --rm docker.io/rocker/rstudio apt-get update You can bind mount the `/etc/` directory (e.g. using `-v /etc:/hostetc`) but you won't be able to modify most of its files, since you are not allowed to do that -when you are outside the container. +when you are outside the container. At the host: @@ -180,7 +180,7 @@ If you usually work with a directory shared with other users, it is possible that the shared directory belongs to a group you all belong to. There are several possible solutions. Here we describe two of them that we can -use in `rocker`. +use in `rocker`. #### Set groups in the running process `--group-add keep-groups`