eriksjolund/podman-socket-activation

Currently a draft

Podman socket activation

Title: "Use socket activation with Podman to get improved security and native network throughput"

Subtitle: "Learn how to restrict network access for a containerized network server"

Running a web server container is one of the more common uses for Podman. Normally you need to publish the ports that should be open by passing the option --publish (-p) to podman run. When running rootless Podman, you also need to be aware that network traffic is processed by the user-space application slirp4netns, which comes with a performance penalty.

You might be surprised to hear that it's now possible to run a web server container with rootless Podman and get native network throughput! Even more surprising is that the --network=none option can be given to disable the network. There is also no need to publish ports.

The new way to run a network server container with Podman is to use socket activation provided by systemd. Not all daemons support socket activation, but support is becoming more common. For instance, Apache HTTP Server, MariaDB, D-Bus, PipeWire, Gunicorn and CUPS all support socket activation.

Socket activation conceptually works by having systemd create a socket (e.g. TCP, UDP or Unix socket). As soon as a client connects to the socket, systemd starts the systemd service that is configured for the socket. The newly started program inherits the open file descriptor of the socket and can accept the incoming connection. The new feature is that Podman now passes such a socket on to the container. Thanks to the fork/exec model of Podman, the socket is inherited first by podman, then by conmon, then by the OCI runtime and finally by the container, as can be seen in the following diagram:

stateDiagram-v2
    [*] --> systemd: client connects
    systemd --> podman: socket inherited via fork/exec
    state "OCI runtime" as s2
    podman --> conmon: socket inherited via double fork/exec
    conmon --> s2: socket inherited via fork/exec
    s2 --> container: socket inherited via exec
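As a rough illustration of what the program at the end of that chain has to do, here is a minimal Python sketch that follows the convention documented in sd_listen_fds(3): the environment variables LISTEN_PID and LISTEN_FDS describe the inherited sockets, and the first one is file descriptor 3. The function names listen_fds and serve_once are made up for this example and are not taken from any of the programs mentioned in this article.

```python
import os
import socket

SD_LISTEN_FDS_START = 3  # first inherited descriptor, per the sd_listen_fds(3) convention


def listen_fds(environ=None, pid=None):
    """Return the file descriptor numbers passed in by the service manager."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # LISTEN_PID guards against the variables leaking into an unrelated process.
    if environ.get("LISTEN_PID") != str(pid):
        return []
    count = int(environ.get("LISTEN_FDS", "0"))
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + count))


def serve_once(fd):
    """Accept one connection on an inherited stream socket and echo one message back."""
    # Python detects the address family and socket type from the descriptor itself.
    with socket.socket(fileno=fd) as listener:
        conn, _ = listener.accept()
        with conn:
            conn.sendall(conn.recv(4096))
```

A service written this way could call listen_fds() once at startup and fall back to opening its own socket when the returned list is empty.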

Before looking into this new feature, let us take a look at another form of socket activation in Podman.

Podman's socket-activated API service

Podman has supported socket activation of its API service for a long time. Here the architecture is simpler because the socket is used by Podman itself:

stateDiagram-v2
    [*] --> systemd: client connects
    systemd --> podman: socket inherited via fork/exec

The file /usr/lib/systemd/user/podman.socket on a Fedora system defines the Podman API socket for rootless users:

$ cat /usr/lib/systemd/user/podman.socket
[Unit]
Description=Podman API Socket
Documentation=man:podman-system-service(1)

[Socket]
ListenStream=%t/podman/podman.sock
SocketMode=0660

[Install]
WantedBy=sockets.target

The socket is configured as a Unix socket and can be started like this:

$ systemctl --user start podman.socket
$ ls $XDG_RUNTIME_DIR/podman/podman.sock
/run/user/1000/podman/podman.sock
$

The socket can then be used by, for instance, docker-compose, which needs a Docker-compatible API:

$ export DOCKER_HOST=unix://$XDG_RUNTIME_DIR/podman/podman.sock
$ docker-compose up

Socket-activated echo server container in a systemd service

More recently, in version 3.4.0, Podman received support for another type of socket activation, namely, socket activation of containers. Such socket activation can be used in the systemd services that are generated with the command podman generate systemd --new --name CTR.

I created a container image ghcr.io/eriksjolund/socket-activate-echo of an echo server that supports socket activation. The echo server currently has limited functionality. It was written for the sole purpose of demonstrating socket activation. The source code is available in the GitHub repo eriksjolund/socket-activate-echo, where more examples can also be found.

Let's try it out. Start the echo server sockets:

git clone https://github.com/eriksjolund/socket-activate-echo.git
mkdir -p ~/.config/systemd/user
cp -r socket-activate-echo/systemd/echo* ~/.config/systemd/user
systemctl --user daemon-reload
systemctl --user start echo@demo.socket

List the listening sockets that we will connect to:

$ ss -lnp | grep 3000
udp   UNCONN 0      0                                       127.0.0.1:3000             0.0.0.0:*    users:(("systemd",pid=2516,fd=33))
udp   UNCONN 0      0                                           [::1]:3000                [::]:*    users:(("systemd",pid=2516,fd=35))
tcp   LISTEN 0      4096                                    127.0.0.1:3000             0.0.0.0:*    users:(("systemd",pid=2516,fd=28))
tcp   LISTEN 0      4096                                        [::1]:3000                [::]:*    users:(("systemd",pid=2516,fd=34))
v_str LISTEN 0      0                                               *:3000                   *:*    users:(("systemd",pid=2516,fd=36))
$ ss -lx | grep echo | grep u_str
u_str LISTEN 0      4096          /home/eriksjolund/echo_stream_sock.demo 49486            * 0
$

Test the echo server with the program socat:

$ echo hello | socat - tcp4:127.0.0.1:3000
hello
$ echo hello | socat - tcp6:[::1]:3000
hello
$ echo hello | socat - udp4:127.0.0.1:3000
hello
$ echo hello | socat - udp6:[::1]:3000
hello
$ echo hello | socat - unix:$HOME/echo_stream_sock.demo
hello
$ echo hello | socat - VSOCK-CONNECT:1:3000
hello

Improve security by disabling the network

If the echo server were compromised through a security vulnerability, the container could be used to launch attacks against other PCs or devices on the network. An echo server does not need the ability to establish outgoing connections. It just needs to accept incoming connections on the socket-activated socket it inherited. Luckily, the command-line option --network=none, given to podman run in the service unit file, provides exactly those restrictions.

$ grep -A 9 ExecStart= ~/.config/systemd/user/echo@.service
ExecStart=/usr/bin/podman run \
  --cidfile=%t/%n.ctr-id \
  --cgroups=no-conmon \
  --rm \
  --sdnotify=conmon \
  --replace \
  --name echo-%i \
  --detach \
  --network none \
    ghcr.io/eriksjolund/socket-activate-echo

Assume an intruder has shell access in the container. The situation can be simulated by executing commands with podman exec.

Only the loopback interface is available:

$ podman exec -ti echo-demo /bin/bash -c "ip -brief addr"
lo               UNKNOWN        127.0.0.1/8 ::1/128

curl is not able to download any web page:

$ podman exec -ti echo-demo /bin/bash -c "curl https://podman.io"
curl: (6) Could not resolve host: podman.io
$

If we instead remove the option --network=none and run the same commands, we see that the network interface tap0 is also available

$ podman exec -ti echo-demo /bin/bash -c "ip -brief addr"
lo               UNKNOWN        127.0.0.1/8 ::1/128
tap0             UNKNOWN        10.0.2.100/24 fd00::9847:3aff:fe5d:97ea/64 fe80::9847:3aff:fe5d:97ea/64
$

and that curl is able to download the web page:

$ podman exec -ti echo-demo /bin/bash -c "curl https://podman.io" | head -2
<!doctype html>
<html lang="en-US">
$

By using the option --network=none, we thus limit the possibilities for an intruder to use the compromised container as a starting point for attacks on other PCs.

Network throughput and latency

Using socket activation comes with another advantage. Communication over the socket-activated socket has native network throughput. All other network traffic of a rootless container needs to pass through slirp4netns and incurs the performance penalty that comes with it.

Unfortunately, using socket activation also comes with a disadvantage. The very first connection to a socket-activated container will have more latency due to container startup. To minimize this latency, consider adding the podman run option --pull=never and instead pull the container image beforehand.
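To get a feeling for the startup latency described above, a small client can time a request against the socket. The following Python sketch is only an illustration: the helper name timed_roundtrip is hypothetical, and the measured time includes connection setup as well as the echo itself.

```python
import socket
import time


def timed_roundtrip(host, port, payload, timeout=10.0):
    """Send payload over TCP and return (reply, elapsed_seconds)."""
    start = time.monotonic()
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(payload)
        # A single recv() is enough for a short message; longer payloads would need a loop.
        reply = conn.recv(4096)
    return reply, time.monotonic() - start
```

Calling timed_roundtrip("127.0.0.1", 3000, b"hello\n") against the echo@demo.socket unit from this article, once before the container has been started and once afterwards, should make the difference between the first and later connections visible.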

Restrict Podman with RestrictAddressFamilies

It is possible to restrict Podman from accessing AF_INET and AF_INET6 sockets with the systemd directive RestrictAddressFamilies. Socket-activated sockets are unaffected by the directive.

If the --pull=never option is added to podman run, the echo container will continue to work even with the very restricted setting:

RestrictAddressFamilies=AF_UNIX AF_NETLINK

All types of sockets are then inaccessible except AF_UNIX sockets, AF_NETLINK sockets and the socket-activated sockets.
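The effect of the directive can be probed from inside a restricted process by trying to create a socket of a given address family. The Python sketch below is a hedged illustration, not part of the echo server: the helper name can_create_socket is made up, and it assumes that a blocked family is reported as EAFNOSUPPORT ("address family not supported by protocol"), which matches the error text in the journalctl output shown later in this section.

```python
import errno
import socket


def can_create_socket(family, kind=socket.SOCK_STREAM):
    """Return True if this process may create a new socket of the given address family."""
    try:
        with socket.socket(family, kind):
            return True
    except OSError as exc:
        # Assumption: a blocked address family makes socket() fail with EAFNOSUPPORT.
        if exc.errno == errno.EAFNOSUPPORT:
            return False
        raise
```

In a service running with RestrictAddressFamilies=AF_UNIX AF_NETLINK, can_create_socket(socket.AF_INET) would be expected to return False, while the socket received through socket activation stays usable because the service process did not have to create it.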

If there is a security vulnerability in Podman, conmon or runc, this configuration limits the possibilities an intruder has to launch attacks on other PCs on the network.

The echo-restrict.service is configured with RestrictAddressFamilies=AF_UNIX AF_NETLINK. The service is activated with echo-restrict.socket:

$ grep Listen ~/.config/systemd/user/echo-restrict.socket
ListenStream=127.0.0.1:9000

To try it out, start the socket:

$ systemctl --user start echo-restrict.socket

and see that it works:

$ echo hello | socat - tcp4:127.0.0.1:9000
hello
$

Caveat 1: Currently, runc supports RestrictAddressFamilies=AF_UNIX AF_NETLINK, but the number of socket-activated sockets is limited to at most 2 (see bug: opencontainers/runc#3488).

Caveat 2: At the time of this writing, crun does not support RestrictAddressFamilies=AF_UNIX AF_NETLINK (see feature request: containers/crun#929).

If we had used --pull=always instead of --pull=never, the service would fail as expected because Podman is blocked from establishing connections to the container registry.

journalctl would then show error messages like these:

$ journalctl --user -xe -u echo.service | grep -A2 "Trying to pull" | tail -3
May 26 10:09:54 asus podman[28272]: Trying to pull ghcr.io/eriksjolund/socket-activate-echo:latest...
May 26 10:09:54 asus podman[28272]: Error: initializing source docker://ghcr.io/eriksjolund/socket-activate-echo:latest: pinging container registry ghcr.io: Get "https://ghcr.io/v2/": dial tcp 140.82.121.34:443: socket: address family not supported by protocol
May 26 10:09:54 asus systemd[10686]: test.service: Main process exited, code=exited, status=125/n/a
$

Socket activate an Apache HTTP server with systemd-socket-activate

Instead of setting up a systemd service to test out socket activation, an alternative is to use the command-line tool systemd-socket-activate.

As an example let us use the container image ghcr.io/eriksjolund/socket-activate-httpd that contains an Apache HTTP server.

In one shell, start systemd-socket-activate.

$ systemd-socket-activate -l 8080 podman run --rm --network=none ghcr.io/eriksjolund/socket-activate-httpd

The TCP port number 8080 is given as an option to systemd-socket-activate. The --publish (-p) option for podman run is not used.

In another shell, fetch a web page from localhost:8080

$ curl -s localhost:8080 | head -6
<!doctype html>
<html>
  <head>
<meta charset='utf-8'>
<meta name='viewport' content='width=device-width, initial-scale=1'>
<title>Test Page for the HTTP Server on Fedora</title>
$

Note about SELinux

If your computer is running SELinux, you need to have container-selinux 2.183.0 or newer installed. If container socket activation via Podman does not work and you are using an older version of container-selinux, add --security-opt label=disable to podman run as a workaround.
