Knative on CoCo (sc2-sys#12)
csegarragonz authored Sep 28, 2023
1 parent 40cca98 commit 81614ab
Showing 17 changed files with 433 additions and 79 deletions.
11 changes: 11 additions & 0 deletions README.md
@@ -50,6 +50,15 @@ inv operator.install
inv operator.install-cc-runtime
```

Third, update the `initrd` file to include our patched `kata-agent`:

```bash
inv kata.replace-agent
```

If this is the first time you run it, you will have to build the agent manually
by following [these instructions](./docs/kata.md#replacing-the-kata-agent).

Then, you are ready to run one of the supported apps:
* [Hello World! (Py)](./docs/helloworld_py.md) - simple HTTP server running in Python to test CoCo and Kata.
* [Hello World! (Knative)](./docs/helloworld_knative.md) - same app as before, but invoked over Knative.
@@ -79,5 +88,7 @@ inv kubeadm.destroy

For further documentation, you may want to check these other documents:
* [K8s](./docs/k8s.md) - documentation about configuring a single-node Kubernetes cluster.
* [Kata](./docs/kata.md) - instructions to build our custom Kata fork and `initrd` images.
* [Knative](./docs/knative.md) - documentation about Knative, our serverless runtime of choice.
* [SEV](./docs/sev.md) - specific documentation to get the project working with AMD SEV machines.
* [Troubleshooting](./docs/troubleshooting.md) - tips to debug when things go sideways.
20 changes: 8 additions & 12 deletions apps/helloworld-knative/service.yaml
@@ -2,25 +2,21 @@ apiVersion: serving.knative.dev/v1
kind: Service
metadata:
name: helloworld-knative
# annotations:
# "features.knative.dev/podspec-volumes-emptydir": "enabled"
# "features.knative.dev/podspec-persistent-volume-claim": "enabled"
# "features.knative.dev/podspec-persistent-volume-claim-write": "enabled"
# "features.knative.dev/podspec-runtimeclassname": "enabled"
annotations:
"features.knative.dev/podspec-runtimeclassname": "enabled"
spec:
# ConfigurationSpec (or RevisionTemplateSpec?)
template:
metadata:
labels:
apps.coco-serverless/name: helloworld-py
# io.katacontainers.config.pre_attestation.enabled: "false"
io.katacontainers.config.pre_attestation.enabled: "false"
spec:
# runtimeClassName: kata-qemu
runtimeClassName: kata-qemu-sev
# coco-knative: need to run user container as root
securityContext:
runAsUser: 1000
containers:
# - image: ghcr.io/knative/helloworld-go:latest
# - image: csegarragonz/coco-helloworld-py:latest
# - image: csegarragonz/coco-helloworld-py:latest
- image: csegarragonz/coco-helloworld-py@sha256:af0fec55e9aed9a259e8da9dcaa28ab3fc1277dc8db4b8883265f98272cef11d
- image: csegarragonz/coco-helloworld-py:latest
ports:
- containerPort: 8080
env:
1 change: 1 addition & 0 deletions apps/helloworld-py/deployment.yaml
@@ -19,5 +19,6 @@ spec:
containers:
- name: helloworld-py
image: csegarragonz/coco-helloworld-py:latest
imagePullPolicy: Always
ports:
- containerPort: 8080
6 changes: 1 addition & 5 deletions conf-files/knative_config.yaml
@@ -4,9 +4,5 @@ metadata:
name: config-features
namespace: knative-serving
data:
kubernetes.podspec-volumes-emptydir: "enabled"
kubernetes.podspec-persistent-volume-claim: "enabled"
kubernetes.podspec-persistent-volume-claim-write: "enabled"
kubernetes.podspec-runtimeclassname: "enabled"
kubernetes.containerspec-addcapabilities: "enabled"
registries-skipping-tag-resolving: docker.io
kubernetes.podspec-securitycontext: "enabled"
23 changes: 23 additions & 0 deletions docs/helloworld_knative.md
@@ -3,6 +3,13 @@
This application runs the same `Hello World!` sample as [`helloworld-py`](
./helloworld_py.md), but through Knative Serving.

This sample application does not use any attestation or image encryption, so
you should disable attestation by running:

```bash
inv coco.disable-attestation
```

To deploy it, you may run:

```bash
@@ -25,3 +32,19 @@ To remove the application, you can run:
```bash
kubectl delete -f ./apps/helloworld-knative
```

## Knative on CoCo

For the time being, CoCo requires the image to _always_ be pulled on the guest.
If the image is present on the host, Knative will try to use the cached copy
(as it is not possible to specify `imagePullPolicy: Always`), and the pod will
fail to start, complaining about problems mounting the root file system.

To remove the image from the host's cache, you can use `crictl`:

```bash
sudo crictl rmi <image_id>
```

Note that, if you _only_ use CoCo, the images are _never_ pulled on the host,
so they should never be cached there.
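
To find out which cached images need removing, the lookup can be scripted. The following is a minimal sketch, assuming that `crictl images -o json` prints an object with an `images` list whose entries carry `id`, `repoTags`, and `repoDigests` fields. The helper name `find_cached_image_ids` and the sample payload are hypothetical.

```python
import json


def find_cached_image_ids(crictl_json, image_name):
    """
    Return the IDs of host-cached images whose tags or digests mention
    image_name. crictl_json is the text printed by `crictl images -o json`
    (the field names used here are an assumption).
    """
    images = json.loads(crictl_json).get("images", [])
    matches = []
    for image in images:
        refs = (image.get("repoTags") or []) + (image.get("repoDigests") or [])
        if any(image_name in ref for ref in refs):
            matches.append(image["id"])
    return matches


# Hypothetical payload, shaped like the assumed crictl output
sample = json.dumps(
    {
        "images": [
            {
                "id": "sha256:aaa",
                "repoTags": ["docker.io/csegarragonz/coco-helloworld-py:latest"],
                "repoDigests": [],
            },
            {
                "id": "sha256:bbb",
                "repoTags": ["registry.k8s.io/pause:3.9"],
                "repoDigests": [],
            },
        ]
    }
)
print(find_cached_image_ids(sample, "coco-helloworld-py"))
```

Each returned ID can then be passed to `sudo crictl rmi`.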
46 changes: 46 additions & 0 deletions docs/kata.md
@@ -0,0 +1,46 @@
# Kata Containers

Most of the Kata development happens in our [Kata fork](
https://github.com/csegarragonz/kata-containers). We use a fork so that we can
pin to an older, but stable, CC release and add patches on top when necessary.
Down the road (and particularly once CoCo uses Kata's main), we plan to get rid
of the fork.

## Tweaking Kata

To get a working environment to modify Kata, clone our fork and build/exec into
the workon container. For convenience, we recommend cloning the fork in the
same parent directory as this repo (i.e. `../kata-containers`).

```bash
git clone https://github.com/csegarragonz/kata-containers
cd kata-containers
./csg-bin/build_docker.sh
./csg-bin/cli.sh
```

## Replacing the Kata Agent

Replacing the Kata Agent is something we may do regularly, and is a fairly
automated process.

First, from our Kata fork, rebuild the `kata-agent` binary:

```bash
cd ../kata-containers
./csg-bin/cli.sh
cd src/agent
make
exit
cd -
```

Second, from this repository, bake the new agent into the `initrd` image used
by `qemu-sev` and update the config path:

```bash
inv kata.replace-agent
```

The new VMs you start should use the new `initrd` (and thus the updated
`kata-agent`).
6 changes: 6 additions & 0 deletions docs/knative.md
@@ -28,3 +28,9 @@ inv kubeadm.destroy
inv kubeadm.create
inv knative.install
```

## Knative on CoCo

To run Knative on CoCo, we need to enable two feature flags when configuring
Knative. Check out the [`ConfigMap`](../conf-files/knative_config.yaml) for
more details.
55 changes: 55 additions & 0 deletions docs/troubleshooting.md
@@ -0,0 +1,55 @@
# Troubleshooting

In this document we include a collection of tips to help you debug the system
in case something is not working as expected.

## K8s Monitoring with K9s

Gaining visibility into the state of a Kubernetes cluster is hard. We cannot
stress enough how useful `k9s` is to debug what is going on.

We strongly recommend using it. You can install it with:

```bash
inv k9s.install
export KUBECONFIG=$(pwd)/.config/kubeadm_kubeconfig
k9s
```

## Enabling debug logging in the system journal

Another good observability tool is the systemd journal. Both `containerd` and
`kata-agent` send their logs to `containerd`'s journal. You may inspect the
logs using:

```bash
sudo journalctl -xeu containerd
```

To enable debug logging you may run:

```bash
inv containerd.set-log-level [debug,info]
inv kata.set-log-level [debug,info]
```

Naturally, run the commands again with `info` to restore the original log level.

## Nuking the whole cluster

When things really go wrong, resetting the whole cluster is usually a good way
to get a clean start:

```bash
inv kubeadm.destroy kubeadm.create
```

If you want a really clean start, you can re-install `containerd` and all the
`k8s` tooling:

```bash
inv kubeadm.destroy
inv containerd.build containerd.install
inv k8s.install --clean
inv kubeadm.create
```
16 changes: 0 additions & 16 deletions docs/uk8s.md

This file was deleted.

4 changes: 4 additions & 0 deletions tasks/__init__.py
@@ -1,21 +1,25 @@
from invoke import Collection

from . import apps
from . import coco
from . import containerd
from . import format_code
from . import k8s
from . import k9s
from . import kata
from . import kbs
from . import knative
from . import kubeadm
from . import operator

ns = Collection(
apps,
coco,
containerd,
format_code,
k8s,
k9s,
kata,
kbs,
knative,
kubeadm,
17 changes: 17 additions & 0 deletions tasks/coco.py
@@ -0,0 +1,17 @@
from invoke import task
from os.path import join
from tasks.util.env import KATA_CONFIG_DIR
from tasks.util.toml import update_toml


@task
def disable_attestation(ctx):
    """
    Disable attestation for CoCo
    """
    conf_file_path = join(KATA_CONFIG_DIR, "configuration-qemu-sev.toml")
    updated_toml_str = """
    [hypervisor.qemu]
    guest_pre_attestation = false
    """
    update_toml(conf_file_path, updated_toml_str)
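
The `update_toml` helper imported above is not shown in this part of the diff. As a rough sketch of the behaviour its call sites suggest (merging a TOML fragment into an existing config without dropping unrelated keys), a recursive dictionary merge could look like the following. The name `merge_config` and the sample values are hypothetical, and the real helper may work differently.

```python
def merge_config(base, update):
    """
    Recursively merge `update` into a copy of `base`: nested tables are
    merged key by key, any other value is overwritten
    """
    merged = dict(base)
    for key, value in update.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical parsed contents of configuration-qemu-sev.toml
current = {
    "hypervisor": {"qemu": {"guest_pre_attestation": True, "path": "/usr/bin/qemu"}}
}
# Parsed form of the fragment passed by disable_attestation
fragment = {"hypervisor": {"qemu": {"guest_pre_attestation": False}}}

result = merge_config(current, fragment)
print(result["hypervisor"]["qemu"])
```

Merging instead of overwriting the whole `[hypervisor.qemu]` table keeps the other settings in that table untouched.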
66 changes: 50 additions & 16 deletions tasks/containerd.py
@@ -3,13 +3,20 @@
from os.path import join
from subprocess import CalledProcessError, run
from tasks.util.env import CONF_FILES_DIR, PROJ_ROOT
from toml import load as toml_load, dump as toml_dump
from tasks.util.toml import update_toml

CONTAINERD_IMAGE_TAG = "containerd-build"
CONTAINERD_SOURCE_CHECKOUT = join(PROJ_ROOT, "..", "containerd")
CONTAINERD_CONFIG_FILE = "/etc/containerd/config.toml"


def restart_containerd():
"""
Utility function to gracefully restart the containerd service
"""
run("sudo service containerd restart", shell=True, check=True)


@task
def build(ctx):
"""
@@ -47,6 +54,10 @@ def configure_devmapper_snapshotter():
data_dir = "/var/lib/containerd/devmapper"
pool_name = "containerd-pool"

# --------------------------
# Thin Pool device configuration
# --------------------------

# First, remove the device if it already exists
try:
run("sudo dmsetup remove --force {}".format(pool_name), shell=True, check=True)
@@ -113,24 +124,47 @@ def configure_devmapper_snapshotter():
dmsetup_cmd = " ".join(dmsetup_cmd)
run(dmsetup_cmd, shell=True, check=True)

devmapper_conf = {
"root_path": data_dir,
"pool_name": pool_name,
"base_image_size": "8192MB",
"discard_blocks": True,
}
# --------------------------
# Update containerd's config file to use the devmapper snapshotter
# --------------------------

# Note: we currently don't use the devmapper snapshotter, so this just
# _configures_ it (but doesn't select it as the snapshotter)
updated_toml_str = """
[plugins."io.containerd.snapshotter.v1.devmapper"]
root_path = "{root_path}"
pool_name = "{pool_name}"
base_image_size = "8192MB"
discard_blocks = true
""".format(
root_path=data_dir, pool_name=pool_name
)
update_toml(CONTAINERD_CONFIG_FILE, updated_toml_str)

conf_file = toml_load(CONTAINERD_CONFIG_FILE)
conf_file["plugins"]["io.containerd.snapshotter.v1.devmapper"] = devmapper_conf

tmp_conf = "/tmp/containerd_config.toml"
with open(tmp_conf, "w") as fh:
toml_dump(conf_file, fh)
@task
def set_log_level(ctx, log_level):
"""
Set containerd's log level, must be one of: info, debug
"""
allowed_log_levels = ["info", "debug"]
if log_level not in allowed_log_levels:
print(
"Unsupported log level '{}'. Must be one in: {}".format(
log_level, allowed_log_levels
)
)
return

# Finally, copy in place
run(
"sudo cp {} {}".format(tmp_conf, CONTAINERD_CONFIG_FILE), shell=True, check=True
updated_toml_str = """
[debug]
level = "{log_level}"
""".format(
log_level=log_level
)
update_toml(CONTAINERD_CONFIG_FILE, updated_toml_str)

restart_containerd()
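
TOML requires string values to be quoted, so the fragment written into the `[debug]` table has to wrap the level in double quotes. Below is a minimal sketch of the validation and rendering steps, independent of `update_toml`. The name `render_debug_fragment` is hypothetical, and raising `ValueError` is a choice made for this sketch, whereas the task above prints a message and returns.

```python
ALLOWED_LOG_LEVELS = ("info", "debug")


def render_debug_fragment(log_level):
    """
    Return the TOML fragment that sets the log level in the [debug] table,
    rejecting anything that is not an allowed level
    """
    if log_level not in ALLOWED_LOG_LEVELS:
        raise ValueError(
            "Unsupported log level '{}'. Must be one of: {}".format(
                log_level, ", ".join(ALLOWED_LOG_LEVELS)
            )
        )
    # The value is quoted because TOML strings must be quoted
    return '[debug]\nlevel = "{}"\n'.format(log_level)


print(render_debug_fragment("debug"))
```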


@task
@@ -187,4 +221,4 @@ def cleanup():
configure_devmapper_snapshotter()

# Restart containerd service
run("sudo service containerd restart", shell=True, check=True)
restart_containerd()