Draft: Runtime secret release via split attester/kbs-client #1237

Closed
wants to merge 11 commits into from


mkulke
Contributor

@mkulke mkulke commented Jul 24, 2023

Secret release at Runtime

This is a PoC implementation of a design for secure key release (SKR) at runtime. Its main purpose is to gather feedback and refine the design. The current state attempts to address the requirements we identified in discussions and the lessons learned from the shortcomings of earlier iterations:

  • To perform remote attestation in SKR we need to access infrastructure resources (like hw devices and IMDS) of the podvm hosting the peerpod containers.
  • Peerpod containers are intentionally isolated from the underlying podvm infrastructure. Containers are spawned in a discrete podns network namespace, and workloads should not have flat access to the podvm's resources like IMDS, since a podvm aims to be an opaque implementation detail of the runtime class.
  • We need to bridge the network namespace gap from the infrastructure of a podvm to a peerpod container.
  • In SKR we need to be able to access KBS deployments via the same name resolution/networking means as the workload (a KBS endpoint might be deployed in cluster)

Description

image

AA is split into two processes

attester and skr-api are two aspects that are both served by Attestation Agent (AA) at the moment. AA retrieves evidence from the podvm's infrastructure and facilitates the key release using the RCAR protocol with KBS deployment.

To serve the above concerns, we are able to split those aspects into an "attester" part, retrieving evidence from the TEE platform and a "kbs-client" part that is able to perform the SKR exchange with a KBS and the "attester". The kbs-client process doesn't need access to infrastructure resources and can be deployed as a simple sidecar. The attester is deployed as a system daemon on the podvm in the host network ns.

The skr-api process is an implementation of such a "kbs-client". It exposes an http endpoint (localhost:50080/getresource/x/y/z). It leverages a KbsProtocolWrapper of GuestComponents/kbs_protocol to handle the interactions with a KBS and retrieves the TEE evidence from the attester process via Unix Domain Socket RPC.
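
For illustration, a sibling container could wrap the endpoint in a small shell helper. This is only a sketch: the function names are made up for this example, and it assumes curl is available in the calling container.

```shell
# Hypothetical helper: build the skr-api URL for a resource path.
# $1/$2/$3 are the three path segments, e.g. one two key.
skr_resource_url() {
  printf 'http://localhost:50080/getresource/%s/%s/%s\n' "$1" "$2" "$3"
}

# Hypothetical helper: fetch the resource and fail on HTTP errors.
skr_get_resource() {
  curl --silent --show-error --fail "$(skr_resource_url "$1" "$2" "$3")"
}
```

With the sample KBS from this description, `skr_get_resource one two key` would request the `/one/two/key` secret.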

UDS mount injection namespace bridging

In Kubernetes, mounting a resource into a pod implies mounting resources from the Kubernetes node. However, we need to mount resources (a socket from the attester process) from the podvm into containers. I'm not sure whether we are able to express this intent in k8s somehow.

We cannot rely on a direct mount to work, but we are able to proxy the UDS to an [abstract socket](https://man7.org/linux/man-pages/man7/unix.7.html) that is available in the podns namespace.

The skr-api container is able to access the namespaced abstract socket. We could do something similar with a TCP network port, but since that would block a port in the pod's network namespace, the abstract socket is probably the less intrusive option.

So, pragmatically, a mount to /run/confidential-containers/attester.sock is injected in every container, similar to how we inject the podns network namespace. That seems crude, but maybe it can be considered as a documented property of a kata-remote runtime class to have that mount in every container?

Testing it

The branch is based on the 0.7 release of Cloud API Adaptor. The changes only affect the podvm. The podvm image can be built in the usual way. A local build will require TDX attestation headers (libtdx-attest-dev on Debian or similar packages elsewhere), since the attester module bundles all TEE platforms at the moment.

The skr-api container image can be built via docker in ./skr-api. The container can be added to a pod without requiring any special privileges.

It will expose an endpoint, http://localhost:50080/getresource/$name/$type/$tag, to the sibling containers in the pod.

Azure pre-built images

There are pre-built artifacts available that can be used in an existing CAA deployment (tested with v0.7):

Podvm

/CommunityGalleries/cococommunity-42d8482d-92cd-415b-b332-7648bd978eff/Images/peerpod-podvm_skr-api-draft/versions/0.7.5

You can configure an existing deployment by editing the peer pod configmap (AZURE_IMAGE_ID: ...) and restarting the daemonsets. Note: all existing peerpods should be removed prior to this, to make sure infra resources will be cleaned up properly.

kubectl edit cm peer-pods-cm -n confidential-containers-system
kubectl delete po -l app=cloud-api-adaptor -n confidential-containers-system

kbs

A sample kbs image that has a secret /one/two/key provisioned:

kubectl create deploy kbs --image ghcr.io/mkulke/kbs:81f3de7 --port 8080
kubectl expose deploy kbs

skr-api

A kbs client container image: ghcr.io/mkulke/skr-api:d11a655

Test

We assume kbs is available in the k8s cluster via http://kbs:8080 and deploy a pod with an skr-api sidecar using a CAA runtime class:

cat <<EOF > nginx-caa.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-caa
  name: nginx-caa
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-caa
  template:
    metadata:
      labels:
        app: nginx-caa
    spec:
      runtimeClassName: kata-remote
      containers:
      - image: nginx:stable
        name: nginx
      - image: ghcr.io/mkulke/skr-api:d11a655
        name: skr-api
        command: ["/skr-api", "-k", "http://kbs:8080"]
EOF
kubectl apply -f nginx-caa.yaml

We wait until the nginx-caa peerpod is provisioned correctly:

$ kubectl get po
NAME                         READY   STATUS    RESTARTS   AGE
kbs-7646989ddc-z2fws         1/1     Running   0          20m
nginx-caa-85c7bd4f49-7jcb7   2/2     Running   0          7m2s

We request a secret via skr-api:

$ POD_NAME=$(kubectl get po -l app=nginx-caa -o jsonpath='{.items[0].metadata.name}')
$ kubectl exec $POD_NAME -c nginx -- curl localhost:50080/getresource/one/two/key
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100     5  100     5    0     0     13      0 --:--:-- --:--:-- --:--:--    12
ohai

We can check the logs of kbs and skr-api to follow the attestation flow:

$ kubectl logs -l app=kbs
...
$ kubectl logs -l app=nginx-caa -c skr-api
...

@mkulke
Contributor Author

mkulke commented Jul 25, 2023

cc @iaguis (I'm not able to add you as a reviewer, maybe you need to be part of the coco GitHub org?)

@fitzthum
Member

How are you measuring the skr-api sidecar?

mkulke and others added 5 commits July 26, 2023 04:08
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
@mkulke
Contributor Author

mkulke commented Jul 26, 2023

How are you measuring the skr-api sidecar?

Intuitively I would say: it's measured as part of the application workload (subject to policy, etc). In this deployment model the kbs-client (skr-api) is logically (and practically) part of the application in the TEE, provided for convenience. A user could just include KbsProtocolWrapper in their application directly and retrieve evidence from the TEE (whether we want that is a different question).

@mkulke mkulke force-pushed the mkulke/skr-api branch 2 times, most recently from ff3a63e to d1e79cd Compare July 26, 2023 13:55
@mkulke
Contributor Author

mkulke commented Jul 26, 2023

Iterated on the original proposal a bit. Instead of mounting a UDS we can create an abstract socket in the pod namespace and have socat proxy it.
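
A rough sketch of what such a proxy invocation could look like. The abstract socket name `attester` and the use of `ip netns exec` to enter podns are assumptions for illustration, not necessarily what the branch does; `ABSTRACT-LISTEN` and `UNIX-CONNECT` are standard socat address types.

```shell
# Path of the attester's socket in the podvm, as used in the description.
ATTESTER_SOCK=/run/confidential-containers/attester.sock

# Print (not run) a socat command that listens on an abstract socket inside
# a network namespace and forwards each connection to the attester's UDS.
# $1: network namespace name, $2: abstract socket name (both assumptions).
attester_proxy_cmd() {
  printf 'ip netns exec %s socat ABSTRACT-LISTEN:%s,fork UNIX-CONNECT:%s\n' \
    "$1" "$2" "$ATTESTER_SOCK"
}
```

`attester_proxy_cmd podns attester` prints the command for review; running it would require root on the podvm. The listener has to be created inside podns because abstract sockets are scoped to a network namespace.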

@surajssd
Member

Initially, the skr-api build failed with the following error:

$ PODVM_DISTRO=ubuntu make image && cd -
...
...
   Compiling serde_urlencoded v0.7.1
   Compiling axum-core v0.3.4
error: failed to run custom build command for `tdx-attest-sys v0.1.0 (https://github.com/intel/SGXDataCenterAttestationPrimitives?tag=DCAP_1.16#71557c7d)`

Caused by:
  process didn't exit successfully: `/home/surajaz/code/work/cloud-api-adaptor/skr-api/target/release/build/tdx-attest-sys-cb7f5d9821985a53/build-script-build` (exit status: 101)
  --- stdout
  cargo:rustc-link-lib=tdx_attest
  cargo:rerun-if-changed=bindings.h

  --- stderr
  bindings.h:32:10: fatal error: 'tdx_attest.h' file not found
  bindings.h:32:10: fatal error: 'tdx_attest.h' file not found, err: true
  thread 'main' panicked at 'Unable to generate bindings: ()', /home/surajaz/.cargo/git/checkouts/sgxdatacenterattestationprimitives-d6934a418e6beae0/71557c7/QuoteGeneration/quote_wrapper/tdx-attest-sys/build.rs:79:10
  note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
make[1]: *** [Makefile:3: build] Error 101
make[1]: Leaving directory '/home/surajaz/code/work/cloud-api-adaptor/skr-api'
make: *** [../../podvm/Makefile.inc:113: /home/surajaz/code/work/cloud-api-adaptor/podvm/files/usr/local/bin/attester] Error 2

So I installed the following package:

sudo apt install -y libtdx-attest-dev

A note on kata, in case anybody else is trying this: I had the latest CCv0 from the kata repo at commit 61cbae6c39f62be1be5e5a05e11dff06d6e68630, and it seems broken, so I had to modify the code to get it working.

diff --git src/agent/src/signal.rs src/agent/src/signal.rs
index d67000b80..401ded953 100644
--- src/agent/src/signal.rs
+++ src/agent/src/signal.rs
@@ -57,7 +57,7 @@ async fn handle_sigchild(logger: Logger, sandbox: Arc<Mutex<Sandbox>>) -> Result
                 continue;
             }

-            let mut p = process.unwrap();
+            let p = process.unwrap();

             let ret: i32 = match wait_status {
                 WaitStatus::Exited(_, c) => c,
diff --git src/libs/kata-types/src/annotations/mod.rs src/libs/kata-types/src/annotations/mod.rs
index f094ddd70..89da372de 100644
--- src/libs/kata-types/src/annotations/mod.rs
+++ src/libs/kata-types/src/annotations/mod.rs
@@ -474,8 +474,8 @@ impl Annotation {
         let u32_err = io::Error::new(io::ErrorKind::InvalidData, "parse u32 error".to_string());
         let u64_err = io::Error::new(io::ErrorKind::InvalidData, "parse u64 error".to_string());
         let i32_err = io::Error::new(io::ErrorKind::InvalidData, "parse i32 error".to_string());
-        let mut hv = config.hypervisor.get_mut(hypervisor_name).unwrap();
-        let mut ag = config.agent.get_mut(agent_name).unwrap();
+        let hv = config.hypervisor.get_mut(hypervisor_name).unwrap();
+        let ag = config.agent.get_mut(agent_name).unwrap();
         for (key, value) in &self.annotations {
             if hv.security_info.is_annotation_enabled(key) {
                 match key.as_str() {

Member

@surajssd surajssd left a comment


I was able to successfully do the SKR. This is great work, @mkulke. Now it is a matter of figuring out where all the pieces fit.

mkulke and others added 6 commits July 28, 2023 12:18
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
This partially reverts commit 32ffc71.

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Co-authored-by: Suraj Deshmukh <surajd.service@gmail.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
@mkulke
Contributor Author

mkulke commented Jul 28, 2023

The pending changes in the foundation of coco/guest-components have been merged, so we don't require forks any more.

I also added a CommunityGallery Image, built some container images and added a section w/ steps to follow for testing this in the description above.

@bpradipt
Member

bpradipt commented Aug 4, 2023

@mkulke is this ready for review ?

@mkulke
Contributor Author

mkulke commented Aug 4, 2023

@mkulke is this ready for review ?

Not quite. There are some discussions around a revised kbs-client design (see this and this) that are relevant here. At worst, the proposed design collides with a coco design tenet of 2-tiered attestation.

@mkulke
Contributor Author

mkulke commented Aug 8, 2023

updated the prebuilt image in the instructions to

/CommunityGalleries/cococommunity-42d8482d-92cd-415b-b332-7648bd978eff/Images/peerpod-podvm_skr-api-draft/versions/0.7.5

@mkulke
Contributor Author

mkulke commented Oct 4, 2023

The above scenario (retrieving keys at runtime, including from a KBS deployed in k8s) should now be covered by the Confidential Data Hub architecture that has been merged into kata CCv0 and guest-components. It roughly follows this sketch:

image

There is a key difference to the proposed skr-api architecture: the cc_kbc model mandates that unprivileged pod workloads should not be able to retrieve TEE evidence and facilitate attestation exchanges themselves (in fact, they are still able to retrieve evidence via api-server-rest's /aa/evidence endpoint, but we can turn this off).

Since KBS now supports passport mode, a pod can request a token from api-server-rest via the /aa/token endpoint. This triggers evidence retrieval and remote attestation in the privileged podvm components and, if successful, yields a token back to the pod. The pod can then use the token to request the secret from a KBS exposed in k8s.

Alternatively a pod can request a secret resource directly via the api-server-rest's /cdh/resource endpoint.
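
The two api-server-rest endpoints mentioned here can be wrapped in small URL helpers. This is a sketch: the function names are hypothetical, and the base address 127.0.0.1:8006 is the one used in the examples in this thread.

```shell
# Base URL of api-server-rest as seen from inside the pod.
API_SERVER_REST=http://127.0.0.1:8006

# URL that requests an attestation token of the given type (e.g. kbs).
aa_token_url() {
  printf '%s/aa/token?token_type=%s\n' "$API_SERVER_REST" "$1"
}

# URL for a secret resource identified by its three path segments.
cdh_resource_url() {
  printf '%s/cdh/resource/%s/%s/%s\n' "$API_SERVER_REST" "$1" "$2" "$3"
}
```

From a container in the pod, `wget -qO- "$(cdh_resource_url default key one)"` would then request a resource directly, while `wget -qO- "$(aa_token_url kbs)"` would request a passport token.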

Once the remaining issues have been addressed we can close this PR:

@mkulke
Contributor Author

mkulke commented Oct 5, 2023

Ok, the remaining patches have been merged, we should be good now. To use runtime key retrieval w/ cc_kbc, you need to:

  • build a PodVM w/ the proper AA_KBC (e.g. AA_KBC=cc_kbc_az_snp_vtpm)
  • configure AA_KBC_PARAMS in the CAA config map like this: cc_kbc::http://my-kbs-host:8080
  • ensure that my-kbs-host is reachable from the PodVM's root namespace (i.e. not k8s)
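
To sanity-check the configured value, it can be split into its KBC name and KBS URL with plain parameter expansion. A minimal sketch, assuming the value always has the `<kbc>::<url>` shape shown above; the function names are hypothetical.

```shell
# Part before the first "::", e.g. cc_kbc.
kbc_name() {
  printf '%s\n' "${1%%::*}"
}

# Part after the first "::", e.g. http://my-kbs-host:8080.
kbc_url() {
  printf '%s\n' "${1#*::}"
}
```

On the podvm, the result of `kbc_url "cc_kbc::http://my-kbs-host:8080"` could then be passed to curl or wget to confirm that the KBS is reachable from the root namespace.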

Get passport token

$ kubectl exec -it deploy/busybox-caa -- wget -qO- http://127.0.0.1:8006/aa/token\?token_type\=kbs | jq .
{
  "token": "eyJhbGciOiJSUzM4NCIsInR5cCI6IkpXVCJ9.eyJldmFsdWF0aW9uLXJlcG9ydCI6IntcImFsbG93XCI6dHJ1ZX0iLCJleHAiOjE2OTY1MDA2OTcsImlzcyI6IkNvQ28tQXR0ZXN0YXRpb24tU2VydmljZSIsImp3ayI6eyJhbGciOiJSUzM4NCIsImUiOiJBUUFCIiwia3R5IjoiUlNBIiwibiI6InNrak1vNXA2WDdnWVFGcUZTZWc1cTRaMVZLaDl0VTBGeUJxNWRsU2tOVFNBR3g2RkRWaTlpZjRMdmpRdHVGSTJ2VjNCT2Etdnc5emNkblFpdk9JbVRQaXY3dEdhdVdfcFlxX3k2U2VRb3VGbW5YSWVUNUw4cDNFQmQ3ZEU5YXNQNTdJNmVFRjNNMWl4RlZ3ektUVF9WN05sSWRwTDN4bFpaNEFsTXBvWG83ZWdKNHVwYXBZMU5XTktzZWlWX1ZDbEZjQ0dvb1pVTkFxS19wVjZJSVhfbnROSGI4cU9ia2FvN1Q0aExkZmMzOFVNSEhNWjl1SVJRTmpJTkp2MjhUWG0yTkduUmp3WnczbzRsSmNlOVBXdm1EZUJMWnNrUnk5ak9Ga0MzNXIxeEdKNUR1OXhsc1dMOW1odjlEMjlfcXBPOEl1bHlIb0RoZmNFNnoxcUF5VFdZUSJ9LCJuYmYiOjE2OTY1MDAzOTcsInRjYi1zdGF0dXMiOnsiYXpzbnB2dHBtLm1lYXN1cmVtZW50IjoiVm5WZEkxVnRvZTFpdzBzRWIvVUpzSUdUK3lkK3JYM2pxTUxRS0lWL1Frek1UYVZla3FoaldnSVN1RVlNbVZOSyIsImF6c25wdnRwbS5wbGF0Zm9ybV9zbXRfZW5hYmxlZCI6IjAiLCJhenNucHZ0cG0ucGxhdGZvcm1fdHNtZV9lbmFibGVkIjoiMSIsImF6c25wdnRwbS5wb2xpY3lfYWJpX21ham9yIjoiMCIsImF6c25wdnRwbS5wb2xpY3lfYWJpX21pbm9yIjoiMzEiLCJhenNucHZ0cG0ucG9saWN5X2RlYnVnX2FsbG93ZWQiOiIwIiwiYXpzbnB2dHBtLnBvbGljeV9taWdyYXRlX21hIjoiMCIsImF6c25wdnRwbS5wb2xpY3lfc2luZ2xlX3NvY2tldCI6IjAiLCJhenNucHZ0cG0ucG9saWN5X3NtdF9hbGxvd2VkIjoiMSIsImF6c25wdnRwbS5yZXBvcnRlZF90Y2JfYm9vdGxvYWRlciI6IjMiLCJhenNucHZ0cG0ucmVwb3J0ZWRfdGNiX21pY3JvY29kZSI6IjExNSIsImF6c25wdnRwbS5yZXBvcnRlZF90Y2Jfc25wIjoiOCIsImF6c25wdnRwbS5yZXBvcnRlZF90Y2JfdGVlIjoiMCJ9LCJ0ZWUtcHVia2V5Ijp7ImFsZyI6IlJTQTFfNSIsImUiOiJBUUFCIiwia3R5IjoiUlNBIiwibiI6InpCeWwyMm5jYnFCRHR5cUV0OXpWYWFrczB2RHRkYlNHLVgydFh1UGcyQzJfSGtjdk5JMUo5OUk4UFdrR0FYNjFtRmFXd3k0OFpQQTRKVzEyUW93MXRvVUlSbFJhUVAycUxMbTYwRlVIMjBFTVhZQ2NwRDMzQWFnVXVUcTFVWTUtMmRjTkZiamhWNFVsOUhzOE9LZS1fYWg4SDR1Qm10Q2xmd0ZTMWNCZWR5QWNqNVhVMXdIaUFRY3hlNXJyaHpEYm1XZHZxSzFOQWJmXy13U085OHdfT0NxSzlYVDJuaklsTVZKRGtUcXhvY1BDVmRDaldXcXNjSkZVX0VOenBDS0MxZkFMdTM3eEtZRGpzbENEbV9vX0h2RlJpeklvXzdqZUJKN1Ridm9vRVFTWncxOGUzOGE2dmt0ZnFjTmxhZXJFeUJ3c0pyeDliWUNmc3FtS3FRRjFjUSJ9fQ.ebgD2QsnzPjsxwI2qJah7u53lgPc
hfrqIrkLiMn2IBlLBgzHQcJ3Z5jGLpM6kNTlZa5ZpKXzX-88lXGnQTCOzFAviEgDhzeFtDloHAIoT5PGIWHFQuzUXYDSXazmwKaUN3dTHmaRVU-SqEp_F4GAbh6HCob26JXgUPnlZh5BRq2q4qix7N7a9qtxfthTsSALwrjweR6CRkObU1bYgGEADGk8fx3OOpinUaZjRkiXSZctSUegUE-YIqByC6_aQmOUgKXZs1zmOoHtBtYqY-VmlXa6Leb8wmwPZPQsoTEvTXbs-ZipWPyxA2CwK6Z1PuH8CTDiFjB97rzzoHkm7Ouyhw",
  "tee_keypair": "-----BEGIN RSA PRIVATE KEY-----\nMIIEowIBAAKCAQEAzByl22ncbqBDtyqEt9zVaaks0vDtdbSG+X2tXuPg2C2/Hkcv\nNI1J99I8PWkGAX61mFaWwy48ZPA4JW12Qow1toUIRlRaQP2qLLm60FUH20EMXYCc\npD33AagUuTq1UY5+2dcNFbjhV4Ul9Hs8OKe+/ah8H4uBmtClfwFS1cBedyAcj5XU\n1wHiAQcxe5rrhzDbmWdvqK1NAbf/+wSO98w/OCqK9XT2njIlMVJDkTqxocPCVdCj\nWWqscJFU/ENzpCKC1fALu37xKYDjslCDm/o/HvFRizIo/7jeBJ7TbvooEQSZw18e\n38a6vktfqcNlaerEyBwsJrx9bYCfsqmKqQF1cQIDAQABAoIBAD6k5DqNKPxC78V9\npTIQ8ub05y7uhtLDT1GvQtCGu/FdSPTwAAru+i63NYnbe95llzJkEO1ieWK5X2IN\nUGhoQ+v6tGlxZingMKR9dFqQXlLqifMAkBLQecjmX0XiQNgBFemh2QA7t912ngmE\n8RyqTzHmzgGYfXSYaNKsA1JbMiL5CbpxArT/3K5wcs1wDMWLZFqUbWSQLeraERZs\nN/C6uF9u/a5iWaF4r8Tohn1LlzVITHlFpdeJlZIYbETrdd1IXne3LRfLXhYGNEeN\ng5Oy3IOVOZFzA5ULutpDyuVjSwblb370E2nhwA6w9o9pkOMeI6e7aExpHRjfFZiw\n7uj6WeECgYEA5ELyenxw8rUjg+CmCB9ovVAeXyffAG23XEtRlilrkBRZCa3Pqm//\nmvkdizwEEJbtsWH2h9xfUUHsXst5IEGNrnV5UTmcYpVI06BCdn7l3nss7bjpRlO4\nIGXU8/iqh0ZiwiTbA2VU5/uPNIhLvujNpNNV4+hQRZRoQJydhNrPLDUCgYEA5Ops\ndSHYaouWr8u+KFeqqxPlgrtvF3dQpRsGnLLALUrf0mmgqiMEyBsH5tDYy76a4vyH\n0qweJzuZ29PiJDuu7I7SWx2Ps+dTWLjQB1hozwgwqRNdkTChuAUveV95wCzOTpcq\nPzErq5r8p8DQQU/5BRpVG/rl+alw+ViAw5Kqs80CgYB4+fplbHq4R8SQ6olUmMD8\nRPAz4n/QTFX39ntBKKa3b/FYreP4Iu/HhOxhlOdam4NSlecBToy+FkBeZVzG+bdL\nlTs9D1mQ7inw72kKQGs4JPRE8dHA0jIuCYp523sVwvoohzwEaro7URou72WlwuDq\n0I8fAUs59VPjmp3pgcZ3WQKBgE2lzsA0iMorKyPaQlhA1F1PVGxx047sI+i9MBL6\n9wDmAuHGfn73fem6cYWzlbYWo0cXTaMCSwAX0WqlhnGv5PfMwkGx10q4zqarmbTE\nIlkHeCoBrZ1QF6rp516OKigripdR4zyoGx4MZmMonftpexhmBDSHeHalKPMLODIe\nj9SJAoGBAM5/6/xX/suVpTLu5UCIIRJqQthtSHhWaAD40AXDorRUD8bD756t4vHk\nUlx6rBWMM/ZR8fU14kXYGM0c6Khar12rJK3EcAYMSe1/m3JpjmdHhunjAEvr/Kmx\nu/hbNlbYraHR3OciwAuqGBVCd1YNlrl99gJjkVIRSTkshk0RQfB5\n-----END RSA PRIVATE KEY-----\n"
}
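
The "token" value is a JWT, so its claims can be inspected by decoding the middle segment. A small sketch with a hypothetical function name; it assumes a `base64` binary that accepts `-d`, and it only decodes the token without verifying the signature.

```shell
# Decode the claims (middle segment) of a JWT without verifying it.
# The body runs in a subshell so that "seg" does not leak to the caller.
jwt_claims() (
  # Take the second dot-separated segment and map base64url to base64.
  seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the '=' padding that JWTs strip.
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
)
```

Given the token string in a variable, `jwt_claims "$TOKEN"` prints the claims as JSON, which can be piped to jq as in the command above.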

Get secret resource

$ kubectl exec -it deploy/busybox-caa -- wget -qO- http://127.0.0.1:8006/cdh/resource/default/key/one
ohai

@mkulke mkulke closed this Oct 5, 2023
@mkulke
Contributor Author

mkulke commented Oct 5, 2023

cc @kartikjoshi21

@mkulke mkulke deleted the mkulke/skr-api branch September 17, 2024 06:55