podvm: remove cdh,attestation-agent units #1499
Conversation
As we build kata with SEALED_SECRET=yes by default, kata-agent will attempt to spawn attestation-agent, cdh and api-server-rest itself. We'd end up with duplicate processes and contention over the sockets they need to create. We can (and need to) keep api-server-rest as a systemd unit, since it has to run in the podns network namespace, and since it exposes a tcp socket there is no contention.

Fixes #1495

Signed-off-by: Magnus Kulke <magnuskulke@microsoft.com>
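A minimal sketch of how one could verify on a podvm that only api-server-rest remains as a service after this change; the unit names below are assumptions based on the description above, not taken from the podvm image:

# assumed unit names; adjust to the actual unit files shipped in the podvm image
systemctl list-units --type=service | grep -E 'attestation-agent|confidential-data-hub|api-server-rest'
# check which process holds the unix sockets under /run/confidential-containers
ss -xlnp | grep confidential-containers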
LGTM. Thanks
/lgtm
For context, this is what the processes look like on a podvm:
The pod can then retrieve a passport token via remote attestation:
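A minimal sketch of such a request, assuming api-server-rest listens on 127.0.0.1:8006 inside the pod (the endpoint used further down in this thread):

# run from inside the pod; api-server-rest forwards the request to the attestation-agent
curl "http://127.0.0.1:8006/aa/token?token_type=kbs"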
With an image that was built after the changes were merged to main, I don't see the CDH or AA-rest running as services. Here is the process tree for kata-agent:

root 921 0.1 0.8 90524 64216 ? Ssl 23:32 0:02 /usr/local/bin/kata-agent --config /etc/agent-config.toml
root 1024 0.0 0.1 153912 8600 ? Sl 23:32 0:00 \_ /usr/local/bin/attestation-agent --keyprovider_sock unix:///run/confidential-containers/attestation-agent/keyprovider.sock --getresource_sock unix:///run/confidential-containers/attestation-agent/getresource.sock --attestation_sock unix:///run/c
65535 1160 0.0 0.0 996 4 ? S 23:33 0:00 \_ /pause
root 1214 0.0 0.0 11380 7540 ? S 23:34 0:00 \_ nginx: master process nginx -g daemon off;
systemd+ 1241 0.0 0.0 11844 2808 ? S 23:34 0:00 | \_ nginx: worker process
systemd+ 1242 0.0 0.0 11844 2808 ? S 23:34 0:00 | \_ nginx: worker process
root 1266 0.0 0.0 4188 3456 pts/0 Ss+ 23:35 0:00 \_ bash

You can find the image here: Also I have added the kustomization file generated as follows, notice that

cat <<EOF >install/overlays/azure/kustomization.yaml
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
bases:
- ../../yamls
images:
- name: cloud-api-adaptor
newName: "${registry}/cloud-api-adaptor"
newTag: latest
generatorOptions:
disableNameSuffixHash: true
configMapGenerator:
- name: peer-pods-cm
namespace: confidential-containers-system
literals:
- CLOUD_PROVIDER="azure"
- AZURE_SUBSCRIPTION_ID="${AZURE_SUBSCRIPTION_ID}"
- AZURE_REGION="${AZURE_REGION}"
- AZURE_INSTANCE_SIZE="Standard_DC2as_v5"
- AZURE_RESOURCE_GROUP="${AZURE_RESOURCE_GROUP}"
- AZURE_SUBNET_ID="${AZURE_SUBNET_ID}"
- AZURE_IMAGE_ID="${AZURE_IMAGE_ID}"
- AA_KBC_PARAMS="cc_kbc::http://10.0.211.55:8080"
secretGenerator:
- name: peer-pods-secret
namespace: confidential-containers-system
literals: []
- name: ssh-key-secret
namespace: confidential-containers-system
files:
- id_rsa.pub
patchesStrategicMerge:
- workload-identity.yaml
EOF
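A minimal sketch of how such an overlay would then be applied, assuming the install/overlays/azure path used above:

# build and apply the kustomization that includes the peer-pods-cm and the secrets
kubectl apply -k install/overlays/azure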
I have a suspicion that this is due to the kata-agent binary being cached. In versions.yaml we have set
Apparently that worked. I rebuilt an image on this repo. Using the resulting
Thanks @mkulke, this worked for me. But now on to the next roadblock. I see that some parsing fails with the following error:

# curl http://127.0.0.1:8006/aa/token\?token_type\=kbs
rpc status: Status { code: INTERNAL, message: "[ERROR:attestation-agent] AA-KBC get token failed: RCAR handshake failed: KBS attest unauthorized, Error Info: ErrorInformation { error_type: \"https://github.com/confidential-containers/kbs/errors/AttestationFailed\", detail: \"Attestation failed: Verifier evaluate failed: json parse error\\n\\nCaused by:\\n trailing characters at line 1 column 420\" }", details: [], special_fields: SpecialFields { unknown_fields: UnknownFields { fields: None }, cached_size: CachedSize { size: 0 } } }

Things look fine on the KBS side. Could it be because the AA did not pick up the latest change? I will try to build the image on my own and then verify this again.
No, this is a KBS error (Attestation Service actually; I assume you use the as-builtin option). Verification was broken due to changes in the HCL report; those were fixed in the az-snp-vtpm crate. You need to rebuild/redeploy KBS and make sure Cargo.lock is recreated, so it won't reuse an older revision of AS#main.
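A minimal sketch of forcing that re-resolution, assuming KBS is built from a local checkout of the repo:

# delete the lock file so cargo re-resolves the attestation-service git dependency at build time
rm -f Cargo.lock
cargo build --release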
Gotcha, found your PR: confidential-containers/trustee#165
Thanks Magnus, it all worked!