CLI Option to delete SPIRE agent directory/reset existing creds #5442

faali1 · 2024-08-29T23:20:43Z

We use SPIRE agents in our k8s clusters that connect to SPIRE servers. We have multiple trust domains and, sometimes, users create a cluster and configure the wrong trust domain (by accident or because they were mistaken). To fix this, we have to perform multiple steps:

Change the SPIRE agent config to point to the correct server and bundle. This is easy enough.
If the SPIRE agent had already attested to the first SPIRE server and gotten an SVID, we need to do a node scale-down so the SPIRE agent loses its original SVID. Otherwise, we see an error that says:
x509svid: could not verify leaf certificate: x509: certificate signed by unknown authority

The current way we solve this is by scaling the node down and then back up. This resets the data for the SPIRE agent. I propose adding an option to the SPIRE agent CLI that resets the data directory (in effect resetting the SPIRE agent) so it can connect to the correct server.
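To make the request concrete, here is a rough sketch of the behavior I have in mind, written as a standalone Python helper rather than as actual SPIRE code. The function name `reset_agent_data` and the confirmation argument are hypothetical, nothing like this exists in the SPIRE agent CLI today, and the data directory path is whatever `data_dir` is set to in the agent config. It assumes the agent is stopped before the wipe and restarted afterwards so it attests again against the new server.

```python
import shutil
from pathlib import Path


def reset_agent_data(data_dir: str, confirm: str) -> list[str]:
    """Delete everything inside data_dir but keep the directory itself.

    Hypothetical helper, not part of SPIRE. `confirm` must repeat the
    resolved directory path exactly, which guards against the command
    being run by accident or against the wrong directory.
    Returns the names of the top-level entries that were removed.
    """
    root = Path(data_dir).resolve()
    if not root.is_dir():
        raise FileNotFoundError(f"not a directory: {root}")
    if confirm != str(root):
        raise ValueError("confirmation does not match data directory; refusing to delete")

    removed = []
    for entry in sorted(root.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            # Real subdirectory: remove it recursively.
            shutil.rmtree(entry)
        else:
            # Regular file or symlink: remove only the entry itself.
            entry.unlink()
        removed.append(entry.name)
    return removed
```

The directory itself is left in place because on Kubernetes it is typically a mounted volume, so only its contents can be cleared from inside the pod.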

Another benefit is that we started off by keeping our keys on disk. If/when we move to a KMS to manage our keys, our root signing key will change and we will have to deal with the above error. It's a lot nicer to ask teams to run a CLI command (while keeping their other services running) than to do a full node scale down/up on all clusters.

Version: NA
Platform: K8s
Subsystem: NA

faali1 · 2024-08-30T00:06:57Z

https://spiffe.slack.com/archives/CBNCC2V17/p1724959960812299

Some context: this is already possible for re-attestable attestors like k8s_psat using the new emptyDir config in the hardened Helm charts. However, for non-re-attestable attestors like aws_iid, this is not possible, as the spire-agent needs to be persistent.

azdagron · 2024-09-03T18:55:58Z

I agree that we need to provide at least documentation on the best way to wipe agent state in Kubernetes. If that becomes too hard, we'll consider adding a command as a last resort (we're worried about that command being invoked accidentally).

MarcosDY added the triage/in-progress Issue triage is in progress label Sep 3, 2024

azdagron added priority/backlog Issue is approved and in the backlog and removed triage/in-progress Issue triage is in progress labels Sep 3, 2024
