CLI Option to delete SPIRE agent directory/reset existing creds #5442

Open
faali1 opened this issue Aug 29, 2024 · 2 comments
Labels
priority/backlog Issue is approved and in the backlog

Comments

@faali1

faali1 commented Aug 29, 2024

We run SPIRE agents in our k8s clusters that connect to SPIRE servers. We have multiple trust domains, and sometimes users create a cluster with the wrong trust domain (by accident or by mistake). To fix this, we have to perform multiple steps:

  • Change the SPIRE agent config to point to the correct server and bundle. This is easy enough
  • If the SPIRE agent has already attested to the first SPIRE server and obtained an SVID, we need to do a node scale-down so that the agent loses its original SVID. Otherwise, we see this error:
    x509svid: could not verify leaf certificate: x509: certificate signed by unknown authority

We currently solve this by scaling the node down and then back up, which resets the SPIRE agent's data. I propose adding an option to the SPIRE agent CLI that resets the data directory (in effect, resets the agent) so it can connect to the correct server.

Another benefit: we started off by keeping our keys on disk. If/when we move to KMS to manage our keys, our root signing key will change and we will have to deal with the error above. It's a lot nicer to ask teams to run a CLI command (while keeping their other services running) than to do a full node scale-down/up on all clusters.

  • Version: NA
  • Platform: K8s
  • Subsystem: NA
@faali1
Author

faali1 commented Aug 30, 2024

https://spiffe.slack.com/archives/CBNCC2V17/p1724959960812299

Some context: this is already possible for re-attestable attestors like k8s_psat using the new emptyDir config in the hardened Helm charts. For non-re-attestable attestors like aws_iid, however, it is not possible, because the spire-agent needs to be persistent.

@MarcosDY MarcosDY added the triage/in-progress Issue triage is in progress label Sep 3, 2024
@azdagron
Member

azdagron commented Sep 3, 2024

I agree that we need to provide at least documentation on the best way to wipe agent state in Kubernetes. If that becomes too hard, we'll consider adding a command as a last resort (we're worried about that command being invoked accidentally).

@azdagron azdagron added priority/backlog Issue is approved and in the backlog and removed triage/in-progress Issue triage is in progress labels Sep 3, 2024

3 participants