How to configure spire-agent on cluster-A to talk to spire-server on root-spire-server #371

Closed
PeterSR opened this issue May 27, 2024 · 5 comments

Comments

@PeterSR

PeterSR commented May 27, 2024

I would like a setup as follows:

I have many different Kubernetes clusters that should all use the same trust domain. I have configured spire-server (and in fact the entire spire stack) on a server, let's call it root-spire-server. I have also configured spire-agent (in fact the whole spire stack except spire-server and spire-oidc-provider) on another Kubernetes cluster, let's call it cluster-A, and I would now like the agent to talk to the server, as depicted here.

root-spire-server happens to run in a Kubernetes cluster with one node that is dedicated to running Spire. I have an ingress for the oidc-provider that works correctly, with HTTPS certificates provisioned by cert-manager. I also have an ingress for spire-server with an HTTPS certificate by cert-manager.

I am getting this error on the agent:

time="2024-05-27T19:24:50Z" level=error msg="Agent crashed" error="create attestation client: failed to dial dns:///spire.my.domain.org:443: context deadline exceeded: connection error: desc = \"transport: authentication handshake failed: x509svid: could not get leaf SPIFFE ID: certificate contains no URI SAN\""

(trust domain domain.org, root-spire-server hostname is spire.my.domain.org - of course here using an imaginary domain, but my actual domains are pretty similar).

I think there are a couple of things lacking in my understanding:

  • As far as I know, the server speaks gRPC. How does that work over HTTPS? Should the ingress even terminate the TLS, or should the pod handle that with its own certificates?
  • Inspecting the cert-manager certificate, of course there is no URI SAN, and it seems there is not going to be one. So clearly I am doing something wrong. But using the
    spire-server:
      ingress:
        enabled: true
    
    setting really seems to involve some kind of cert-manager setup when looking at the helm chart logic.
  • Am I approaching this from the wrong angle? Should I be using the nested setup?
    • But according to the documentation the server then talks with the agent - but how can that be possible when the agent is not exposed? Does that mean running two agents on cluster-A: one purely for talking to root-spire-server and one for the workload API on the cluster itself?
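
To make the error above concrete, here is a minimal sketch of the kind of check that appears to be failing. It is not SPIRE's code (SPIRE is written in Go); it is a Python model under stated assumptions. The certificate is represented as a dict in the layout returned by Python's `ssl.SSLSocket.getpeercert()`, the names `leaf_spiffe_id`, `is_expected_server` and `NoSpiffeID` are hypothetical, and the expected server ID `spiffe://<trust domain>/spire/server` is assumed to be what the agent looks for.

```python
from urllib.parse import urlparse


class NoSpiffeID(Exception):
    """Raised when a leaf certificate carries no usable SPIFFE ID."""


def leaf_spiffe_id(peer_cert: dict) -> str:
    """Return the SPIFFE ID found in the certificate's single URI SAN.

    `peer_cert` uses the dict layout of ssl.SSLSocket.getpeercert(),
    where subjectAltName is a tuple of (type, value) pairs.
    """
    sans = peer_cert.get("subjectAltName", ())
    uris = [value for kind, value in sans if kind == "URI"]
    if not uris:
        raise NoSpiffeID("certificate contains no URI SAN")
    if len(uris) > 1:
        raise NoSpiffeID("certificate contains more than one URI SAN")
    parsed = urlparse(uris[0])
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise NoSpiffeID(f"not a SPIFFE ID: {uris[0]}")
    return uris[0]


def is_expected_server(peer_cert: dict, trust_domain: str) -> bool:
    """Compare the leaf ID with the assumed server ID for the trust domain."""
    return leaf_spiffe_id(peer_cert) == f"spiffe://{trust_domain}/spire/server"


# What a TLS-terminating ingress with a cert-manager certificate presents:
# a DNS SAN only (values taken from the imaginary domain in this issue).
ingress_cert = {"subjectAltName": (("DNS", "spire.my.domain.org"),)}

# What spire-server itself is assumed to present: an X509-SVID with a URI SAN.
server_svid = {"subjectAltName": (("URI", "spiffe://domain.org/spire/server"),)}
```

Under these assumptions, `is_expected_server(server_svid, "domain.org")` passes, while `leaf_spiffe_id(ingress_cert)` raises `NoSpiffeID` with the same message as in the agent log. That would suggest the agent is seeing the ingress certificate rather than the server's own SVID.
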
@PeterSR
Author

PeterSR commented May 27, 2024

Note: Perhaps this is more suitable as a discussion instead of an issue, but I don't see that tab.

@kfox1111
Collaborator

It is possible to configure multiple instances of the spire helm chart to properly support a nested configuration. But we're getting really close to releasing 0.21.0. It will contain a new chart, spire-nested, that will allow for easy deployment of nested spire configurations directly. It should be out in the next few days.

There is initial documentation for it in this PR:
spiffe/spiffe.io#293

The current rendered preview of that documentation PR can be found here:
https://deploy-preview-293--spiffe.netlify.app/docs/latest/spire-helm-charts-hardened-advanced/nested-spire/

@PeterSR
Author

PeterSR commented May 28, 2024

Thank you for the link 😊

Will stay tuned for the update.

Can you briefly explain why it needs to be such a complex architecture? Conceptually, why does the root cluster need multiple servers and agents?

@kfox1111
Collaborator

Lots of little reasons. It's likely that at some point you will want workloads running on the cluster that the root-spire server is running in to also be bound to the spire setup. The internal-spire server in the root k8s cluster does that part. You also may want to isolate the real root CA from the internet/intranet. In this setup, the root-spire instance is only a root CA server and is only accessible from within the root k8s cluster, for maximum security. If the external-server needs to be rekeyed or scaled up, it's much easier to do without changing the main root server in this arrangement.

@PeterSR
Author

PeterSR commented May 28, 2024

Thank you for clarifying! I think the reasons make sense. I would probably think that these reasons are a bit too advanced for our use case - we value getting up and running quickly with something that is production ready and okay on security, not necessarily needing maximum security. But if this new helm chart delivers both better security and more convenience, then that sounds worth waiting for.

PeterSR closed this as completed May 28, 2024