How to configure spire-agent on cluster-A to talk to spire-server on root-spire-server #371

PeterSR · 2024-05-27T20:16:54Z

I would like a setup as follows:

I have many different Kubernetes clusters that should all use the same trust domain. I have configured spire-server (in fact the entire SPIRE stack) on a server, let's call it root-spire-server. I have also configured spire-agent (in fact the whole SPIRE stack except spire-server and spire-oidc-provider) on another Kubernetes cluster, let's call it cluster-A, and I would now like the agent to talk to the server, as depicted here.

root-spire-server happens to run in a Kubernetes cluster with one node that is dedicated to running Spire. I have an ingress for the oidc-provider that works correctly, with HTTPS certificates provisioned by cert-manager. I also have an ingress for spire-server with an HTTPS certificate by cert-manager.

I am getting this error on the agent:

time="2024-05-27T19:24:50Z" level=error msg="Agent crashed" error="create attestation client: failed to dial dns:///spire.my.domain.org:443: context deadline exceeded: connection error: desc = \"transport: authentication handshake failed: x509svid: could not get leaf SPIFFE ID: certificate contains no URI SAN\""

(trust domain domain.org, root-spire-server hostname is spire.my.domain.org - of course here using an imaginary domain, but my actual domains are pretty similar).

I think there are a couple of things lacking in my understanding:

As far as I know, the server speaks gRPC. How does that work over HTTPS? Should the ingress even terminate the TLS, or should the pod handle that with its own certificates?
Inspecting the cert-manager certificate, of course there is no URI SAN, and it seems there is not going to be one. So clearly I am doing something wrong. But using the
```
spire-server:
  ingress:
    enabled: true
```
really seems to involve some kind of cert-manager setup when looking at the helm chart logic.
Am I approaching this from the wrong angle? Should I be using the nested setup?
- But according to the documentation, the server then talks to the agent. How is that possible when the agent is not exposed? Does that mean running two agents on cluster-A: one purely for talking to root-spire-server and one for the Workload API on the cluster itself?
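
To make the error above concrete, here is a minimal sketch of the kind of check that seems to be failing. It is not SPIRE's actual code: the function names are hypothetical, and the server ID `spiffe://domain.org/spire/server` is an assumption about what the server's own certificate would carry. The idea is that the agent expects the peer certificate to hold exactly one URI SAN that is a SPIFFE ID in the trust domain, while a cert-manager certificate for an ingress hostname typically has only DNS SANs.

```python
from urllib.parse import urlparse


def leaf_spiffe_id(uri_sans, trust_domain):
    """Return the single SPIFFE ID among a certificate's URI SANs.

    Hypothetical helper for illustration; raises ValueError when the
    certificate could not be treated as an X.509-SVID for trust_domain.
    """
    if not uri_sans:
        # Same wording as the message in the agent log.
        raise ValueError("certificate contains no URI SAN")
    if len(uri_sans) > 1:
        raise ValueError("certificate contains more than one URI SAN")
    parsed = urlparse(uri_sans[0])
    if parsed.scheme != "spiffe" or not parsed.netloc:
        raise ValueError("URI SAN is not a SPIFFE ID: " + uri_sans[0])
    if parsed.netloc != trust_domain:
        raise ValueError("SPIFFE ID is outside trust domain " + trust_domain)
    return uri_sans[0]


def describe(uri_sans, trust_domain):
    """Summarize whether a certificate with these URI SANs would pass."""
    try:
        return "ok: " + leaf_spiffe_id(uri_sans, trust_domain)
    except ValueError as err:
        return "rejected: " + str(err)


# Assumed inputs: the ingress certificate has DNS SANs only, so its URI SAN
# list is empty; the server's own certificate is assumed to carry a SPIFFE ID.
ingress_cert_uri_sans = []
server_cert_uri_sans = ["spiffe://domain.org/spire/server"]

print(describe(ingress_cert_uri_sans, "domain.org"))
print(describe(server_cert_uri_sans, "domain.org"))
```

If this reading is right, terminating TLS at the ingress with the cert-manager certificate would always hit the first branch, whatever the rest of the agent configuration looks like.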

PeterSR · 2024-05-27T20:23:49Z

Note: Perhaps this is more suitable as a discussion than as an issue, but I don't see that tab.

kfox1111 · 2024-05-28T01:51:54Z

It is possible to configure multiple instances of the spire helm chart to properly support a nested configuration. But we're getting really close to releasing 0.21.0. It will contain a new chart, spire-nested, that will allow for easy deployment of nested SPIRE configurations directly. It should be out in the next few days.

There is initial documentation for it in this PR:
spiffe/spiffe.io#293

The current rendered documentation for that documentation PR can be found here:
https://deploy-preview-293--spiffe.netlify.app/docs/latest/spire-helm-charts-hardened-advanced/nested-spire/

PeterSR · 2024-05-28T18:17:06Z

Thank you for the link 😊

Will stay tuned for the update.

Can you briefly explain why it needs to be such a complex architecture? Conceptually, why does the root cluster need multiple servers and agents?

kfox1111 · 2024-05-28T19:00:08Z

Lots of little reasons. At some point you will likely want workloads running on the cluster that hosts the root-spire server to be bound to the SPIRE setup as well; the internal-spire server in the root k8s cluster handles that part. You may also want to isolate the real root CA from the internet/intranet. In this setup, the root-spire instance is only a root CA server and is only accessible from within the root k8s cluster, for maximum security. If the external-server needs to be rekeyed or scaled up, that is much easier to do in this arrangement without changing the main root server.

PeterSR · 2024-05-28T19:38:19Z

Thank you for clarifying! I think the reasons make sense. They are probably a bit too advanced for our use case: we value getting up and running quickly with something that is production ready and okay on security, not necessarily needing maximum security. But if this new helm chart delivers both better security and more convenience, then it sounds like it is worth waiting for.

PeterSR closed this as completed May 28, 2024
