add upstream TLS trust from CM bundles #14717

Merged
merged 6 commits into from
Mar 8, 2024
118 changes: 97 additions & 21 deletions pkg/activator/certificate/cache.go
@@ -26,21 +26,27 @@ import (

"go.uber.org/zap"
corev1 "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/selection"
v1 "k8s.io/client-go/informers/core/v1"
"k8s.io/client-go/tools/cache"
"knative.dev/networking/pkg/apis/networking"
"knative.dev/pkg/reconciler"

"knative.dev/networking/pkg/certificates"
netcfg "knative.dev/networking/pkg/config"
"knative.dev/pkg/controller"
secretinformer "knative.dev/pkg/injection/clients/namespacedkube/informers/core/v1/secret"
nsconfigmapinformer "knative.dev/pkg/injection/clients/namespacedkube/informers/core/v1/configmap"
nssecretinformer "knative.dev/pkg/injection/clients/namespacedkube/informers/core/v1/secret"
"knative.dev/pkg/logging"
"knative.dev/pkg/system"
)

// CertCache caches certificates and CA pool.
type CertCache struct {
secretInformer v1.SecretInformer
logger *zap.SugaredLogger
secretInformer v1.SecretInformer
configmapInformer v1.ConfigMapInformer
logger *zap.SugaredLogger

certificate *tls.Certificate
TLSConf tls.Config
@@ -50,68 +56,138 @@ type CertCache struct {

// NewCertCache creates and starts the certificate cache that watches Activators certificate.
func NewCertCache(ctx context.Context) (*CertCache, error) {
secretInformer := secretinformer.Get(ctx)
nsSecretInformer := nssecretinformer.Get(ctx)
nsConfigmapInformer := nsconfigmapinformer.Get(ctx)

cr := &CertCache{
secretInformer: secretInformer,
logger: logging.FromContext(ctx),
secretInformer: nsSecretInformer,
configmapInformer: nsConfigmapInformer,
Comment on lines +63 to +64
Member

Curious - I'm wondering if it's simpler to just watch (or poll) a certain directory for certs instead of using informers here.

Then in theory we wouldn't necessarily be tied to `trust-manager` and could swap to use k8s cluster trust bundles (ref).

It's going beta in 1.30

Member Author

We could; we already do it in Queue-Proxy. Having multiple ConfigMaps does not tie us only to trust-manager. We also have downstream solutions with the same pattern (https://github.com/ReToCode/knative-encryption/tree/main/8-trust-sources), so we'll have multiple ConfigMaps to look at.
If we do it via the filesystem, we'd need a way to configure multiple mount configs somehow. But I'm up for looking at this again once cluster trust bundles have landed (we'd also need support for this downstream).

Member

> If we do it via filesystem, we'd need a way to configure multiple mount-configs somehow. But I'm up for looking at this again once cluster trust bundles landed (we'd also need support for this downstream).

I think it might be simple enough for operators to add a projected volume mount (ref) coalescing all the CAs into the same volume mount. If we use the standard SSL_CERT_DIR then I think the activator might simply need to call x509.SystemCertPool() every few minutes (https://pkg.go.dev/crypto/x509#SystemCertPool) to get updates.

One thing to verify is if projected volumes receive updates when config maps change - if they don't we should file a k8s bug.

Member Author

> I think it might be simple enough for operators to add a projected volume mount (ref) coalescing all the CA's into the same volume mount.

I think that would work. But how would a user define the volume mounts? They'd need to customize our YAMLs on installation (and on every update) to add the volume mounts. The operator would also probably need some form of extra-mounts attribute for all containers. Isn't that more complicated than what we have in this PR?

Member

We require operators do this for tag-to-digest resolution - https://knative.dev/docs/serving/tag-resolution/#custom-certificates

Member Author

In this PR we also trust the CA in a Secret (if present). I think it's a good thing to make the "easy" setup work without additional configuration. That certificate will not be in the system pool by default, so users would always need to give Activator the CA in that case. So I'm not (yet) really convinced that this is good UX for our users.

Member

Shouldn't we just add a projected volume to the activator and add that CA secret as a file entry?

Member Author

The CA is not always populated, also what component would manage that volume in the YAML? The secret only exists when encryption is enabled and cert-manager is deployed. I'd assume the deployment would fail if no Secret exists. Even when it exists, the CA can be empty depending on the cert-manager issuer type (only CA and self-signed populate the ca.crt field). So we'd mount an empty file.

Member Author

I did think about this a bit more and have another reason to keep this PR. This affects not just serving but also the net-* implementations. With the current solution, a user just provides the CM (ideally also using trust-manager to distribute it in multiple namespaces) and it works directly. No changes necessary to the YAML (that one has to do on every upgrade), no further changes on the operator and the KnativeServing CR.

With the mount approach, a user would need to customize the volumes of controller and net-* controller for now. We might at some point add mTLS, then that would also be needed on every KnativeService in the QP manifest. So I'm not sure this is really a good alternative as it has way worse UX.

Member

@dprotaso Feb 27, 2024

I'm still mixed on this, because I'm seeing two technologies for 'how do we distribute certificates' and ideally we wouldn't overfit towards one solution. Informing on secrets/certificates seems like we're overfitting for trust-manager. Projected volumes seem like they would be the most flexible, since you can combine certificates from both trust-manager and ClusterTrustBundles.

ClusterTrustBundles are right around the corner (beta in K8s 1.30 - April). And they only work with projected volumes - so I think I'd attempt to future-proof this design.

I think an in-tree solution (ClusterTrustBundles) will be more popular than an external one (trust-manager).

> The CA is not always populated... I'd assume the deployment would fail if no Secret exists... So we'd mount an empty file.

Mounting an empty file should be fine - we can ignore empty files when we refresh the certificates.

> With the mount approach, a user would need to customize the volumes of controller and net-* controller for now.

I don't think so - we could just have projected volumes as part of our controller/net-* default deployment.

In the future if a user wants to use ClusterTrustBundles they can patch this projected volume to include it.

> We might at some point add mTLS, then that would also be needed on every KnativeService in the QP manifest.

I don't understand the concern you're trying to surface here - we already use volumes for certificates in the queue proxy here

if cfg.Network.SystemInternalTLSEnabled() {
queueContainer.VolumeMounts = append(queueContainer.VolumeMounts, varCertVolumeMount)
extraVolumes = append(extraVolumes, certVolume(networking.ServingCertName))
}

logger: logging.FromContext(ctx),
}

secret, err := cr.secretInformer.Lister().Secrets(system.Namespace()).Get(netcfg.ServingRoutingCertName)
if err != nil {
return nil, fmt.Errorf("failed to get activator certificate, secret %s/%s was not found: %w. Enabling system-internal-tls requires the secret to be present and populated with a valid certificate and CA",
return nil, fmt.Errorf("failed to get activator certificate, secret %s/%s was not found: %w. Enabling system-internal-tls requires the secret to be present and populated with a valid certificate",
system.Namespace(), netcfg.ServingRoutingCertName, err)
}

cr.updateCache(secret)
cr.updateCertificate(secret)
cr.updateTrustPool()

secretInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
nsSecretInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
FilterFunc: controller.FilterWithNameAndNamespace(system.Namespace(), netcfg.ServingRoutingCertName),
Handler: cache.ResourceEventHandlerFuncs{
UpdateFunc: cr.handleCertificateUpdate,
AddFunc: cr.handleCertificateAdd,
},
})

nsConfigmapInformer.Informer().AddEventHandler(cache.FilteringResourceEventHandler{
FilterFunc: reconciler.ChainFilterFuncs(
reconciler.LabelExistsFilterFunc(networking.TrustBundleLabelKey),
),
Handler: controller.HandleAll(func(obj interface{}) {
cr.updateTrustPool()
}),
})

return cr, nil
}

func (cr *CertCache) handleCertificateAdd(added interface{}) {
if secret, ok := added.(*corev1.Secret); ok {
cr.updateCache(secret)
cr.updateCertificate(secret)
cr.updateTrustPool()
}
}

func (cr *CertCache) updateCache(secret *corev1.Secret) {
func (cr *CertCache) handleCertificateUpdate(_, new interface{}) {
cr.handleCertificateAdd(new)
cr.updateTrustPool()
}

func (cr *CertCache) updateCertificate(secret *corev1.Secret) {
cr.certificatesMux.Lock()
defer cr.certificatesMux.Unlock()

cert, err := tls.X509KeyPair(secret.Data[certificates.CertName], secret.Data[certificates.PrivateKeyName])
if err != nil {
cr.logger.Warnw("failed to parse secret", zap.Error(err))
cr.logger.Warnf("failed to parse certificate in secret %s/%s: %v", secret.Namespace, secret.Name, zap.Error(err))
return
}
cr.certificate = &cert
}

// CA can optionally be in `ca.crt` in the `routing-serving-certs` secret
// and/or configured using a trust-bundle via ConfigMap that has the defined label `knative-ca-trust-bundle`.
func (cr *CertCache) updateTrustPool() {
pool := x509.NewCertPool()
block, _ := pem.Decode(secret.Data[certificates.CaCertName])
ca, err := x509.ParseCertificate(block.Bytes)
if err != nil {
cr.logger.Warnw("failed to parse CA", zap.Error(err))
return
}
pool.AddCert(ca)

cr.addSecretCAIfPresent(pool)
cr.addTrustBundles(pool)

// Use the trust pool in upstream TLS context
cr.certificatesMux.Lock()
defer cr.certificatesMux.Unlock()

cr.TLSConf.RootCAs = pool
cr.TLSConf.ServerName = certificates.LegacyFakeDnsName
cr.TLSConf.MinVersion = tls.VersionTLS13
}

func (cr *CertCache) handleCertificateUpdate(_, new interface{}) {
cr.handleCertificateAdd(new)
func (cr *CertCache) addSecretCAIfPresent(pool *x509.CertPool) {
secret, err := cr.secretInformer.Lister().Secrets(system.Namespace()).Get(netcfg.ServingRoutingCertName)
if err != nil {
cr.logger.Warnf("Failed to get secret %s/%s: %v", system.Namespace(), netcfg.ServingRoutingCertName, zap.Error(err))
return
}
if len(secret.Data[certificates.CaCertName]) > 0 {
block, _ := pem.Decode(secret.Data[certificates.CaCertName])
ca, err := x509.ParseCertificate(block.Bytes)
if err != nil {
cr.logger.Warnf("CA from Secret %s/%s[%s] is invalid and will be ignored: %v",
system.Namespace(), netcfg.ServingRoutingCertName, certificates.CaCertName, err)
} else {
pool.AddCert(ca)
}
}
}

func (cr *CertCache) addTrustBundles(pool *x509.CertPool) {
selector, err := getLabelSelector(networking.TrustBundleLabelKey)
if err != nil {
cr.logger.Error("Failed to get label selector", zap.Error(err))
return
}
cms, err := cr.configmapInformer.Lister().ConfigMaps(system.Namespace()).List(selector)
if err != nil {
cr.logger.Warnf("Failed to get ConfigMaps %s/%s with label %s: %v", system.Namespace(),
netcfg.ServingRoutingCertName, networking.TrustBundleLabelKey, zap.Error(err))
return
}

for _, cm := range cms {
for _, bundle := range cm.Data {
ok := pool.AppendCertsFromPEM([]byte(bundle))
if !ok {
cr.logger.Warnf("Failed to add CA bundle from ConfigMaps %s/%s as it contains invalid certificates. Bundle: %s", system.Namespace(),
cm.Name, bundle)
}
}
}
}

// GetCertificate returns the cached certificates.
func (cr *CertCache) GetCertificate(_ *tls.ClientHelloInfo) (*tls.Certificate, error) {
return cr.certificate, nil
}

func getLabelSelector(label string) (labels.Selector, error) {
selector := labels.NewSelector()
req, err := labels.NewRequirement(label, selection.Exists, make([]string, 0))
if err != nil {
return nil, err
}
selector = selector.Add(*req)
return selector, nil
}