Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add upstream TLS trust from CM bundles #14717

Merged
merged 6 commits into from
Mar 8, 2024

Conversation

ReToCode
Copy link
Member

@ReToCode ReToCode commented Dec 7, 2023

Changes

  • Adds support for trust-bundles in Activators cert-cache

Fixes #14609

@knative-prow knative-prow bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Dec 7, 2023
@ReToCode ReToCode mentioned this pull request Dec 7, 2023
4 tasks
@knative-prow knative-prow bot added area/autoscale area/networking size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Dec 7, 2023
@knative-prow knative-prow bot requested review from kauana and krsna-m December 7, 2023 13:15
@knative-prow knative-prow bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 7, 2023
@ReToCode ReToCode changed the title [wip] add upstream TLS trust from CM bundles add upstream TLS trust from CM bundles Jan 26, 2024
@knative-prow knative-prow bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Jan 26, 2024
@knative-prow knative-prow bot added the area/test-and-release It flags unit/e2e/conformance/perf test issues for product features label Jan 26, 2024
Copy link

codecov bot commented Jan 29, 2024

Codecov Report

Attention: Patch coverage is 49.31507% with 37 lines in your changes are missing coverage. Please review.

Project coverage is 85.66%. Comparing base (51b0337) to head (d4c28a4).
Report is 107 commits behind head on main.

Files Patch % Lines
pkg/activator/certificate/cache.go 49.31% 31 Missing and 6 partials ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #14717      +/-   ##
==========================================
- Coverage   85.83%   85.66%   -0.18%     
==========================================
  Files         198      198              
  Lines       15117    15171      +54     
==========================================
+ Hits        12976    12996      +20     
- Misses       1819     1848      +29     
- Partials      322      327       +5     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ReToCode
Copy link
Member Author

PTAL
/assign @dprotaso
/assign @skonto

@ReToCode
Copy link
Member Author

/unhold

@knative-prow knative-prow bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jan 29, 2024
test/e2e-common.sh Outdated Show resolved Hide resolved
pkg/activator/certificate/cache.go Show resolved Hide resolved
Comment on lines +63 to +64
secretInformer: nsSecretInformer,
configmapInformer: nsConfigmapInformer,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Curious - I'm wondering if it's simpler to just watch (or poll) a certain directory for certs instead of using informers here.

Then in theory we wouldn't necessarily be tied to 'trust-manager` and could swap to use k8s cluster trust bundles (ref).

It's going beta in 1.30

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, we do it in Queue-Proxy already. Having multiple config-maps is not tying only to trust-manager.We also have downstream solutions with the same pattern (https://github.com/ReToCode/knative-encryption/tree/main/8-trust-sources), so we'll have multiple ConfigMaps to look at.
If we do it via filesystem, we'd need a way to configure multiple mount-configs somehow. But I'm up for looking at this again once cluster trust bundles landed (we'd also need support for this downstream).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we do it via filesystem, we'd need a way to configure multiple mount-configs somehow. But I'm up for looking at this again once cluster trust bundles landed (we'd also need support for this downstream).

I think it might be simple enough for operators to add a projected volume mount (ref) coalescing all the CA's into the same volume mount. If we use the standard SSL_CERT_DIR then I think the activator might simply need to call x509.SystemCertPool() every few minutes (https://pkg.go.dev/crypto/x509#SystemCertPool) to get updates.

One thing to verify is if projected volumes receive updates when config maps change - if they don't we should file a k8s bug.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it might be simple enough for operators to add a projected volume mount (ref) coalescing all the CA's into the same volume mount.

I think that would work. But how would a user define the volume-mounts? He'd need to customize our yamls on installation (and on every update) to add the volume mounts. Also operator would probably need some form of extra-mounts attributes for all containers. Isn't that more complicated that what we have in this PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We require operators do this for tag-to-digest resolution - https://knative.dev/docs/serving/tag-resolution/#custom-certificates

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this PR we also trust the CA in a Secret (if present), I think it's a good thing to make the "easy" setup work without additional configuration. That certificate will not be in the system pool by default, so users always would need to give Activator the CA in that case. So I'm not (yet) really convinced that is good UX for our users.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we just add a projected volume to the activator and add that ca secret as an file entry?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CA is not always populated, also what component would manage that volume in the YAML? The secret only exists when encryption is enabled and cert-manager is deployed. I'd assume the deployment would fail if no Secret exists. Even when it exists, the CA can be empty depending on the cert-manager issuer type (only CA and self-signed populate the ca.crt field). So we'd mount an empty file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did think about this a bit more and have another reason to keep this PR. This affects not just serving but also the net-* implementations. With the current solution, a user just provides the CM (ideally also using trust-manager to distribute it in multiple namespaces) and it works directly. No changes necessary to the YAML (that one has to do on every upgrade), no further changes on the operator and the KnativeServing CR.

With the mount approach, a user would need to customize the volumes of controller and net-* controller for now. We might at some point add mTLS, then that would also be needed on every KnativeService in the QP manifest. So I'm not sure this is really a good alternative as it has way worse UX.

Copy link
Member

@dprotaso dprotaso Feb 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still mixed on this - cause I'm seeing two technologies on 'how do we distribute certificates' and ideally it would be great if we didn't overfit towards one solution. Informing on secrets/certificates seems like we're overfitting for trust-manager. Projected volumes seems like it would be the most flexible since you can combine both certificates from trust-manager and ClusterTrustBundles.

ClustersTrustBundles are right around the corner (beta in K8s 1.30 - April). And they only work with projected volumes - so I think I'd attempt to future proof this design.

I think an in-tree solution (ClusterTrustBundles) will be more popular than an external one (trust-manager).

The CA is not always populated... I'd assume the deployment would fail if no Secret exists... So we'd mount an empty file.

Mounting an empty file should be fine - we can ignore empty files when we refresh the certificates.

With the mount approach, a user would need to customize the volumes of controller and net-* controller for now.

I don't think so - we could just have projected volumes as part of our controller/net-* default deployment.

In the future if a user wants to use ClusterTrustBundles they can patch this project volume to include it.

We might at some point add mTLS, then that would also be needed on every KnativeService in the QP manifest.

I don't understand concern your'e trying to surface here - we already use volumes for certificates in the queue proxy here

if cfg.Network.SystemInternalTLSEnabled() {
queueContainer.VolumeMounts = append(queueContainer.VolumeMounts, varCertVolumeMount)
extraVolumes = append(extraVolumes, certVolume(networking.ServingCertName))
}

@ReToCode
Copy link
Member Author

ReToCode commented Feb 7, 2024

@dprotaso gentle ping.

@ReToCode
Copy link
Member Author

@dprotaso gentle ping.

@ReToCode
Copy link
Member Author

ReToCode commented Mar 7, 2024

@dprotaso as discussed yesterday on the SIG, are you fine with merging this as is?
For the cluster-trust-bundles I created #14990.

@dprotaso
Copy link
Member

dprotaso commented Mar 8, 2024

/lgtm
/approve

nit: unsure if you want to handle pool regeneration on deletion.

eg. if I'm using a configmap and I'm transitioning to a secret. Then when I add the secret and then delete the config map the old CA could still be in the computed x509.Pool

@knative-prow knative-prow bot added the lgtm Indicates that a PR is ready to be merged. label Mar 8, 2024
Copy link

knative-prow bot commented Mar 8, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: dprotaso, ReToCode

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@dprotaso
Copy link
Member

dprotaso commented Mar 8, 2024

/retest

@knative-prow knative-prow bot merged commit 57dc06d into knative:main Mar 8, 2024
49 checks passed
@ReToCode
Copy link
Member Author

nit: unsure if you want to handle pool regeneration on deletion.
eg. if I'm using a configmap and I'm transitioning to a secret. Then when I add the secret and then delete the config map the old CA could still be in the computed x509.Pool

@dprotaso I'm not sure I understand this case. We do update on deletion of a CM, right? The Secret comes from Serving installation, so that one should always be there (it might be empty, but should be there).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/autoscale area/networking area/test-and-release It flags unit/e2e/conformance/perf test issues for product features lgtm Indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

implement trusting a CA bundle
3 participants