Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable caching of secrets and configmaps #513

Merged
merged 1 commit into from
Dec 20, 2022
Merged

Disable caching of secrets and configmaps #513

merged 1 commit into from
Dec 20, 2022

Conversation

mac-chaffee
Copy link
Contributor

Fixes #512 .

By default, controller-runtime gives you a cache-backed client, which will perform a LIST operation on any API type that you fetch, even if you use a GET request like we do:

if err := r.Get(ctx, namespacedName, resource); err != nil {

So unless you restrict helm-controller to a specific namespace with --watch-all-namespaces=false, then it will attempt to populate its cache by listing all secrets in the cluster, even if it only needs to access a few in specific namespaces. That requires giving helm-controller the ability to read every secret in the cluster.

This PR allows users to avoid that restriction by disabling caching of secrets. Now users can deploy HelmReleases that use valuesFrom without having to grant Flux access to more secrets than necessary.

@pjbgf
Copy link
Member

pjbgf commented Aug 12, 2022

Overall, I do see the benefits of offering as an option the capability of decreasing the RBAC requirements in which the controller runs under. As discussed here, by default, even when impersonation is in place, the controller service account still requires list access at cluster-level.

There are some considerations worth taking around the impact of this. But giving Flux users the option to decide based on their priorities and security vs performance trade-offs, does not seem like a terrible idea.

I am just not sure about the API. Potentially, users may want to disable cache for other objects in the future, so --disable-cache-for may provide more flexibility. But, I would be keen on hearing other maintainers take on this.

@mac-chaffee
Copy link
Contributor Author

@pjbgf I was thinking the same thing originally about --disable-cache-for, but then we'd have to parse strings like "core/v1/Secrets" into Kubernetes objects, which didn't seem straightforward to me (although I am a weak golang coder 😅 )

If you know of a helper function that does the job, I'd be happy to include it!

@mac-chaffee
Copy link
Contributor Author

Hello again! Just following up on this. Given the complexity of parsing API types from strings and the lack of a defined use-case for disabling caches for things other than secrets, I think we ought to stick with just --no-cache-secrets

@pjbgf
Copy link
Member

pjbgf commented Dec 13, 2022

@mac-chaffee after some thought on this and discussions with @darkowlzz, we think that ConfigMaps and Secrets could be tied in together, as this can lower the running memory requirements, on top of the RBAC requirements benefits you are looking for.

This has been done in CAPI a while ago and should work well for us too.

Therefore, can you please:

  1. Bundle both ConfigMaps and Secrets as part of this PR.
  2. Add an opt-out Feature Gate instead of a flag: CacheSecretsAndConfigMaps. Source controller is a good example for feature gates in Flux: https://github.com/fluxcd/source-controller/blob/885c9f2cba44bb8a9130ef839a1dfcf0bead2c94/internal/features/features.go#L35-L37.
  3. Rebase the PR.

@mac-chaffee
Copy link
Contributor Author

Sounds good. To clarify, should CacheSecretsAndConfigMaps be set to true by default? or false?

Copy link
Member

@pjbgf pjbgf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mac-chaffee please align the documentation to the other controllers (as per comments below).

internal/features/features.go Outdated Show resolved Hide resolved
internal/features/features.go Show resolved Hide resolved
@pjbgf
Copy link
Member

pjbgf commented Dec 14, 2022

To clarify, should CacheSecretsAndConfigMaps be set to true by default? or false?

false by default looks good, this will help us understand the impact more quickly whilst also providing a way out for users.
I am running a few tests with this version. Can you confirm the RBAC needs aligns with the expectations (as per linked issue)?

@mac-chaffee
Copy link
Contributor Author

Can you confirm the RBAC needs aligns with the expectations (as per linked issue)?

Yes I ran it locally and was unpleasantly surprised to see it connect to my production cluster, but at least I was able to confirm the RBAC was working with the flag enabled/disabled!

@mac-chaffee mac-chaffee changed the title Add ability to disable caching secrets Disable caching of secrets and configmaps Dec 14, 2022
@pjbgf
Copy link
Member

pjbgf commented Dec 14, 2022

Tested this against my setup, all works as expected. In terms of memory footprint, I don't have many secrets and configmaps in the cluster (~105 objects), but it already makes a dent. At a larger cluster this should be even more substantial:

Cache Enabled Cache Disabled
Peak 112MiB 66MiB
Valley 43MiB 17MiB

Copy link
Member

@pjbgf pjbgf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

But would be good to get other maintainer's view too.

Copy link
Contributor

@darkowlzz darkowlzz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few minor comments. Overall, it looks good to me.
I think we should add this to all the other controllers as well.
Also, we need to update the docs about this, similar to how we document such feature gates in source-controller, refer GitRepository docs.
We can add it in the helm-controller v2beta1 docs.

internal/features/features.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
main.go Outdated Show resolved Hide resolved
@darkowlzz darkowlzz requested a review from a team December 15, 2022 17:17
@souleb
Copy link
Member

souleb commented Dec 16, 2022

I really like this addition thanks @mac-chaffee. I agree that this should be added to the other controllers behind feature gate.

@hiddeco
Copy link
Member

hiddeco commented Dec 16, 2022

To sort the issue in the e2e tests, please run make fmt vet. Issue with the fuzz tests is sorted by rebasing.

@mac-chaffee
Copy link
Contributor Author

Oh wait I forgot the docs. Seems we don't have docs for the existing CLI flags, so I can add that too. Thinking the most logical place to put that would be https://github.com/fluxcd/helm-controller/blob/main/docs/spec/README.md instead of the page on helmreleases since spec/README is linked from the top-level README as "controller".

@hiddeco
Copy link
Member

hiddeco commented Dec 19, 2022

@mac-chaffee we do have flag documentation, but they're maintained in the options.md files in the directories found here: https://github.com/fluxcd/website/tree/main/content/en/flux/components

@souleb
Copy link
Member

souleb commented Dec 19, 2022

Is it really fine to have it false by default? This changes the controller behaviour so I would and I wonder what is the impact on a cluster where everything is deployed with Flux.

@mac-chaffee
Copy link
Contributor Author

@mac-chaffee we do have flag documentation, but they're maintained in the options.md files in the directories found here: https://github.com/fluxcd/website/tree/main/content/en/flux/components

Ah thanks. I removed the docs from this PR (we don't want two sources of truth for the cli flags) and will update fluxcd/website later

You can re-enabled caching of secrets by starting the
controller with the argument '--feature-gates=CacheSecretsAndConfigMaps=true'

Signed-off-by: Mac Chaffee <machaffe@renci.org>
@hiddeco
Copy link
Member

hiddeco commented Dec 19, 2022

Is it really fine to have it false by default? This changes the controller behaviour so I would and I wonder what is the impact on a cluster where everything is deployed with Flux.

Think based on the (realistic) assumption that we only need a tiny subset of Secrets in any cluster makes this a fair choice. Helm itself maintains its own (extremely expensive) cache anyway, so in terms of API request overhead this should be negligible.

Copy link
Member

@hiddeco hiddeco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We discussed this among maintainers and agreed this would be beneficial to all controllers. We would like the feature flag (CacheSecretsAndConfigMaps) to be the same across controllers (even if they do not necessarily make use of a ConfigMap at present). For which I will create an umbrella issue.

Thanks a lot @mac-chaffee 🥇

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Flux unnecessarily fetching all secrets in the cluster
5 participants