Security Self Assessment: [STRIDE-INFODISCLOSE-1] RFE cluster addons #5491

Closed
fabriziopandini opened this issue Oct 25, 2021 · 12 comments
Labels
  • area/security: Issues or PRs related to security.
  • kind/feature: Categorizes issue or PR as related to a new feature.
  • priority/important-soon: Must be staffed and worked on either currently, or very soon, ideally in time for the next release.
  • sig/security: Categorizes an issue or PR as relevant to SIG Security.
  • triage/accepted: Indicates an issue or PR is ready to be actively worked on.

Comments

@fabriziopandini
Member

fabriziopandini commented Oct 25, 2021

User Story

As a user/operator, I would like Cluster API to provide a comprehensive solution for the Cluster addons lifecycle.

Detailed Description

This builds on #4166 and on recent discussions at the SIG level.

There is a set of addons whose lifecycle is strictly linked to the cluster lifecycle managed by CAPI, hereafter called Cluster addons.

Some of these addons should be lifecycle managed according to a combination of the following requirements, which can differ from addon to addon:

  • They inherit some configuration from the cluster configuration (e.g. service or pod CIDR)
  • They should be created during cluster creation (e.g. immediately after the API server is installed)
  • They should be upgraded before/during/after cluster (and more specifically control plane) upgrade
  • There should be support for out-of-band upgrades, e.g. for a CVE fix (not linked to the cluster lifecycle)
  • They should be deleted before/during cluster deletion
  • Possibly more...
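
As a rough illustration only, the per-addon combinations listed above could be modelled as a small policy object that is queried at each cluster lifecycle event. This is a sketch, not part of any Cluster API type: `Phase`, `AddonPolicy`, `actions_for`, and the two example policies are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional, Tuple


class Phase(Enum):
    """Hypothetical cluster lifecycle points an addon action can be tied to."""
    AFTER_API_SERVER_AVAILABLE = "after-api-server-available"
    BEFORE_CONTROL_PLANE_UPGRADE = "before-control-plane-upgrade"
    AFTER_CONTROL_PLANE_UPGRADE = "after-control-plane-upgrade"
    BEFORE_CLUSTER_DELETE = "before-cluster-delete"


@dataclass(frozen=True)
class AddonPolicy:
    """Hypothetical per-addon lifecycle requirements."""
    name: str
    install_at: Phase
    upgrade_at: Optional[Phase] = None       # None: not tied to cluster upgrade
    delete_at: Optional[Phase] = None        # None: no explicit cleanup step
    allow_out_of_band_upgrade: bool = False  # e.g. for a CVE fix
    inherited_settings: Tuple[str, ...] = () # values copied from the cluster


def actions_for(phase: Phase, policies: List[AddonPolicy]) -> List[Tuple[str, str]]:
    """Return (addon name, action) pairs that are due at the given phase."""
    actions = []
    for policy in policies:
        if policy.install_at is phase:
            actions.append((policy.name, "install"))
        if policy.upgrade_at is phase:
            actions.append((policy.name, "upgrade"))
        if policy.delete_at is phase:
            actions.append((policy.name, "delete"))
    return actions


# Illustrative values only; real addons may need different combinations.
POLICIES = [
    AddonPolicy(
        name="cni",
        install_at=Phase.AFTER_API_SERVER_AVAILABLE,
        upgrade_at=Phase.AFTER_CONTROL_PLANE_UPGRADE,
        inherited_settings=("pod_cidr", "service_cidr"),
    ),
    AddonPolicy(
        name="csi",
        install_at=Phase.AFTER_API_SERVER_AVAILABLE,
        upgrade_at=Phase.BEFORE_CONTROL_PLANE_UPGRADE,
        delete_at=Phase.BEFORE_CLUSTER_DELETE,
        allow_out_of_band_upgrade=True,
    ),
]
```

The point of the sketch is that the requirements vary per addon, so any solution probably needs a per-addon declaration rather than one global ordering.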

The current answer for this problem space is ClusterResourceSet, but it covers only some of the requirements above.
It is falling short now that the number and the needs of Cluster addons are growing as CSI/CPI plugins move out of tree.

On top of that, most users have their own solution for addon management, and we should consider if/how to integrate with those solutions too.

IMPORTANT: we are not seeking to reinvent an addon management solution with this issue; instead, we should focus on finding a way to lifecycle manage addons within the context of the CAPI Cluster lifecycle.

/kind feature

@k8s-ci-robot added the kind/feature label Oct 25, 2021
@vincepri
Member

/milestone v1.1
/priority important-soon

Bucketing in v1.1, although this probably needs more time for a proposal, etc.

@randomvariable
Member

Adding a point here that relates to #3176. In EKS, GKE, AKS, VMware TKG-I (formerly Enterprise PKS), and VMware vSphere with Tanzu (formerly TKG-S), workload clusters do not require access to the cloud provider APIs. In all of these except TKG-S, cloud provider integrations and CSI are provisioned externally to the cluster and exist as a black box as far as the workload cluster is concerned. This prevents users of a workload cluster from stealing cloud provider credentials to cause damage to the underlying infrastructure, including elevation of privilege (workload --> management cluster) and arbitrary access to VMs and block storage.

We may want to investigate if certain addons can be run on the management cluster, or whether this should be a feature of CAPN, where CAPI can optionally generate workload cluster kubeconfigs for addons running on the mgmt cluster.

This is noted in the security self-assessment STRIDE-INFODISCLOSE-1.

@randomvariable
Member

/area security

@joejulian
Contributor

On top of that, most users have their own solution for addon management, and we should consider if/how to integrate with those solutions too.

We have a lot of experience in this space. Integrating with customer addon-tools is problematic as customers can configure those tools however they want and can easily conflict with the needs of an installer tool. We've encountered these problems in the wild with both flux and argocd. Limiting the installer to a specific namespace isn't sufficient as the user may configure their addon management tool to be cluster-wide. Additionally, who manages the CRDs? What about CRD APIVersion skew?

These are a couple of the problems that should be considered when analyzing possible solutions.
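
To make the CRD APIVersion skew concern concrete, here is a minimal sketch of the kind of check an installer might run before handing an addon to a user-managed tool. It assumes the installer already knows which API versions the addon's manifests use and which versions the cluster serves (in a real cluster the latter would come from the entries in each CRD's `spec.versions` that have `served: true`). The function name and the `example.com` CRD names are hypothetical.

```python
from typing import Dict, Iterable, List


def missing_crd_versions(
    required: Dict[str, Iterable[str]],
    served: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return, per CRD, the API versions an addon needs that the cluster does not serve.

    required: CRD name -> API versions used by the addon's manifests.
    served:   CRD name -> API versions currently served by the cluster.
    A CRD that is absent from `served` is treated as serving no versions.
    """
    skew = {}
    for crd, versions in required.items():
        missing = set(versions) - set(served.get(crd, ()))
        if missing:
            skew[crd] = sorted(missing)
    return skew


# Hypothetical example: the addon still uses v1beta1, but whoever owns the
# CRD in the cluster only serves v1, and a second CRD is not installed at all.
addon_needs = {
    "widgets.example.com": ["v1beta1", "v1"],
    "gadgets.example.com": ["v1alpha1"],
}
cluster_serves = {
    "widgets.example.com": ["v1"],
}
skew_report = missing_crd_versions(addon_needs, cluster_serves)
```

A check like this only detects skew; it does not answer who should own the CRDs, which is the harder question raised above.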

@fabriziopandini
Member Author

Started the Cluster API Addon Orchestration proposal to brainstorm a bit.

@Jont828
Contributor

Jont828 commented May 2, 2022

Just to add on, Jack and I have been working on this repo as a prototype based on Fabrizio's addon orchestration proposal.

@PushkarJ
Member

/retitle Security Self Assessment: [STRIDE-INFODISCLOSE-1] RFE cluster addons

@k8s-ci-robot changed the title from "RFE cluster addons" to "Security Self Assessment: [STRIDE-INFODISCLOSE-1] RFE cluster addons" May 13, 2022
@PushkarJ
Member

/sig security

@k8s-ci-robot added the sig/security label May 13, 2022
@fabriziopandini added the triage/accepted label Jul 29, 2022
@fabriziopandini removed this from the v1.2 milestone Jul 29, 2022
@fabriziopandini removed the triage/accepted label Jul 29, 2022
@fabriziopandini
Member Author

/triage accepted
This is being addressed by #6905; in addition, companies using Cluster API have their own solutions for addon management.

@PushkarJ PTAL at the linked proposal

@k8s-ci-robot added the triage/accepted label Oct 3, 2022
@PushkarJ
Member

@fabriziopandini Unfortunately, I do not have bandwidth anymore to go in depth into the linked CAEP but the overall proposal seems to do what is intended in this issue description!

@fabriziopandini
Member Author

thanks for the feedback
/close

We can open a follow-up issue later if more work is required from a security standpoint.

@k8s-ci-robot
Contributor

@fabriziopandini: Closing this issue.

