Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AWS CCM] Permission to create SA token #11368

Merged
merged 1 commit into from
May 14, 2021

Conversation

nckturner
Copy link
Contributor

@nckturner nckturner commented May 2, 2021

Permission to create servcice account tokens

  • We need the ability to create service account token because this is required by clientbuilder/controller-manager framework which we will be using in 1.21.
  • This is required for the CCM to use 1 SA per controller, which follows principle of least privilege and makes audit logs easier to understand

Note: edited to remove change from daemonset -> deployment. I think that change might still be valuable, at least as an option, but should be distinct from this change.

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 2, 2021
@nckturner
Copy link
Contributor Author

/hold

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 2, 2021
@nckturner
Copy link
Contributor Author

I'm not 100% about these yet. I was thinking it might be better to use a deployment so that replicas is independent from master node count, and a deployment of 1 seems to match the kops default of other components like KCM slightly better (although I'm not sure what the governing controller is). This means we will need to disable hostnetwork though, and I notice KCM is running with it enabled, so I'd like to understand whether there's any reason we need to run CCM the same way (getting metrics without a service?). Or, maybe it should be configurable.

Regarding the need for creating the service account token, I have a PR open to move the CCM to use the provided cloud provider libraries, and this, for some reason, requires permissions to create a service account token. We might want to make this permission optional (we shouldn't need it in all cases, like if we're using certificates for client identity, and even if we are using a service account, we aren't the token controller so.. why do we need this permission? ...). I need to look into this one a bit more but wanted to open this as a placeholder.

metadata:
name: aws-cloud-controller-manager
namespace: kube-system
labels:
k8s-app: aws-cloud-controller-manager
spec:
replicas: 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't we want some redundancy for this, like we do for controller-manager?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, maybe start with a default of 2.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can set repicas to 1 or 2 based on the amount of CP nodes here: pkg/model/components/cloudconfiguration.go
You'd just have to add instance groups to the struct. These are created in upup/pkg/fi/cloudup/populate_cluster_spec.go

@johngmyers
Copy link
Member

The concern I have with turning off host network would be whether we would get a dependency loop between this and the CNI.

@rifelpet
Copy link
Member

rifelpet commented May 2, 2021

The advantage of using a DaemonSet is that the level of redundancy for the CCM pod matches the redundancy for the control plane hosts. Users with 1 control plane host get 1 of each control plane pod.

We could also settle on using a template function to set the Deployment replicas to either 1 or 2 based on the number of control plane instance groups.

We'll also want to use topologySpreadConstraints if we do go with a Deployment.

@johngmyers
Copy link
Member

With a topologySpreadConstraints preventing two on the same node we wouldn't have to turn off host network.

@olemarkus
Copy link
Member

The concern I have with turning off host network would be whether we would get a dependency loop between this and the CNI.

I don't think this should be a problem. CNIs shouldn't be dependent on CCM. One thing that may be a concern is if it access the metadata API and IMDSv2 max-hop-limit is set to 1.

The advantage of using a DaemonSet is that the level of redundancy for the CCM pod matches the redundancy for the control plane hosts. Users with 1 control plane host get 1 of each control plane pod.

KCM etc are static pods, which is why they have hostnetworking and why we have higher redundancy than what we need for controllers with leader election.

We could also settle on using a template function to set the Deployment replicas to either 1 or 2 based on the number of control plane instance groups.

Not a huge fan of template functions here technically. But there is a number of controllers/operators where we could use this functionality.

@johngmyers
Copy link
Member

My main concerns for a cyclic dependency would be the volume controller and the route controller. If the CNI can't get nodes to run its operators to get itself running that would be a problem.

@nckturner nckturner force-pushed the aws-ccm-fixes-5-2 branch from 780f239 to 82ca7d7 Compare May 10, 2021 18:44
@k8s-ci-robot k8s-ci-robot added size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 10, 2021
@nckturner nckturner changed the title Improvements to aws ccm yaml [AWS CCM] Permission to create SA token May 10, 2021
@nckturner
Copy link
Contributor Author

I decided to remove the change from Daemonset -> Deployment from this PR, and if we decide its still desired we'll create a separate PR. The more important change is giving the permissions required for the controller's 1.21 changes, where we use the upstream libraries for controller initialization. These libraries (clientbuilder) require SA token create permissions because they create a SA token per controller. This is a good thing, because it gives each controller its own identity, which helps for audit logs, and reduces the blast radius of potential vulnerabilities present in a single controller. SA per controller is enabled/disabled with --use-service-account-credentials. See #33310 and #99291 for more context.

@johngmyers
Copy link
Member

in kubernetes/kubernetes#99291 Jordan Ligget states:

It makes sense for CCM to be able to request tokens specifically for the service accounts for the control loops in the CCM. That is possible to grant in a narrow way (grant "serviceaccounts/tokens" on just the resourceNames for the expected service accounts), but I couldn't find where the CCM role is defined.

But it seems this PR is granting broad "serviceaccounts/tokens" permissions, not just on the resourceNames for the expected service accounts.

@nckturner nckturner force-pushed the aws-ccm-fixes-5-2 branch from 82ca7d7 to 202f590 Compare May 13, 2021 20:49
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels May 13, 2021
@nckturner
Copy link
Contributor Author

Thanks @johngmyers, I've restricted it to be just create token permissions on "node-controller", "service-controller" and "route-controller" resourceNames. This is sufficient for the cloud provider (there are technically 2 node controllers but they both use the same name). [1] [2] [3] [4]

@johngmyers
Copy link
Member

@nckturner Thanks. You need to run hack/update-expected.sh and include those changes in your PR.

* We need the ability to create service account token
  because this is required by clientbuilder/controller-manager
  framework which we will be using in 1.21.
* This is required for the CCM to use 1 SA per controller, which
  follows principle of least privilege and makes audit logs easier
  to understand
* Restricts token creation to resource names "node-controller",
  "service-controller", and "route-controller".
@nckturner nckturner force-pushed the aws-ccm-fixes-5-2 branch from 202f590 to 0239dc1 Compare May 13, 2021 21:17
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 13, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: johngmyers

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 13, 2021
@nckturner
Copy link
Contributor Author

/unhold

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 14, 2021
@k8s-ci-robot k8s-ci-robot merged commit b9382b5 into kubernetes:master May 14, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.22 milestone May 14, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/addons cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants