Security: Protect a ServiceAccount token within a namespace #2962

Closed
jlpettersson opened this issue Jul 17, 2020 · 14 comments
Labels
kind/feature Categorizes issue or PR as related to a new feature. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed.

Comments

@jlpettersson
Member

Feature request

Provide documentation on how to protect sensitive tokens within a CI/CD namespace

Use case

For a CI/CD-system for a larger company, a few things are important:

  • That developers do not have access to the production environment
  • That the CI/CD system does have access to the production environment
  • That developer teams are autonomous, e.g. can tweak their CI/CD Pipeline for their needs

To fulfill the points listed above, some security mechanism needs to be added on top of a Tekton installation. OpenPolicyAgent could be part of such a solution. It would be good to have a solution documented.

More details:

A CI/CD system that fulfills the requirements above may be installed as a Pipeline plus a Trigger within a namespace. A token or ServiceAccountToken with access to the production environment must then also exist within that namespace. Since the namespace contains Secrets and tokens that developers must not have access to, developers must not have access to the namespace itself, only to the Git repository that is configured to initiate the Trigger.

More specific security problem:

The token or ServiceAccountToken with production access must only be used by a "verified" deployment ClusterTask. E.g. developers must not be able to create a print-token task and run it with that ServiceAccountToken.

@jlpettersson jlpettersson added the kind/feature Categorizes issue or PR as related to a new feature. label Jul 17, 2020
@jlpettersson
Member Author

@gabemontero I created this issue to document the security issue that I described in a design doc earlier.

I don't yet know how to solve it, but this is probably a security blocker for us.

@dlorenc
Contributor

dlorenc commented Jul 18, 2020

The token or ServiceAccountToken with production access must only be used by a "verified" deployment ClusterTask. E.g. developers must not be able to create a print-token task and run it with that ServiceAccountToken.

This part is kind of hard right now, and I agree might need some thinking. The basic solution for this with RBAC would be to give developers something like this:

  • read access to a set of vetted Tasks
  • create access to TaskRuns

However, the developers can just use TaskRuns with embedded Task definitions, which would not require them to create a new Task.
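To make the gap concrete, here is a minimal sketch in Python that models the Role described above and a TaskRun that sidesteps it. The Role name `ci-developer`, the namespace `ci`, the ServiceAccount name `prod-deployer`, and the helper functions are all hypothetical; only the manifest field names (`taskSpec`, `serviceAccountName`, the RBAC `rules` layout) come from the Tekton and Kubernetes APIs. The helper is a simplified stand-in for RBAC evaluation, not the real authorizer.

```python
# Hypothetical Role: read access to Tasks, create access to TaskRuns.
developer_role = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "Role",
    "metadata": {"name": "ci-developer", "namespace": "ci"},  # assumed names
    "rules": [
        {"apiGroups": ["tekton.dev"], "resources": ["tasks"],
         "verbs": ["get", "list", "watch"]},
        {"apiGroups": ["tekton.dev"], "resources": ["taskruns"],
         "verbs": ["create"]},
    ],
}


def role_allows(role, resource, verb):
    """Simplified check: does any rule grant `verb` on `resource`?"""
    return any(
        resource in rule["resources"] and verb in rule["verbs"]
        for rule in role["rules"]
    )


# A TaskRun that embeds its own Task definition instead of referencing a
# vetted Task. Creating it only needs "create" on taskruns.
embedded_taskrun = {
    "apiVersion": "tekton.dev/v1beta1",
    "kind": "TaskRun",
    "metadata": {"generateName": "print-token-"},
    "spec": {
        "serviceAccountName": "prod-deployer",  # assumed protected account
        "taskSpec": {
            "steps": [{
                "name": "print",
                "image": "busybox",
                "script": "cat /var/run/secrets/kubernetes.io/serviceaccount/token",
            }],
        },
    },
}


def has_embedded_task(taskrun):
    """True when the TaskRun carries an inline Task definition."""
    return "taskSpec" in taskrun.get("spec", {})


can_create_task = role_allows(developer_role, "tasks", "create")
can_create_taskrun = role_allows(developer_role, "taskruns", "create")
bypasses_vetting = can_create_taskrun and has_embedded_task(embedded_taskrun)
```

In this model the developer cannot create a Task, yet the embedded TaskRun is still admitted, which is why RBAC on Tasks alone does not protect the token.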

@skaegi
Contributor

skaegi commented Jul 18, 2020

Granting a user the privilege to create a TaskRun or PipelineRun is really no different from granting the privilege to create a Job, Deployment, or StatefulSet. By that I mean that, like it or not, you've granted the ability to "indirectly" create a pod with any ServiceAccount in the namespace and from there act as that ServiceAccount. Maybe OPA is something you could use to provide selective access; however, I've made my peace with this and just separate resources by namespace.

https://kubernetes.io/docs/reference/access-authn-authz/authorization/#privilege-escalation-via-pod-creation

@vdemeester
Member

cc @font

@gabemontero
Contributor

gabemontero commented Jul 24, 2020

@gabemontero I created this issue to document the security issue that I described in a design doc earlier.

I don't yet know how to solve it, but this is probably a security blocker for us.

Cool thanks @jlpettersson .... yeah I refrained from diving into your specific scenario when I was on the schedule in the API WG since you couldn't make it that day.

As to how to solve it, an admission webhook is the only thing that comes to mind for me. Whether that admission webhook leverages OPA to enforce which SAs are used (OPA integrates into k8s via admission webhooks), or implements some other use-case-specific logic, could be considered an implementation detail.
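As a rough illustration of the decision logic such a webhook might apply, here is a sketch in Python. It only covers the review function, not the HTTPS server or the webhook registration. The ServiceAccount name `prod-deployer`, the vetted task `deploy-to-prod`, and the function name `review_taskrun` are hypothetical, and the fallback to the `default` ServiceAccount assumes an unmodified Tekton defaults configuration. The response follows the `admission.k8s.io/v1` AdmissionReview shape.

```python
# Hypothetical policy data; a real deployment would load this from config.
PROTECTED_SERVICE_ACCOUNTS = {"prod-deployer"}
VETTED_TASKS = {("ClusterTask", "deploy-to-prod")}


def review_taskrun(admission_review):
    """Build an AdmissionReview response for a TaskRun create request.

    A TaskRun that uses a protected ServiceAccount is allowed only when it
    references a vetted Task by name and carries no embedded taskSpec.
    """
    request = admission_review["request"]
    spec = request["object"].get("spec", {})
    # Assumption: an unset serviceAccountName falls back to "default".
    service_account = spec.get("serviceAccountName", "default")

    allowed, message = True, ""
    if service_account in PROTECTED_SERVICE_ACCOUNTS:
        task_ref = spec.get("taskRef")
        if "taskSpec" in spec or not task_ref:
            allowed = False
            message = "embedded Task definitions may not use " + service_account
        else:
            # Tekton treats a missing taskRef.kind as "Task".
            key = (task_ref.get("kind", "Task"), task_ref.get("name"))
            if key not in VETTED_TASKS:
                allowed = False
                message = "%s %s is not vetted for %s" % (
                    key[0], key[1], service_account)

    response = {"uid": request["uid"], "allowed": allowed}
    if not allowed:
        response["status"] = {"message": message}
    return {
        "apiVersion": "admission.k8s.io/v1",
        "kind": "AdmissionReview",
        "response": response,
    }


def make_review(spec, uid="example-uid"):
    """Helper for trying the function out with a bare TaskRun spec."""
    return {"request": {"uid": uid, "object": {"kind": "TaskRun", "spec": spec}}}


denied = review_taskrun(make_review(
    {"serviceAccountName": "prod-deployer", "taskSpec": {"steps": []}}))
permitted = review_taskrun(make_review(
    {"serviceAccountName": "prod-deployer",
     "taskRef": {"kind": "ClusterTask", "name": "deploy-to-prod"}}))
```

The same rule could be written as an OPA or Kyverno policy instead; the sketch only shows which fields the check would need to inspect. PipelineRuns would need an equivalent check, which is not covered here.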

@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 22, 2020
@gabemontero
Contributor

/remove-lifecycle-stale

I touch upon this scenario, including my prototype to implement things along these lines, in my TEP tektoncd/community#152, as well as in a recent demo that @jlpettersson and I discussed briefly in Tekton Slack.

Once I complete the TEP and we start documenting how this could be done with, say, OPA (including pointing to upstream OPA catalogs with "productized" examples, though we are not going to host "productized" policies for any of the catalogs in Tekton itself), I think that could be sufficient to close this out, or at least to propose so to @jlpettersson.

@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Dec 20, 2020
@gabemontero
Contributor

/remove-lifecycle rotten

@tekton-robot tekton-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Jan 4, 2021
@tekton-robot
Collaborator

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale with a justification.
Stale issues rot after an additional 30d of inactivity and eventually close.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle stale

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Apr 4, 2021
@gabemontero
Contributor

@jlpettersson any chance that, in your opinion, my write-up of solving this via policy engines like OPA/Gatekeeper (or, more recently, Kyverno as well, now that my feature request to add support for CRDs has landed) is a sufficient explanation for saying this item is solved?

I'll qualify that question minimally with the community stance declared in that TEP around deferring RBAC-type scenarios to third-party policy engine solutions, where adding "native" RBAC systems of this ilk was rejected.

@tekton-robot
Collaborator

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
Rotten issues close after an additional 30d of inactivity.
If this issue is safe to close now please do so with /close with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/lifecycle rotten

Send feedback to tektoncd/plumbing.

@tekton-robot tekton-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels May 5, 2021
@tekton-robot
Collaborator

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen with a justification.
Mark the issue as fresh with /remove-lifecycle rotten with a justification.
If this issue should be exempted, mark the issue as frozen with /lifecycle frozen with a justification.

/close

Send feedback to tektoncd/plumbing.

@tekton-robot
Collaborator

@tekton-robot: Closing this issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
