Bug 1729242: OCP 4 AWS privilege escalation vulnerability by running pods on masters #524

ravisantoshgudimetla · 2019-07-12T18:35:34Z

Make the worker label the defaultNodeSelector so that pods are not scheduled onto master nodes.

openshift-ci-robot · 2019-07-12T18:35:39Z

@ravisantoshgudimetla: This pull request references a valid Bugzilla bug. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

bindata/v4.1.0/kube-apiserver/defaultconfig.yaml

openshift-ci-robot · 2019-07-12T20:32:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ravisantoshgudimetla

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/operator/configobservation/scheduler/OWNERS~~ [ravisantoshgudimetla]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

enj · 2019-07-12T20:44:56Z

pkg/operator/configobservation/scheduler/observe_scheduler.go

@@ -36,7 +39,12 @@ func ObserveDefaultNodeSelector(genericListers configobserver.Listers, recorder
 	if err != nil {
 		return prevObservedConfig, errs
 	}
-
+	if schedulerConfig.Spec == (configv1.SchedulerSpec{}) {


I prefer the defaulting pattern used in:

cluster-kube-apiserver-operator/pkg/operator/configobservation/auth/auth_metadata.go

Line 52 in 3568cbf

authConfig := defaultAuthConfig(authConfigNoDefaults)
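As a rough illustration of the defaulting pattern suggested above, the sketch below fills in unset fields on a copy of the observed spec instead of comparing the whole struct against its zero value. The `schedulerSpec` type and the `defaultSchedulerSpec` helper are hypothetical stand-ins, not the operator's real types, and it is an assumption that the referenced `defaultAuthConfig` helper behaves this way.

```go
package main

import "fmt"

// schedulerSpec is a hypothetical stand-in for configv1.SchedulerSpec;
// only the field this sketch needs is modeled.
type schedulerSpec struct {
	DefaultNodeSelector string
}

// workerNodeSelector mirrors the constant added in this PR's diff.
const workerNodeSelector = "node-role.kubernetes.io/worker=\"\""

// defaultSchedulerSpec returns a copy of the observed spec with unset
// fields filled in. The caller's value is left untouched because the
// struct is passed and returned by value.
func defaultSchedulerSpec(noDefaults schedulerSpec) schedulerSpec {
	out := noDefaults
	if out.DefaultNodeSelector == "" {
		out.DefaultNodeSelector = workerNodeSelector
	}
	return out
}

func main() {
	// An empty spec picks up the worker selector.
	fmt.Println(defaultSchedulerSpec(schedulerSpec{}).DefaultNodeSelector)
	// An explicit selector is preserved as-is.
	explicit := schedulerSpec{DefaultNodeSelector: "type=compute"}
	fmt.Println(defaultSchedulerSpec(explicit).DefaultNodeSelector)
}
```

Defaulting per field keeps working if more fields are later added to the spec, whereas a whole-struct zero-value comparison would stop applying the default as soon as any other field is set.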

enj · 2019-07-12T20:47:17Z

/hold

Per David's request to continue review on Monday.

ravisantoshgudimetla · 2019-07-13T00:09:51Z

/test e2e-aws

openshift-ci-robot · 2019-07-13T06:20:38Z

@ravisantoshgudimetla: The following tests failed, say /retest to rerun them all:

| Test name | Commit | Details | Rerun command |
| --- | --- | --- | --- |
| ci/prow/e2e-aws-operator | `dd61ee9` | link | `/test e2e-aws-operator` |
| ci/prow/e2e-aws | `dd61ee9` | link | `/test e2e-aws` |
| ci/prow/e2e-aws-upgrade | `dd61ee9` | link | `/test e2e-aws-upgrade` |

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

smarterclayton · 2019-07-14T17:56:34Z

pkg/operator/configobservation/scheduler/observe_scheduler.go

@@ -10,6 +11,8 @@ import (
 	"k8s.io/klog"
 )

+const workerNodeSelector = "node-role.kubernetes.io/worker=\"\""


This is not guaranteed to exist in all clusters.

Node roles are not an API, and no business logic should be based on node roles.

ravisantoshgudimetla · 2019-07-15

IIUC, there are two ways to distinguish master nodes from worker nodes:

- role label
- master taints

So are you suggesting that we cannot rely on the role label, which the defaultNodeSelector is based on? IIRC, we used to have something like type=Compute as the defaultNodeSelector in 3.11.

This is not to say that we cannot have validation against pods that have master tolerations.

jkupferer · 2019-08-07

@smarterclayton Can you explain what you mean by "Node roles are not an API, and no business logic should be based on node roles."? It has been common practice with OpenShift in the field to add role labels to nodes and to separate workloads based on these labels by using node selectors and the openshift.io/node-selector annotation. This has been a very effective design pattern. What is inappropriate about it? Should we encourage use of a different label?

deads2k · 2019-07-15T12:00:29Z

I'm a -1 on resolving the issue this way. The mechanism for steering workloads away from masters is taints and restricting tolerations. I thought this was what PodTolerationRestrictions did, but that plugin is very confused. Mapping tolerations to RBAC permissions for service accounts seems more practical.

Doing it like this will restrict our future choices of node topologies.
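To make the taint-based mechanism concrete, here is a minimal sketch of how a toleration is matched against a master taint. The `taint` and `toleration` types are hypothetical stand-ins for the Kubernetes core types, the matching rules are a simplified reading of the upstream semantics, and the `node-role.kubernetes.io/master` key with the `NoSchedule` effect is assumed to be the taint carried by masters.

```go
package main

import "fmt"

// taint and toleration are hypothetical stand-ins for the Kubernetes
// core types; only the fields needed for matching are modeled.
type taint struct {
	Key, Value, Effect string
}

type toleration struct {
	Key, Operator, Value, Effect string
}

// tolerates reports whether a single toleration matches a taint.
// An empty effect or key on the toleration acts as a wildcard,
// "Exists" ignores the value, and "Equal" (the default when the
// operator is empty) compares values.
func tolerates(tol toleration, t taint) bool {
	if tol.Effect != "" && tol.Effect != t.Effect {
		return false
	}
	if tol.Key != "" && tol.Key != t.Key {
		return false
	}
	switch tol.Operator {
	case "Exists":
		return true
	case "", "Equal":
		return tol.Value == t.Value
	default:
		return false
	}
}

// toleratesMaster reports whether any toleration admits the assumed
// master taint.
func toleratesMaster(tols []toleration) bool {
	master := taint{Key: "node-role.kubernetes.io/master", Effect: "NoSchedule"}
	for _, tol := range tols {
		if tolerates(tol, master) {
			return true
		}
	}
	return false
}

func main() {
	unrelated := []toleration{{Key: "example.com/gpu", Operator: "Exists"}}
	wildcard := []toleration{{Operator: "Exists"}}
	fmt.Println(toleratesMaster(unrelated), toleratesMaster(wildcard))
}
```

A check of this shape is what a restriction on tolerations would have to enforce: a wildcard toleration admits the master taint just as an explicit one does, so both forms would need to be gated.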

sfowl · 2019-07-16T03:41:17Z

CVE-2019-10200 was assigned to this issue.

jkupferer · 2019-11-22T18:26:09Z

Can we get an update on where we are on this issue?

We need this resolved both for the potential security impact and to clarify how users should isolate workloads when there is a security concern.

In my experience as an OpenShift consultant and instructor, taints and tolerations are not well suited for security isolation because most users find them difficult to understand and implement. While it may be possible to resolve this with podTolerationRestrictions, this is likely going to be hopelessly complicated for many OpenShift administrators. Node selectors are much easier for our users and administrators to understand, and node selectors on namespaces are easy to audit with automation.

IMHO, we should label all masters as node-role.kubernetes.io/master and everything else as workers, node-role.kubernetes.io/worker, and set the default for the defaultNodeSelector to run on workers, as this PR was aiming to do. In the case where we want workloads to run on the masters, we simply add the worker label to the masters, so it really shouldn't limit node topologies.
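The labeling proposal above can be sketched as a simple subset check: a pod carrying a worker node selector only fits nodes whose labels contain every selector entry, so adding the worker label to a master makes that master eligible. The `selectorMatches` helper is hypothetical and only approximates nodeSelector semantics with plain maps.

```go
package main

import "fmt"

// selectorMatches reports whether every key/value pair in the pod's
// nodeSelector is present in the node's labels.
func selectorMatches(nodeSelector, nodeLabels map[string]string) bool {
	for key, want := range nodeSelector {
		if got, ok := nodeLabels[key]; !ok || got != want {
			return false
		}
	}
	return true
}

func main() {
	workerOnly := map[string]string{"node-role.kubernetes.io/worker": ""}
	master := map[string]string{"node-role.kubernetes.io/master": ""}

	// A master without the worker label is not eligible.
	fmt.Println(selectorMatches(workerOnly, master))

	// Adding the worker label to the master makes it eligible.
	master["node-role.kubernetes.io/worker"] = ""
	fmt.Println(selectorMatches(workerOnly, master))
}
```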

soltysh · 2019-12-12T11:28:42Z

I think https://bugzilla.redhat.com/show_bug.cgi?id=1730165#c8 has the answer. In short, version 2 of the instance metadata service (https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/configuring-instance-metadata-service.html) should be able to solve the issue.
/close

openshift-ci-robot · 2019-12-12T11:28:44Z

@soltysh: Closed this PR.

openshift-ci-robot requested review from deads2k, enj and sttts July 12, 2019 18:35

openshift-ci-robot added the bugzilla/valid-bug label Jul 12, 2019

openshift-ci-robot added the size/XS label Jul 12, 2019

deads2k reviewed Jul 12, 2019

ravisantoshgudimetla force-pushed the defaultNodeSelector branch from 05d3c6b to 001e461 July 12, 2019 20:32

openshift-ci-robot added the size/M label and removed the size/XS label Jul 12, 2019

openshift-ci-robot added the approved label Jul 12, 2019

ravisantoshgudimetla force-pushed the defaultNodeSelector branch from 001e461 to 3568cbf July 12, 2019 20:41

enj reviewed Jul 12, 2019

openshift-ci-robot added the do-not-merge/hold label Jul 12, 2019

Commit dd61ee9: Set worker label in case of no scheduler spec

ravisantoshgudimetla force-pushed the defaultNodeSelector branch from 3568cbf to dd61ee9 July 13, 2019 03:54

smarterclayton suggested changes Jul 14, 2019

sttts assigned ingvagabund Nov 25, 2019

openshift-ci-robot closed this Dec 12, 2019

tatianab mentioned this pull request Nov 8, 2023: x/vulndb: potential Go vuln in github.com/openshift/cluster-kube-apiserver-operator: CVE-2019-10200 golang/vulndb#2225 (Closed)
