Expanded resource not evaluated for constraint #2485

Closed

ctml91 opened this issue Jan 3, 2023 · 11 comments · Fixed by #2529 or #2552
Labels: bug (Something isn't working)

Comments

ctml91 commented Jan 3, 2023

What steps did you take and what happened:
Enabled the feature in the Helm chart via enableGeneratorResourceExpansion: true, and confirmed it is enabled by inspecting the deployment container args for both audit and webhook:

$ kubectl get deploy -o yaml | grep -i expansion
          - --enable-generator-resource-expansion=true
          - --enable-generator-resource-expansion=true


{"level":"info","ts":1672729062.7214437,"msg":"Starting Controller","controller":"expansion-template-controller"}
{"level":"info","ts":1672729062.7214608,"msg":"Starting workers","controller":"expansion-template-controller","worker count":1}
{"level":"info","ts":1672729062.7221086,"msg":"Starting EventSource","controller":"expansion-template-controller","source":"kind source: *v1alpha1.ExpansionTemplate"}
{"level":"info","ts":1672729062.8233595,"logger":"controller","msg":"Reconcile","kind":"ExpansionTemplate","process":"template_expansion_controller","request":"/expand-deployments","namespace":"","name":"expand-deployments"}
{"level":"info","ts":1672729062.823413,"logger":"controller","msg":"upserted template expansion","kind":"ExpansionTemplate","process":"template_expansion_controller","template name":"expand-deployments"}
{"level":"info","ts":1672729333.838785,"logger":"controller","msg":"Reconcile","kind":"ExpansionTemplate","process":"template_expansion_controller","request":"/expand-deployments","namespace":"","name":"expand-deployments"}
{"level":"info","ts":1672729333.8388343,"logger":"controller","msg":"upserted template expansion","kind":"ExpansionTemplate","process":"template_expansion_controller","template name":"expand-deployments"}

Configured the source field on a constraint template under the match criteria, but the field is ignored/removed when applying the edit.

Since the docs say the default for match source is "All", I applied an update anyway to a Deployment that should result in a violation, but the update succeeds. When scaling up the same Deployment, a violation is created on the Pod directly instead of on the generated resource.

What did you expect to happen:
The docs state the source field is part of the match API, present in the Mutation and ConstraintTemplate kinds, so I assume it should be possible to configure the .spec.match.source field on a constraint, and the violation should occur when updating the Deployment since it is covered by the configured ExpansionTemplate (I used the example from the docs but changed enforcementAction to deny).

https://open-policy-agent.github.io/gatekeeper/website/docs/expansion/#match-source

https://open-policy-agent.github.io/gatekeeper/website/docs/expansion/#expansiontemplate

Anything else you would like to add:
The source field doesn't seem to be present in the constraint CRDs generated by Gatekeeper: the source field under .spec.match disappears after applying an update to a constraint (or after creating a fresh one). The ExpansionTemplate doesn't seem to be evaluated for some reason.

  • Gatekeeper version: v3.10.0
  • Kubernetes version (kubectl version): v1.23.5
ctml91 added the bug label Jan 3, 2023

ritazh commented Jan 3, 2023

@davis-haba can you PTAL?


maxsmythe commented Jan 4, 2023

One issue with the docs: source should be present on the constraint, not the constraint template.

It looks like source is missing from the match schema for the constraint target:

https://github.com/open-policy-agent/gatekeeper/blob/master/pkg/target/match_schema.go

If possible, we should auto-generate that file from https://github.com/open-policy-agent/gatekeeper/blob/bed335f5c220e962922c887631a497f5bebe89ae/pkg/mutation/match/match.go to avoid future code-duplication issues.

Though a missing source field should default to All:

// An empty 'source' field will default to 'All'
if mSrc == "" {
	mSrc = types.SourceTypeDefault
} else if !types.IsValidSource(mSrc) {
	return false, fmt.Errorf("invalid source field %q", mSrc)
}

which means it should not lead to non-enforcement.

Can you post the contents of the following? (I realize you posted links for some of this, but actual data from the cluster is more authoritative.)

  • The ExpansionTemplate you used
  • The constraint template you used
  • The constraint you used
  • The deployment for which you were expecting a violation

@maxsmythe

(actual resources as taken from the cluster with their status fields would be most helpful)


ctml91 commented Jan 4, 2023

(actual resources as taken from the cluster with their status fields would be most helpful)

Config

apiVersion: config.gatekeeper.sh/v1alpha1
kind: Config
metadata:
  name: config
  namespace: gatekeeper-system
spec:
  match:
  - excludedNamespaces:
    - kube-*
    - openshift-*
    processes:
    - audit
    - webhook
    - sync
  sync:
    syncOnly:
    - group: apps
      kind: Deployment
      version: v1
    - group: apps
      kind: StatefulSet
      version: v1
    - group: apps.openshift.io
      kind: DeploymentConfig
      version: v1
    - group: policy
      kind: PodDisruptionBudget
      version: v1
    - group: ""
      kind: Service
      version: v1

ExpansionTemplate

apiVersion: expansion.gatekeeper.sh/v1alpha1
kind: ExpansionTemplate
metadata:
  name: expand-deployments
spec:
  applyTo:
    - groups: ["apps"]
      kinds: ["DaemonSet", "Deployment", "Job", "ReplicaSet", "ReplicationController", "StatefulSet"]
      versions: ["v1"]
  templateSource: "spec.template"
  enforcementAction: "deny"
  generatedGVK:
    kind: "Pod"
    group: ""
    version: "v1"

ConstraintTemplate

apiVersion: templates.gatekeeper.sh/v1
kind: ConstraintTemplate
metadata:
  name: k8srequiredprobes
  annotations:
    description: Requires Pods to have readiness and/or liveness probes.
spec:
  crd:
    spec:
      names:
        kind: K8sRequiredProbes
      validation:
        openAPIV3Schema:
          type: object
          properties:
            probes:
              description: "A list of probes that are required (ex: `readinessProbe`)"
              type: array
              items:
                type: string
            probeTypes:
              description: "The probe must define a field listed in `probeType` in order to satisfy the constraint (ex. `tcpSocket` satisfies `['tcpSocket', 'exec']`)"
              type: array
              items:
                type: string
  targets:
    - target: admission.k8s.gatekeeper.sh
      rego: |
        package k8srequiredprobes

        probe_type_set = probe_types {
            probe_types := {type | type := input.parameters.probeTypes[_]}
        }

        violation[{"msg": msg}] {
            container := input.review.object.spec.containers[_]
            probe := input.parameters.probes[_]
            probe_is_missing(container, probe)
            msg := get_violation_message(container, input.review, probe)
        }

        probe_is_missing(ctr, probe) = true {
            not ctr[probe]
        }

        probe_is_missing(ctr, probe) = true {
            probe_field_empty(ctr, probe)
        }

        probe_field_empty(ctr, probe) = true {
            probe_fields := {field | ctr[probe][field]}
            diff_fields := probe_type_set - probe_fields
            count(diff_fields) == count(probe_type_set)
        }

        get_violation_message(container, review, probe) = msg {
            msg := sprintf("Container <%v> in your <%v> <%v> has no <%v>", [container.name, review.kind.kind, review.object.metadata.name, probe])
        }

Constraint

apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredProbes
metadata:
  name: must-have-probes
spec:
  enforcementAction: deny
  match:
    source: Generated
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - "default"
  parameters:
    probes: ["readinessProbe"]
    probeTypes: ["tcpSocket", "httpGet", "exec"]

Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

Helm values

replicas: 1
auditInterval: 60
metricsBackends: ["prometheus"]
auditMatchKindOnly: true
constraintViolationsLimit: 20
auditFromCache: false
disableMutation: false
disableValidatingWebhook: false
validatingWebhookTimeoutSeconds: 5
validatingWebhookFailurePolicy: Ignore
validatingWebhookExemptNamespacesLabels: {}
validatingWebhookObjectSelector: {}
validatingWebhookCheckIgnoreFailurePolicy: Fail
validatingWebhookCustomRules: {}
enableDeleteOperations: false
enableExternalData: false
enableTLSHealthcheck: false
enableGeneratorResourceExpansion: true
mutatingWebhookFailurePolicy: Ignore
mutatingWebhookReinvocationPolicy: Never
mutatingWebhookExemptNamespacesLabels: {}
mutatingWebhookObjectSelector: {}
mutatingWebhookTimeoutSeconds: 3
mutatingWebhookCustomRules: {}
mutationAnnotations: true
auditChunkSize: 500
logLevel: DEBUG
logDenies: true
logMutations: true
emitAdmissionEvents: true
emitAuditEvents: true
resourceQuota: false
postUpgrade:
  labelNamespace:
    enabled: false
    image:
      repository: openpolicyagent/gatekeeper-crds
      tag: v3.10.0
      pullPolicy: IfNotPresent
      pullSecrets: []
    extraNamespaces: []
  securityContext: {}
postInstall:
  labelNamespace:
    extraRules:
    - apiGroups:
      - security.openshift.io
      resourceNames:
      - anyuid
      resources:
      - securitycontextconstraints
      verbs:
      - use
    enabled: true
    image:
      repository: openpolicyagent/gatekeeper-crds
      tag: v3.10.0
      pullPolicy: IfNotPresent
      pullSecrets: []
    extraNamespaces: []
  probeWebhook:
    enabled: true
    image:
      repository: curlimages/curl
      tag: 7.83.1
      pullPolicy: IfNotPresent
      pullSecrets: []
    waitTimeout: 60
    httpTimeout: 2
    insecureHTTPS: false
  securityContext: {}
preUninstall:
  deleteWebhookConfigurations:
    extraRules:
    - apiGroups:
      - security.openshift.io
      resourceNames:
      - anyuid
      resources:
      - securitycontextconstraints
      verbs:
      - use
    enabled: false
    image:
      repository: openpolicyagent/gatekeeper-crds
      tag: v3.10.0
      pullPolicy: IfNotPresent
      pullSecrets: []
  securityContext: {}
image:
  repository: openpolicyagent/gatekeeper
  crdRepository: openpolicyagent/gatekeeper-crds
  release: v3.10.0
  pullPolicy: IfNotPresent
  pullSecrets: []
podAnnotations: {}
podLabels: {}
podCountLimit: 100
secretAnnotations: {}
enableRuntimeDefaultSeccompProfile: false
controllerManager:
  exemptNamespaces: []
  exemptNamespacePrefixes: []
  hostNetwork: false
  dnsPolicy: ClusterFirst
  port: 8443
  metricsPort: 8888
  healthPort: 9090
  priorityClassName: system-cluster-critical
  disableCertRotation: false
  affinity:
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
        - podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: gatekeeper.sh/operation
                  operator: In
                  values:
                    - webhook
            topologyKey: kubernetes.io/hostname
          weight: 100
  tolerations: []
  nodeSelector: {kubernetes.io/os: linux}
  resources:
    limits:
      cpu: 1000m
      memory: 5Gi
    requests:
      cpu: 100m
      memory: 2Gi
  securityContext: {}
  podSecurityContext: {}
  extraRules:
  - apiGroups:
    - security.openshift.io
    resourceNames:
    - anyuid
    resources:
    - securitycontextconstraints
    verbs:
    - use
audit:
  hostNetwork: false
  dnsPolicy: ClusterFirst
  metricsPort: 8888
  healthPort: 9090
  priorityClassName: system-cluster-critical
  disableCertRotation: true
  affinity: {}
  tolerations: []
  nodeSelector: {kubernetes.io/os: linux}
  writeToRAMDisk: false
  resources:
    limits:
      cpu: 1000m
      memory: 3Gi
    requests:
      cpu: 100m
      memory: 2Gi
  securityContext: {}
  podSecurityContext: {}
  extraRules:
  - apiGroups:
    - security.openshift.io
    resourceNames:
    - anyuid
    resources:
    - securitycontextconstraints
    verbs:
    - use
crds:
  resources: {}
  securityContext: {}
pdb:
  controllerManager:
    minAvailable: 1
service: {}
disabledBuiltins: ["{http.send}"]
psp:
  enabled: false
upgradeCRDs:
  enabled: true
  tolerations: []
  extraRules:
  - apiGroups:
    - security.openshift.io
    resourceNames:
    - anyuid
    resources:
    - securitycontextconstraints
    verbs:
    - use
rbac:
  create: true

@maxsmythe

Thanks for the data!

I think the issue is that the expanded resource has no metadata.name field, which causes this bit of Rego in the constraint template to fail (the rule body becomes undefined, so no violation is produced):

msg := sprintf("Container <%v> in your <%v> <%v> has no <%v>", [container.name, review.kind.kind, review.object.metadata.name, probe])

specifically:

review.object.metadata.name

We should probably have something that generates object names, since I think this will be a common problem. In the interim, you could try:

  • removing the reference to metadata.name (and its corresponding %v formatting directive)

or

  • adding a spec.template.metadata.name field to the deployment (see the sketch below)

If either of those work, we're on the right track.
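
For the second option, the change would look something like this on the posted Deployment (untested sketch; the name value is arbitrary, it just needs to give the resource expanded from spec.template a metadata.name):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  namespace: default
  labels:
    app: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      # hypothetical name; the mock Pod built from spec.template would inherit it
      name: nginx-deployment-pod
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80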


ctml91 commented Jan 4, 2023

@maxsmythe you're right, that did the trick! Just my luck I picked one of the few policies that uses the object's name. I'm trying to think of what the best solution for that would be.

  • a general disclaimer in the docs that constraints evaluated by the expansion controller cannot reference the expanded object's name or any other fields that won't be available during policy evaluation
  • generate a mock name to be included with the mock admission review
    • if policies reference the name, they usually also reference the kind, so that would need to be kept in mind. Initially I thought, why not substitute the parent's name, but it wouldn't make sense having the violation message [Implied by expand-deployments] container <name> in your <Pod> <my-deployment-name> has no <readinessProbe>
    • [Implied by expand-deployments] container <name> in your <Pod> <generated-by-deployment> has no <readinessProbe>
  • some kind of expansion test when constraint templates are applied that throws an error if the above condition will be hit

I didn't notice anything in the logs when this was occurring; perhaps there is an opportunity somewhere in the expansion controller to log something about this activity to make troubleshooting easier.


ritazh commented Jan 5, 2023

generate a mock name to be included with the mock admission review

+1, as there's less impact on policy authors, and it further helps identify that a violation is caused by an expanded/generated resource.


ctml91 commented Jan 5, 2023

It looks like source is missing from the match schema for the constraint target:

Would a suitable workaround until this is fixed be to rely on the enforcementAction in the ExpansionTemplate to override the constraint's enforcement action, and then set the constraint to dryrun, with the goal of only generating deny violations for the generated resources and reducing the total number of violations? See the sketch after the quoted docs below.

Optionally, an enforcement action override can be used when validating resultant resources. If this field is set, any violations against the resultant resource will use this enforcement action. If an enforcement action is not specified by the ExpansionTemplate, the enforcement action set by the Constraint in violation will be used.
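
Concretely, something like this is what I had in mind (sketch only, reusing the ExpansionTemplate and constraint posted above):

apiVersion: expansion.gatekeeper.sh/v1alpha1
kind: ExpansionTemplate
metadata:
  name: expand-deployments
spec:
  applyTo:
    - groups: ["apps"]
      kinds: ["Deployment"]
      versions: ["v1"]
  templateSource: "spec.template"
  # override: violations against expanded resources would use deny...
  enforcementAction: "deny"
  generatedGVK:
    kind: "Pod"
    group: ""
    version: "v1"
---
apiVersion: constraints.gatekeeper.sh/v1beta1
kind: K8sRequiredProbes
metadata:
  name: must-have-probes
spec:
  # ...while the constraint itself would only dry-run everywhere else
  enforcementAction: dryrun
  match:
    kinds:
      - apiGroups: [""]
        kinds: ["Pod"]
    namespaces:
      - "default"
  parameters:
    probes: ["readinessProbe"]
    probeTypes: ["tcpSocket", "httpGet", "exec"]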

Edit: never mind, it looks like dryrun still generates violations and events for the non-expanded objects, so I guess I will need to wait for an official patch.

@maxsmythe

Edit: never mind, it looks like dryrun still generates violations and events for the non-expanded objects, so I guess I will need to wait for an official patch.

Yeah, I don't think there is currently a workaround for split enforcement action.

+1, as there's less impact on policy authors, and it further helps identify that a violation is caused by an expanded/generated resource.

Yeah, one thought: since Deployments create Pods with the Deployment name as their prefix, using the parent resource's name as a default value could play well with prefix-based name matching.


stale bot commented Mar 6, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.

stale bot added the stale label Mar 6, 2023

ctml91 commented Mar 6, 2023

Still active. The PRs with fixes are not merged yet.

stale bot removed the stale label Mar 6, 2023