Gatekeeper should fail safe during audit operation #163
Comments
Fail safe should apply to greenfield (create) operations as well. I haven't tested the create scenario.
What is the significance of the links at the bottom of this bug? Is it possible to post the logs? Without more context, I'm not sure exactly what a "bad pod" is or how it's causing a failure.
Also, please define "fail safe". What is an example of the desired behavior?
@maxsmythe, regarding fail safe:
Follow the steps below to repro this issue:
Thanks for the clarifications. Is this the error in the logs?
@maxsmythe Yes, I saw a similar log for the audit operation.
The failure is coming from the CPU spec. @tsandall, from my view this is a good example of an appropriate use of fail-fast/fail-loudly. Unfortunately, for the audit use case we'd probably want some way to recover from the error and report that it happened, so that the other audit responses could still be returned. Had this failed silently, the CPU constraint would simply never have applied in these cases, and we might never have caught that this edge case was occurring. @RamyasreeChakka, I'm not sure Rego has a mechanism for handling exceptions such as these.
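To make the "recover and report" idea concrete, here is a minimal Go sketch of an audit loop that records a per-object evaluation error and carries on with the remaining objects. It is not Gatekeeper's audit code: `Object`, `AuditResult`, `checkCPULimit`, `audit`, and the 500m threshold are all hypothetical names and values chosen for illustration, and the CPU parsing only accepts the millicore form (real Kubernetes quantities have a richer syntax).

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Object is a hypothetical stand-in for a Kubernetes resource under audit.
type Object struct {
	Name     string
	CPULimit string // e.g. "200m"; this sketch only understands the millicore form
}

// AuditResult keeps violations and evaluation errors apart, so that a bad
// object is reported without hiding the results for the other objects.
type AuditResult struct {
	Violations []string
	Errors     []string
}

// maxMillicores is an illustrative policy threshold, not a Gatekeeper default.
const maxMillicores = 500

// checkCPULimit reports whether obj exceeds the threshold. It returns an
// error instead of aborting when the limit cannot be parsed.
func checkCPULimit(obj Object) (bool, error) {
	if !strings.HasSuffix(obj.CPULimit, "m") {
		return false, fmt.Errorf("object %q: unsupported CPU limit %q", obj.Name, obj.CPULimit)
	}
	millis, err := strconv.Atoi(strings.TrimSuffix(obj.CPULimit, "m"))
	if err != nil {
		return false, fmt.Errorf("object %q: invalid CPU limit %q: %w", obj.Name, obj.CPULimit, err)
	}
	return millis > maxMillicores, nil
}

// audit evaluates every object. An error on one object is recorded and the
// loop continues, so one malformed object cannot suppress other violations.
func audit(objs []Object) AuditResult {
	var res AuditResult
	for _, obj := range objs {
		violated, err := checkCPULimit(obj)
		if err != nil {
			res.Errors = append(res.Errors, err.Error())
			continue
		}
		if violated {
			res.Violations = append(res.Violations, obj.Name)
		}
	}
	return res
}

func main() {
	res := audit([]Object{
		{Name: "ok-pod", CPULimit: "200m"},
		{Name: "bad-pod", CPULimit: "2x0m"},
		{Name: "greedy-pod", CPULimit: "800m"},
	})
	fmt.Println("violations:", res.Violations)
	fmt.Println("errors:", res.Errors)
}
```

Keeping errors in their own list preserves the fail-loudly property (the malformed object is still surfaced) while letting the audit return the violations it was able to compute.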
@tsandall, another thing I noticed while digging into the to_number source code is that it casts to float. Are we ever worried about a loss of accuracy? CPU/memory values are meant to be integers to avoid rounding errors. @RamyasreeChakka, in the interim #167 should fix this specific error, but I think a more general construct is necessary.
I will look into this soon.
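As a rough illustration of the accuracy concern, the Go sketch below parses the same decimal string once through `float64` and once as an integer. It does not reproduce the to_number implementation; `viaFloat` and `viaInt` are hypothetical helpers, and the only assumption is that the conversion goes through a 64-bit float as described above. Integers are exact in `float64` up to 2^53, so the difference only shows up for values beyond that.

```go
package main

import (
	"fmt"
	"strconv"
)

// viaFloat parses s as a float64 and truncates it to int64, mimicking a
// conversion path that goes through floating point.
func viaFloat(s string) (int64, error) {
	f, err := strconv.ParseFloat(s, 64)
	if err != nil {
		return 0, err
	}
	return int64(f), nil
}

// viaInt parses s directly as a base-10 64-bit integer.
func viaInt(s string) (int64, error) {
	return strconv.ParseInt(s, 10, 64)
}

func main() {
	const raw = "9007199254740993" // 2^53 + 1, not representable in float64

	f, err := viaFloat(raw)
	if err != nil {
		panic(err)
	}
	i, err := viaInt(raw)
	if err != nil {
		panic(err)
	}
	fmt.Println("via float:", f)
	fmt.Println("via int:  ", i)
	fmt.Println("equal:    ", f == i)
}
```

For typical CPU millicore or memory byte counts the two paths agree; the sketch only shows that the float path is not exact in general.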
On a Kubernetes cluster, I had a bad pod https://raw.githubusercontent.com/RamyasreeChakka/RegoPolicy/master/GateKeeperV3/container-resource-limits/ContainerResourceLimits-bad2.yaml whose container CPU resource limit was specified incorrectly.
I installed the templates below and tried to test the audit feature, but didn't observe any audit violations reported for a long time. Later @ritazh helped me debug, and from the Gatekeeper logs it looks like the audit function was bailing out with an exception due to the bad pod.
Gatekeeper should fail safe: it should ignore bad policies and bad Kubernetes objects.
https://github.com/open-policy-agent/gatekeeper/blob/master/demo/agilebank/templates/k8scontainterlimits_template.yaml
https://github.com/open-policy-agent/gatekeeper/blob/master/demo/agilebank/templates/k8srequiredlabels_template.yaml
https://github.com/open-policy-agent/gatekeeper/blob/master/demo/agilebank/templates/k8sallowedrepos_template.yaml