Gatekeeper validatingwebhook stops workloads scheduling #102
-
I have deployed Gatekeeper in my cluster, but now I can't deploy anything unless I delete the ValidatingWebhookConfiguration. Not even the ReplicaSets are created.
I have set validatingWebhookCheckIgnoreFailurePolicy: "Ignore" in my Helm values and verified that it was applied. Might this be the cause?
I'm unable to query some of the API resources. Thanks
-
Do you have any logs from the K8s API server? If the cause was rejection/failure from the webhook, it should show up there. What is the timeout set to for the ValidatingWebhookConfiguration? If you set it to 1 second, does the webhook config's existence still interfere? Can you copy/paste the contents of your constraints (and constraint templates, if they are not library templates)?
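For anyone checking the same settings, here is a minimal sketch (not part of the original reply) that lists the failure policy and timeout of each webhook in a ValidatingWebhookConfiguration supplied as JSON. The function name `summarize_webhooks` and the sample object are hypothetical and only illustrate the shape of the data; the fallback values are the `admissionregistration.k8s.io/v1` defaults, which should only matter when reading a manifest that omits those fields.

```python
import json

# Defaults applied by admissionregistration.k8s.io/v1 when a field is omitted.
DEFAULT_FAILURE_POLICY = "Fail"
DEFAULT_TIMEOUT_SECONDS = 10


def summarize_webhooks(config):
    """Return (name, failurePolicy, timeoutSeconds) for each webhook entry."""
    rows = []
    for hook in config.get("webhooks", []):
        rows.append((
            hook.get("name", "<unnamed>"),
            hook.get("failurePolicy", DEFAULT_FAILURE_POLICY),
            hook.get("timeoutSeconds", DEFAULT_TIMEOUT_SECONDS),
        ))
    return rows


# Illustrative sample only; it is NOT taken from the cluster in this thread.
SAMPLE = json.loads("""
{
  "kind": "ValidatingWebhookConfiguration",
  "webhooks": [
    {"name": "validation.gatekeeper.sh", "failurePolicy": "Ignore", "timeoutSeconds": 3},
    {"name": "check-ignore-label.gatekeeper.sh", "failurePolicy": "Fail"}
  ]
}
""")

if __name__ == "__main__":
    for name, policy, timeout in summarize_webhooks(SAMPLE):
        print(f"{name}: failurePolicy={policy}, timeoutSeconds={timeout}")
```

To use it on a real cluster, replace `SAMPLE` with the JSON of a single configuration object, for example the output of `kubectl get validatingwebhookconfiguration <name> -o json`, where `<name>` depends on how Gatekeeper was installed.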
-
The issue was with my CNI: Calico wasn't starting correctly on the master, which interrupted network traffic to the gatekeeper-webhook service. I fixed the issue with Calico (IP autodetection was picking the "wrong" interface on a different subnet, for those who are interested). That also fixed an issue I'd been ignoring with the Linkerd service mesh. It still seems that the failure wasn't being ignored, though?
-
To be clear, I can't say for sure that the problem was cumulative webhook timeouts from the snippet of API server logs provided, but it's the most likely explanation.