
Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io" #5401

Closed
aduncmj opened this issue Apr 19, 2020 · 177 comments · Fixed by #5445
Labels
triage/needs-information Indicates an issue needs more information in order to work on it.

Comments

@aduncmj

aduncmj commented Apr 19, 2020

Hi all,

When I applied the ingress configuration file ingress-myapp.yaml with the command kubectl apply -f ingress-myapp.yaml, I got an error. The complete error is as follows:

Error from server (InternalError): error when creating "ingress-myapp.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

This is my ingress:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-myapp
  namespace: default
  annotations: 
    kubernetes.io/ingress.class: "nginx"
spec:
  rules: 
  - host: myapp.magedu.com
    http:
      paths:
      - path: 
        backend: 
          serviceName: myapp
          servicePort: 80

Has anyone encountered this problem?

@aduncmj aduncmj added the label kind/support (Categorizes issue or PR as a support question.) Apr 19, 2020
@moljor

moljor commented Apr 21, 2020

Hi,

I have.

The validatingwebhook service is not reachable in my private GKE cluster. I needed to open the 8443 port from the master to the pods.
On top of that, I then received a certificate error on the endpoint "x509: certificate signed by unknown authority". To fix this, I needed to include the caBundle from the generated secret in the validatingwebhookconfiguration.

A quick fix, if you don't want to do the above to get the webhook fully operational, is to remove the validatingwebhookconfiguration or to set the failurePolicy to Ignore.
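For reference, the two workarounds above can be applied with kubectl along these lines (a sketch; the webhook configuration name `ingress-nginx-admission` assumes the standard ingress-nginx deployment and may differ in your cluster):

```shell
# Option 1: remove the validating webhook configuration entirely
kubectl delete validatingwebhookconfiguration ingress-nginx-admission

# Option 2: keep the webhook, but let API requests succeed when it is unreachable
kubectl patch validatingwebhookconfiguration ingress-nginx-admission \
  --type='json' \
  -p='[{"op": "replace", "path": "/webhooks/0/failurePolicy", "value": "Ignore"}]'
```

Note that both options disable the validation the webhook provides, so invalid Ingress objects will no longer be rejected at admission time.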

I believe some fixes are needed in the deploy/static/provider/cloud/deploy.yaml as the webhooks will not always work out of the box.

@moljor

moljor commented Apr 21, 2020

A quick update on the above, the certificate error should be managed by the patch job that exists in the deployment so that part should be a non-issue.
Only the port 8443 needed to be opened from master to pods for me.

@Cspellz

Cspellz commented Apr 22, 2020

A quick update on the above, the certificate error should be managed by the patch job that exists in the deployment so that part should be a non-issue.
Only the port 8443 needed to be opened from master to pods for me.

Hi, I am a beginner in setting a k8s and ingress.
I am facing a similar issue, but in a bare-metal scenario. I would be very grateful if you could share more details on what you mean by 'opening a port between master and pods'.

Update:
Sorry, as I said, I am new to this. I checked and there is a service (ingress-nginx-controller-admission) exposed on port 443 in the ingress-nginx namespace. For some reason my ingress resource, created in the default namespace, is not able to communicate with it. Please suggest how I can resolve this.

error is :
Error from server (InternalError): error when creating "test-nginx-ingress.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

@johan-lejdung

I'm also facing this issue, on a fresh cluster from AWS where I only did

helm install nginx-ing ingress-nginx/ingress-nginx --set rbac.create=true

And deployed a react service (which I can port-forward to and it works fine).

I then tried to apply both my own ingress and the example ingress

  apiVersion: networking.k8s.io/v1beta1
  kind: Ingress
  metadata:
    annotations:
      kubernetes.io/ingress.class: nginx
    name: example
    namespace: foo
  spec:
    rules:
      - host: www.example.com
        http:
          paths:
            - backend:
                serviceName: exampleService
                servicePort: 80
              path: /

I'm getting this error:

Error from server (InternalError): error when creating "k8s/ingress/test.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://nginx-ing-ingress-nginx-controller-admission.default.svc:443/extensions/v1beta1/ingresses?timeout=30s: stream error: stream ID 7; INTERNAL_ERROR

I traced it down to this line of code by looking at the logs in the controller:
https://github.com/kubernetes/ingress-nginx/blob/master/internal/ingress/controller/controller.go#L532

Logs:

I0427 11:52:35.894902       6 server.go:61] handling admission controller request /extensions/v1beta1/ingresses?timeout=30s
2020/04/27 11:52:35 http2: panic serving 172.31.16.27:39304: runtime error: invalid memory address or nil pointer dereference
goroutine 2514 [running]:
net/http.(*http2serverConn).runHandler.func1(0xc00000f2c0, 0xc0009a9f8e, 0xc000981980)
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/net/http/h2_bundle.go:5713 +0x16b
panic(0x1662d00, 0x27c34c0)
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/runtime/panic.go:969 +0x166
k8s.io/ingress-nginx/internal/ingress/controller.(*NGINXController).getBackendServers(0xc000119a40, 0xc00000f308, 0x1, 0x1, 0x187c833, 0x1b, 0x185e388, 0x0, 0x185e388, 0x0)
	/tmp/go/src/k8s.io/ingress-nginx/internal/ingress/controller/controller.go:532 +0x6d2
k8s.io/ingress-nginx/internal/ingress/controller.(*NGINXController).getConfiguration(0xc000119a40, 0xc00000f308, 0x1, 0x1, 0x1, 0xc00000f308, 0x0, 0x1, 0x0)
	/tmp/go/src/k8s.io/ingress-nginx/internal/ingress/controller/controller.go:402 +0x80
k8s.io/ingress-nginx/internal/ingress/controller.(*NGINXController).CheckIngress(0xc000119a40, 0xc000bfc300, 0x50a, 0x580)
	/tmp/go/src/k8s.io/ingress-nginx/internal/ingress/controller/controller.go:228 +0x2c9
k8s.io/ingress-nginx/internal/admission/controller.(*IngressAdmission).HandleAdmission(0xc0002d4fb0, 0xc000943080, 0x7f8ffce8b1b8, 0xc000942ff0)
	/tmp/go/src/k8s.io/ingress-nginx/internal/admission/controller/main.go:73 +0x924
k8s.io/ingress-nginx/internal/admission/controller.(*AdmissionControllerServer).ServeHTTP(0xc000219820, 0x1b05080, 0xc00000f2c0, 0xc000457d00)
	/tmp/go/src/k8s.io/ingress-nginx/internal/admission/controller/server.go:70 +0x229
net/http.serverHandler.ServeHTTP(0xc000119ce0, 0x1b05080, 0xc00000f2c0, 0xc000457d00)
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/net/http/server.go:2807 +0xa3
net/http.initALPNRequest.ServeHTTP(0x1b07440, 0xc00067f170, 0xc0002dc700, 0xc000119ce0, 0x1b05080, 0xc00000f2c0, 0xc000457d00)
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/net/http/server.go:3381 +0x8d
net/http.(*http2serverConn).runHandler(0xc000981980, 0xc00000f2c0, 0xc000457d00, 0xc000a81480)
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/net/http/h2_bundle.go:5720 +0x8b
created by net/http.(*http2serverConn).processHeaders
	/home/ubuntu/.gimme/versions/go1.14.2.linux.amd64/src/net/http/h2_bundle.go:5454 +0x4e1

Any ideas? It seems strange to get this on a newly set up cluster where I followed the instructions correctly.

@johan-lejdung

I might have solved it...

I followed this guide for the helm installation: https://kubernetes.github.io/ingress-nginx/deploy/

But when I followed this guide instead: https://docs.nginx.com/nginx-ingress-controller/installation/installation-with-helm/

The error doesn't occur.

If you have this issue, try it out by deleting your current Helm installation.

Get the name:

helm list

Delete and apply stable release:

helm delete <release-name>
helm repo add nginx-stable https://helm.nginx.com/stable
helm install nginx-ing nginx-stable/nginx-ingress

@aledbf
Member

aledbf commented Apr 27, 2020

@johan-lejdung not really, that is a different ingress controller.

@s977120

s977120 commented May 1, 2020

@aledbf I'm using 0.31.1 and still have the same problem:

bash-5.0$ /nginx-ingress-controller --version
-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       0.31.1
  Build:         git-b68839118
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.17.10

-------------------------------------------------------------------------------

Error: UPGRADE FAILED: failed to create resource: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

@nicholaspier

@aledbf Same error. Bare-metal installation.

-------------------------------------------------------------------------------
NGINX Ingress controller
  Release:       0.31.1
  Build:         git-b68839118
  Repository:    https://github.com/kubernetes/ingress-nginx
  nginx version: nginx/1.17.10

-------------------------------------------------------------------------------

Error from server (InternalError): error when creating "./**ommitted**.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

@aledbf
Member

aledbf commented May 1, 2020

I added a note about the webhook port in https://kubernetes.github.io/ingress-nginx/deploy/ and the links for the additional steps in GKE

@AbbetWang

AbbetWang commented May 4, 2020

I still have the problem.

Update

I disabled the webhook and the error went away.

Workaround

helm install my-release ingress-nginx/ingress-nginx \
  --set controller.service.type=NodePort \
  --set controller.admissionWebhooks.enabled=false

Caution: this may not resolve the underlying issue properly.

Current status

  • using Helm 3:
    helm install my-release ingress-nginx/ingress-nginx \
      --set controller.service.type=NodePort
exec kubectl get svc,pods

NAME                                                    TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                      AGE
service/a-service                                       ClusterIP   10.105.159.98   <none>        80/TCP                       28h
service/b-service                                       ClusterIP   10.106.17.65    <none>        80/TCP                       28h
service/kubernetes                                      ClusterIP   10.96.0.1       <none>        443/TCP                      3d4h
service/my-release-ingress-nginx-controller             NodePort    10.97.224.8     <none>        80:30684/TCP,443:32294/TCP   111m
service/my-release-ingress-nginx-controller-admission   ClusterIP   10.101.44.242   <none>        443/TCP                      111m

NAME                                                       READY   STATUS    RESTARTS   AGE
pod/a-deployment-84dcd8bbcc-tgp6d                          1/1     Running   0          28h
pod/b-deployment-f649cd86d-7ss9f                           1/1     Running   0          28h
pod/configmap-pod                                          1/1     Running   0          54m
pod/configmap-pod-1                                        1/1     Running   0          3h33m
pod/my-release-ingress-nginx-controller-7859896977-bfrxp   1/1     Running   0          111m
pod/redis                                                  1/1     Running   1          6h11m
pod/test                                                   1/1     Running   1          5h9m

my ingress.yaml

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  annotations:
    kubernetes.io/ingress.class: nginx
  name: example
  namespace: foo
spec:
  rules:
  - host: b.abbetwang.top
    http:
      paths:
      - path: /b
        backend:
          serviceName: b-service
          servicePort: 80
      - path: /a
        backend:
          serviceName: a-service
          servicePort: 80
  tls:
  - hosts:
    - b.abbetwang.top

What I did

When I run kubectl apply -f new-ingress.yaml
I get Failed calling webhook, failing closed validate.nginx.ingress.kubernetes.io:

My apiserver log is below:

I0504 06:22:13.286582 1 trace.go:116] Trace[1725513257]: "Create" url:/apis/networking.k8s.io/v1beta1/namespaces/default/ingresses,user-agent:kubectl/v1.18.2 (linux/amd64) kubernetes/52c56ce,client:192.168.0.133 (started: 2020-05-04 06:21:43.285686113 +0000 UTC m=+59612.475819043) (total time: 30.000880829s):
Trace[1725513257]: [30.000880829s] [30.000785964s] END
W0504 09:21:19.861015 1 watcher.go:199] watch chan error: etcdserver: mvcc: required revision has been compacted
W0504 09:31:49.897548 1 watcher.go:199] watch chan error: etcdserver: mvcc: required revision has been compacted
I0504 09:36:17.637753 1 trace.go:116] Trace[615862040]: "Call validating webhook" configuration:my-release-ingress-nginx-admission,webhook:validate.nginx.ingress.kubernetes.io,resource:networking.k8s.io/v1beta1, Resource=ingresses,subresource:,operation:CREATE,UID:41f47c75-9ce1-49c0-a898-4022dbc0d7a1 (started: 2020-05-04 09:35:47.637591858 +0000 UTC m=+71256.827724854) (total time: 30.000128816s):
Trace[615862040]: [30.000128816s] [30.000128816s] END
W0504 09:36:17.637774 1 dispatcher.go:133] Failed calling webhook, failing closed validate.nginx.ingress.kubernetes.io: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.default.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded

@eltonbfw

eltonbfw commented May 5, 2020

Why close this issue? What is the solution?

@aledbf
Member

aledbf commented May 5, 2020

@eltonbfw update to 0.32.0 and make sure the API server can reach the POD running the ingress controller
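One way to sanity-check that reachability from inside the cluster (a rough sketch; the service name and namespace assume the default ingress-nginx manifests and may differ in your install) is:

```shell
# Confirm the admission service exists and has endpoints backing it
kubectl -n ingress-nginx get svc,endpoints ingress-nginx-controller-admission

# Probe the webhook port from a throwaway pod; any HTTPS response
# (even an error status) means the in-cluster network path is open
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
  curl -k -m 5 https://ingress-nginx-controller-admission.ingress-nginx.svc:443
```

Note that on managed clusters the API server reaches the pod over a separate path (e.g. control-plane firewall rules), so this only rules out in-cluster problems.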

@cnlong

cnlong commented May 15, 2020

@eltonbfw update to 0.32.0 and make sure the API server can reach the POD running the ingress controller

I have the same problem, and I use 0.32.0.
What's the solution?
Please, thanks!

@nicholaspier

For the specific issue, my problem did turn out to be an issue with internal communication. @aledbf added notes to the documentation to verify connectivity. I had internal communication issues caused by CentOS 8's move to nftables. In my case, I needed additional "rich" allow rules in firewalld for:

  • Docker network source (172.17.0.0/16)
  • CNI CIDR source
  • Cluster CIDR source
  • Host IP source
  • Masquerading
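As an illustration, allow rules of that shape can be added with firewall-cmd; the CIDRs below are placeholders, so substitute your actual Docker, CNI, and cluster ranges:

```shell
# Allow traffic from the Docker bridge network (example CIDR)
firewall-cmd --permanent --zone=public \
  --add-rich-rule='rule family=ipv4 source address=172.17.0.0/16 accept'

# Allow traffic from the pod (CNI) and cluster service CIDRs (example values)
firewall-cmd --permanent --zone=public \
  --add-rich-rule='rule family=ipv4 source address=10.244.0.0/16 accept'
firewall-cmd --permanent --zone=public \
  --add-rich-rule='rule family=ipv4 source address=10.96.0.0/12 accept'

# Enable masquerading and apply the new rules
firewall-cmd --permanent --zone=public --add-masquerade
firewall-cmd --reload
```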

@andrei-matei

I have the same issue, baremetal install with CentOS 7 worker nodes.

@lesovsky

lesovsky commented May 30, 2020

Have the same issue with 0.32.0 on an HA bare-metal cluster, with strange behaviour.
I have two ingresses, A and B:

---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: service-alpha
  namespace: staging
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
    - host: alpha.example.org
      http:
        paths:
          - path: /
            backend:
              serviceName: service-alpha
              servicePort: 1080
---
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: service-beta
  namespace: staging
  annotations:
    kubernetes.io/ingress.class: "nginx"
    nginx.ingress.kubernetes.io/rewrite-target: /$1
spec:
  rules:
    - host: beta.example.org
      http:
        paths:
          - path: /user/(.*)
            backend:
              serviceName: service-users
              servicePort: 1080
          - path: /data/(.*)
            backend:
              serviceName: service-data
              servicePort: 1080

  • ingress A is created without errors most of the time, but in very rare cases create attempts return an error
  • ingress B is never created and always returns an error
# kubectl apply -f manifests/ingress-beta.yml 
Error from server (InternalError): error when creating "manifests/ingress-beta.yml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)

In the api-server logs errors look like that

I0530 08:05:56.884549       1 trace.go:116] Trace[898207247]: "Call validating webhook" configuration:ingress-nginx-admission,webhook:validate.nginx.ingress.kubernetes.io,resource:networking.k8s.io/v1beta1, Resource=ingresses,subresource:,operation:CREATE,UID:fdce95ab-e2a9-40f5-9ab3-73a85b603db6 (started: 2020-05-30 08:05:26.883895783 +0000 UTC m=+5434.178340436) (total time: 30.000569226s):
Trace[898207247]: [30.000569226s] [30.000569226s] END
W0530 08:05:56.884664       1 dispatcher.go:133] Failed calling webhook, failing closed validate.nginx.ingress.kubernetes.io: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://ingress-nginx-controller-admission.ingress-nginx.svc:443/extensions/v1beta1/ingresses?timeout=30s: context deadline exceeded (Client.Timeout exceeded while awaiting headers)
I0530 08:05:56.885303       1 trace.go:116] Trace[868353513]: "Create" url:/apis/networking.k8s.io/v1beta1/namespaces/staging/ingresses,user-agent:kubectl/v1.18.3 (linux/amd64) kubernetes/2e7996e,client:127.0.0.1 (started: 2020-05-30 08:05:26.882592405 +0000 UTC m=+5434.177037017) (total time: 30.002669278s):
Trace[868353513]: [30.002669278s] [30.002248351s] END

The main question is: why is the first ingress created most of the time, while the second always fails to create?

Upd. Also this comment on SO might be useful in investigating causes of problems.

Upd 2. When the rewrite annotation is removed, the manifest is applied without errors.

Upd 3. It fails with the combination of multiple paths and the rewrite annotation.

@aledbf Looks like a bug.

@tomoyk
Contributor

tomoyk commented Jun 9, 2020

We have this issue on a baremetal k3s cluster. Our HTTP proxy logged this traffic:

gost[515]: 2020/06/09 15:15:37 http.go:151: [http] 192.168.210.21:47396 -> http://:8080 -> ingress-nginx-controller-admission.ingress-nginx.svc:443
gost[515]: 2020/06/09 15:15:37 http.go:241: [route] 192.168.210.21:47396 -> http://:8080 -> ingress-nginx-controller-admission.ingress-nginx.svc:443
gost[515]: 2020/06/09 15:15:37 http.go:262: [http] 192.168.210.21:47396 -> 192.168.210.1:8080 : dial tcp: lookup ingress-nginx-controller-admission.ingress-nginx.svc on 192.168.210.1:53: no such host

@yayoec

yayoec commented Jun 10, 2020

@eltonbfw update to 0.32.0 and make sure the API server can reach the POD running the ingress controller

I have the same problem, and I use 0.32.0.
What's the solution?
Please, thanks!

me too

@andrei-matei

If you are using the bare-metal install from Kelsey Hightower, my suggestion is to install kubelet on your master nodes, start calico/flannel (or whatever you use for CNI), and label your nodes as masters so no other pods are scheduled there. Then your control plane will be able to communicate with your nginx deployment and the issue should be fixed. At least this is how it worked for me.

onedr0p added a commit to onedr0p/home-ops that referenced this issue Jul 9, 2020
@metaversed

@aledbf This issue still occurs

@mikalai-t

mikalai-t commented Jul 15, 2020

@andrei-matei Kelsey's cluster works perfectly even without additional CNI plugins and kubelet systemd services installed on master nodes. All you need is to add a route to the Services CIDR 10.32.0.0/24, using worker node IPs as the next hop, on the master nodes only.
This way I've got ingress-nginx (deployed from the "bare-metal" manifest) and cert-manager webhooks working, but unfortunately not together :( I still don't know why...

Updated: got both of them working

@lbs-rodrigo

@aduncmj I found this solution https://stackoverflow.com/questions/61365202/nginx-ingress-service-ingress-nginx-controller-admission-not-found

kubectl delete -A ValidatingWebhookConfiguration ingress-nginx-admission

@metaversed

@aduncmj I did the same, thank you for sharing the findings. I'm curious whether this can be handled without manual intervention.

@bluehtt

bluehtt commented Jul 25, 2020

@opensourceonly This worked for me, you can try it: you should add a pathType to the Ingress configuration. #5445
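For illustration, a minimal sketch of an Ingress with an explicit pathType in the networking.k8s.io/v1 schema (all names here are placeholders, not taken from this thread):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp
  annotations:
    kubernetes.io/ingress.class: "nginx"
spec:
  rules:
  - host: myapp.example.com
    http:
      paths:
      - path: /
        pathType: Prefix   # explicit pathType; valid values are Prefix, Exact, ImplementationSpecific
        backend:
          service:
            name: myapp
            port:
              number: 80
```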

@jungrae-prestolabs

jungrae-prestolabs commented Feb 18, 2022

I don't think deleting all ValidatingWebhookConfigurations is the solution to this.
In my case, the cause of the problem was that versions were mixed (e.g. 1.1.1 and 0.47.0):

Error: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": an error on the server ("") has prevented the request from succeeding

@mihaigalos

In my GKE cluster I've manually increased timeoutSeconds to 30.

You can do it via Helm:

controller:
  admissionWebhooks:
    enabled: true
    timeoutSeconds: 45

Hi @tehkapa, what resource do you apply this to? Can you post a yaml containing the spec? Thank you.

@marpada

marpada commented Apr 6, 2022

On EKS, a security group rule needs to be added on the Node Security Group tcp/8443 from the Cluster Security Group.

@matteovivona

matteovivona commented Apr 7, 2022

In my GKE cluster I've manually increased timeoutSeconds to 30.
You can do it via Helm:

controller:
  admissionWebhooks:
    enabled: true
    timeoutSeconds: 45

Hi @tehkapa, what resource do you apply this to? Can you post a yaml containing the spec? Thank you.

@mihaigalos it goes in the global Helm values. You can apply it when you install the ingress via Helm, like this: helm install ingress ingress-nginx/ingress-nginx -f values.yaml

values.yaml:

controller:
  admissionWebhooks:
    enabled: true
    timeoutSeconds: 45

@Clasyc

Clasyc commented May 11, 2022

On EKS, a security group rule needs to be added on the Node Security Group tcp/8443 from the Cluster Security Group.

In case using terraform:

resource "aws_security_group_rule" "webhook_admission_inbound" {
  type                     = "ingress"
  from_port                = 8443
  to_port                  = 8443
  protocol                 = "tcp"
  security_group_id        = module.eks.node_security_group_id
  source_security_group_id = module.eks.cluster_primary_security_group_id
}

resource "aws_security_group_rule" "webhook_admission_outbound" {
  type                     = "egress"
  from_port                = 8443
  to_port                  = 8443
  protocol                 = "tcp"
  security_group_id        = module.eks.node_security_group_id
  source_security_group_id = module.eks.cluster_primary_security_group_id
}

@sbeaulie

I updated from nginx-ingress to ingress-nginx in GKE, so in case this helps anyone: I needed to add a firewall rule to allow 8443 from the API server to my nodes.

As per deploy instructions:
https://kubernetes.github.io/ingress-nginx/deploy/#gce-gke

I'm not sure why it was NOT needed in nginx-ingress.

@chance2021

chance2021 commented Jun 29, 2022

Double-check whether any NetworkPolicy has been set.

Error I was getting...

Error from server (InternalError): error when creating "/tmp/ingress-test.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://ingress-nginx-controller-admission.ingress.svc:443/networking/v1/ingresses?timeout=10s": context deadline exceeded

Once the NetworkPolicy below was applied, the issue was gone:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: networkpolicy
  namespace: default
spec:
  ingress:
  - {}
  podSelector:
    matchLabels:
      app.kubernetes.io/name: ingress-nginx
  policyTypes:
  - Ingress

@chance2021

Make sure both your nginx-ingress pod and service work properly. My case was that I was assigning a wrong public IP which didn't exist in the corresponding resource group in AKS.

@lgpasquale

On EKS, a security group rule needs to be added on the Node Security Group tcp/8443 from the Cluster Security Group.

In case using terraform:

resource "aws_security_group_rule" "webhook_admission_inbound" {
  type                     = "ingress"
  from_port                = 8443
  to_port                  = 8443
  protocol                 = "tcp"
  security_group_id        = module.eks.node_security_group_id
  source_security_group_id = module.eks.cluster_primary_security_group_id
}

resource "aws_security_group_rule" "webhook_admission_outbound" {
  type                     = "egress"
  from_port                = 8443
  to_port                  = 8443
  protocol                 = "tcp"
  security_group_id        = module.eks.node_security_group_id
  source_security_group_id = module.eks.cluster_primary_security_group_id
}

I don't think you need both an ingress and an egress rule but just the ingress one. The first of these two rules should be enough.

For anyone using the terraform-aws-modules/eks/aws module, you can add this to your configuration:

  node_security_group_additional_rules = {
    # nginx-ingress requires the cluster to communicate with the ingress controller
    cluster_to_node = {
      description      = "Cluster to ingress-nginx webhook"
      protocol         = "-1"
      from_port        = 8443
      to_port          = 8443
      type             = "ingress"
      source_cluster_security_group = true
    }
    # Add here any other rule you already have
    # ...
  }

@adiii717

adiii717 commented Sep 7, 2022

node_security_group_additional_rules = {
  # nginx-ingress requires the cluster to communicate with the ingress controller
  cluster_to_node = {
    description                   = "Cluster to ingress-nginx webhook"
    protocol                      = "-1"
    from_port                     = 8443
    to_port                       = 8443
    type                          = "ingress"
    source_cluster_security_group = true
  }
  # Add here any other rule you already have
  # ...
}

Correct, and this resolved the issue for me on EKS 1.23:
https://github.com/terraform-aws-modules/terraform-aws-eks#input_node_security_group_additional_rules

@Clasyc

Clasyc commented Sep 9, 2022

I don't think you need both an ingress and an egress rule but just the ingress one. The first of these two rules should be enough.

You are right, ingress is enough.

@RamyAllam

For GKE private nodes, this should help

gcloud compute firewall-rules create RULE-NAME-master-nginx-ingress \
    --action ALLOW \
    --direction INGRESS \
    --source-ranges CONTROL_PLANE_RANGE \
    --rules tcp:8443 \
    --target-tags TARGET \
    --project GCP_PROJECT

Example

gcloud compute firewall-rules create gke-private-cluster-01-f13afdc6-master-nginx-ingress \
    --action ALLOW \
    --direction INGRESS \
    --source-ranges 172.16.0.0/28 \
    --rules tcp:8443 \
    --target-tags gke-private-cluster-01-f13afdc6-node \
    --project mygcpproject

You can also list the existing rules for the cluster

gcloud compute firewall-rules list \
    --filter 'name~^CLUSTER_NAME' \
    --format 'table(
        name,
        network,
        direction,
        sourceRanges.list():label=SRC_RANGES,
        allowed[].map().firewall_rule().list():label=ALLOW,
        targetTags.list():label=TARGET_TAGS
    )' --project GCP_PROJECT

Reference: https://cloud.google.com/kubernetes-engine/docs/how-to/private-clusters#step_3_add_a_firewall_rule

lemeurherve added a commit to jenkins-infra/aws that referenced this issue Oct 11, 2022
…ingress controller

Fix following error when deploying an exposed service in eks-public:

> Error: release artifact-caching-proxy failed, and has been uninstalled due to atomic being set: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post "https://public-nginx-ingress-ingress-nginx-controller-admission.public-nginx-ingress.svc:443/networking/v1/ingresses?timeout=10s": context deadline exceeded

Ref: kubernetes/ingress-nginx#5401 (comment)
@xgt001

xgt001 commented Oct 13, 2022

thanks @Clasyc, your hint worked for me

@sotiriougeorge

sotiriougeorge commented Oct 19, 2022

@Clasyc and everyone also:

On an EKS cluster created from the terraform-aws-modules/eks/aws module (version 17.x though), a security group is automatically created by the module itself for the worker nodes, with a rule that allows traffic from the control plane security group on ports 1025-65535 for TCP.

This rule also includes the pre-defined description "Allow worker pods to receive communication from the cluster control plane".

Does this not cover the case of the security group mentioned above?

If it does, I am still facing this issue, but intermittently, especially when I am deploying massive workloads through Helm (the Ingresses have been checked and are OK as far as their correctness is concerned). It almost seems like a flood-protection mechanism, because if I let it cool down then I don't get it anymore.

Am I missing something here?

@tbondarchuk

@sotiriougeorge Same here: EKS created by the TF module, and from time to time I see those errors. I think the number of errors decreases when the controller is scaled up. At least it seems so for me on prod with two replicas compared to dev with one.

@sotiriougeorge

@sotiriougeorge Same here: EKS created by the TF module, and from time to time I see those errors. I think the number of errors decreases when the controller is scaled up. At least it seems so for me on prod with two replicas compared to dev with one.

Thank you for the sanity check! Appreciated. I will try to scale up to more replicas and see what comes of it. However, it would be good if, through this GitHub issue, some consensus emerged on how to fight this holistically, or on whether anything needs to be changed on the controller side.

@jhodnett2

jhodnett2 commented Nov 12, 2022

On EKS, a security group rule needs to be added on the Node Security Group tcp/8443 from the Cluster Security Group.

For anyone using the terraform-aws-modules/eks/aws module, you can add this to your configuration:

  node_security_group_additional_rules = {
    # nginx-ingress requires the cluster to communicate with the ingress controller
    cluster_to_node = {
      description      = "Cluster to ingress-nginx webhook"
      protocol         = "-1"
      from_port        = 8443
      to_port          = 8443
      type             = "ingress"
      source_cluster_security_group = true
    }
    # Add here any other rule you already have
    # ...
  }

Just a heads up here: when the protocol is set to "-1", it means "All Traffic". This opens up all ports, making the from_port/to_port values moot, which may be too permissive in some cases. Setting it to "tcp" lets you limit the rule to port 8443.

Having hit the same issue and applied this solution, the resulting rule wasn't what I expected: I had trouble locating it afterwards because I was searching by port.
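A tcp-scoped variant of the module snippet above, which keeps the opening limited to port 8443 (same terraform-aws-modules/eks/aws module; the rule key and description are illustrative):

```hcl
  node_security_group_additional_rules = {
    ingress_cluster_to_node_webhook = {
      description                   = "API server to ingress-nginx admission webhook"
      protocol                      = "tcp"
      from_port                     = 8443
      to_port                       = 8443
      type                          = "ingress"
      source_cluster_security_group = true
    }
  }
```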

@stevenyongzion

stevenyongzion commented Apr 8, 2023

For those who are using GKE, this is the sample Terraform code I use to open port 8443:

resource "google_compute_firewall" "port_8443_nginx_controller" {
  name    = "port-nginx-controller-webhook-allow-8443"
  network = google_compute_network.vpc.name
  description = "Refer to https://stackoverflow.com/a/65675908/778932"

  allow {
    protocol = "tcp"
    ports    = ["8443"]
  }

  source_ranges = [var.private_cluster_cidr]
  target_tags   = ["${var.project_name}-pool"]
}

Refer to this to get target_tags.

@PavithraKMR

I too faced the same error when applying the command ->

C:\Users\Pavithra Kanmaniraja\Documents\kubernetes-sample-apps>kubectl apply -f ingressdemons1.yaml -n demons1
Error from server (InternalError): error when creating "ingressdemons1.yaml": Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": failed to call webhook: Post "https://nginx-ingress-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1/ingresses?timeout=10s": service "nginx-ingress-ingress-nginx-controller-admission" not found

So I tried removing the namespace from the cluster, but that did not remove everything that was created when I installed ingress. Then I deleted the existing ValidatingWebhookConfiguration with the command ->

C:\Users\Pavithra Kanmaniraja>kubectl delete ValidatingWebhookConfiguration nginx-ingress-ingress-nginx-admission
validatingwebhookconfiguration.admissionregistration.k8s.io "nginx-ingress-ingress-nginx-admission" deleted

After that, I ran the command ->

C:\Users\Pavithra Kanmaniraja>kubectl apply -f https://raw.githubusercontent.com/kubernetes/ingress-nginx/controller-v1.0.4/deploy/static/provider/cloud/deploy.yaml
namespace/ingress-nginx created
serviceaccount/ingress-nginx created
configmap/ingress-nginx-controller created
clusterrole.rbac.authorization.k8s.io/ingress-nginx configured
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx configured
role.rbac.authorization.k8s.io/ingress-nginx created
rolebinding.rbac.authorization.k8s.io/ingress-nginx created
service/ingress-nginx-controller-admission created
service/ingress-nginx-controller created
deployment.apps/ingress-nginx-controller created
ingressclass.networking.k8s.io/nginx unchanged
validatingwebhookconfiguration.admissionregistration.k8s.io/ingress-nginx-admission created
serviceaccount/ingress-nginx-admission created
clusterrole.rbac.authorization.k8s.io/ingress-nginx-admission configured
clusterrolebinding.rbac.authorization.k8s.io/ingress-nginx-admission configured
role.rbac.authorization.k8s.io/ingress-nginx-admission created
rolebinding.rbac.authorization.k8s.io/ingress-nginx-admission created
job.batch/ingress-nginx-admission-create created
job.batch/ingress-nginx-admission-patch created

Next, I applied my ingress manifest again ->

C:\Users\Pavithra Kanmaniraja\Documents\kubernetes-sample-apps>kubectl apply -f ingressdemons1.yaml -n ingress-nginx
ingress.networking.k8s.io/doksexample-ingress created

Finally, the ingress is created now.
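If the webhook keeps failing and you are willing to forgo manifest validation, the official Helm chart can also install the controller without it (assuming a Helm-based install; note this removes the validation safety net):

```yaml
# values.yaml: skip deploying the admission webhook entirely
controller:
  admissionWebhooks:
    enabled: false
```

This has the same effect as deleting the ValidatingWebhookConfiguration by hand, but it persists across chart upgrades.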

@longwuyuan
Contributor

longwuyuan commented Aug 3, 2023 via email

@akhfzl

akhfzl commented Sep 12, 2023

ingress-nginx
ingress.networking.k8s.io/doksexample-ingress created

"ingress.networking.k8s.io/doksexample-ingress created" is output, not a command; which command produces it?

@hebabaze

You can resolve this by opening ports 443 and 8443 on each machine in your cluster.
On Ubuntu, for example, with ufw:
sudo ufw allow 8443
sudo ufw allow proto tcp from any to any port 8443

@devopstales

devopstales commented Jan 17, 2024

For me this policy solved my issue:

# Cilium format:
---
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-ingress--ingress-nginx-private
  namespace: ingress-system
spec:
  endpointSelector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/instance: ingress-private
  ingress:
    - fromEntities:
        - world
      toPorts:
        - ports:
            - port: "443"
        - ports:
            - port: "80"
    - fromEntities:
        - cluster
      toPorts:
        - ports:
            - port: "8443"
  egress:
    - {}
---
# Standard format:
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-api-server-to-ingress-webhook-server
  namespace: ingress-system
spec:
  podSelector:
    matchLabels:
      app.kubernetes.io/component: controller
      app.kubernetes.io/instance: ingress-private
  policyTypes:
    - Ingress
    - Egress
  ingress:
    - from:
        - ipBlock:
            cidr: 0.0.0.0/0
      ports:
        - port: 443
        - port: 80
        - port: 8443
    - from:
        - namespaceSelector: {}
      ports:
        - port: 443
        - port: 80
        - port: 8443
  egress: []

You need to allow communication from the API server to the ingress-controller pod inside the cluster.

@michaelRanivoEpitech

Hello everyone,

I am trying to set up a Kubernetes cluster on Azure VMs running Ubuntu, initialized with kubeadm.
For the initialization I followed the Kubernetes documentation, but when I installed the NGINX Ingress Controller I hit the same problem as this issue. It is not only the ingress controller but also cert-manager: when I try to create an Ingress for the controller, or an Issuer for cert-manager, I always get the same error.
I also bought a book on building a Kubernetes cluster and followed what it says, but the problem persists.
In cert-manager's troubleshooting list I found my error: https://cert-manager.io/docs/troubleshooting/webhook/#error-context-deadline-exceeded . I did the checks they suggest and the cert-manager webhook responds as in their example; I tried the same thing for NGINX's webhook and also got an answer.
They then say it is the api-server that cannot reach the webhooks: https://cert-manager.io/docs/troubleshooting/webhook/#error-io-timeout-connectivity-issue . What I don't understand is that I have recreated my cluster over and over and still get the same errors. Did I initialize my cluster incorrectly, or is there really a connectivity problem between the api-server and the webhooks?
If anyone has articles, documentation, or anything else to help build a Kubernetes cluster step by step, I would appreciate it.
By the way, I am still learning Kubernetes, so...
