Getting "Internal error occurred: failed calling webhook "webhook-server.webhook-demo.svc"" error #6

Open
unclerayray opened this issue Jul 20, 2020 · 1 comment

unclerayray commented Jul 20, 2020

Hello!

Thanks for preparing this example. I'm testing it on GKE and ran into a problem where the webhook service isn't reachable.

[root@gke-client-tf admission-controller-webhook-demo]# k create -f examples/pod-with-override.yaml -n webhook-demo
Error from server (InternalError): error when creating "examples/pod-with-override.yaml": Internal error occurred: failed calling webhook "webhook-server.webhook-demo.svc": Post https://webhook-server.webhook-demo.svc:443/mutate?timeout=30s: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)

I struggled to find the real root cause at first because the YAML sets failurePolicy: Ignore, which hides the failure. I had to change it to failurePolicy: Fail to surface the error above.
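
For anyone reproducing this, the policy can also be flipped in place instead of editing the YAML and re-applying it. This is a minimal sketch, not the demo's own tooling: the configuration name `demo-webhook` and the webhook index `0` are assumptions, so check `kubectl get mutatingwebhookconfigurations` for the real name first.

```shell
# Build a JSON patch that sets failurePolicy on the first webhook entry.
# Index 0 is an assumption: adjust it if the configuration lists several webhooks.
failure_policy_patch() {
  printf '[{"op":"replace","path":"/webhooks/0/failurePolicy","value":"%s"}]' "$1"
}

# Apply the patch to a MutatingWebhookConfiguration.
# $1 = configuration name (hypothetical in the usage below), $2 = Fail or Ignore.
set_failure_policy() {
  kubectl patch mutatingwebhookconfiguration "$1" \
    --type=json -p "$(failure_policy_patch "$2")"
}

# Usage (not run automatically):
#   set_failure_policy demo-webhook Fail
```

With Fail, the API server rejects the request and reports the webhook call error. With Ignore, the request is admitted without mutation and the error is not shown to the client.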

Here's what I've tried:

Get the pod IP of the webhook service

[root@gke-client-tf leilichao]# k describe svc webhook-server -n webhook-demo
Name:              webhook-server
Namespace:         webhook-demo
Labels:            <none>
Annotations:       <none>
Selector:          app=webhook-server
Type:              ClusterIP
IP:                10.10.11.218
Port:              <unset>  443/TCP
TargetPort:        8443/TCP
Endpoints:         10.1.0.168:8443
Session Affinity:  None
Events:            <none>

10.1.0.168 is the pod IP

Launch a busybox pod

I launched a pod in the same namespace as the webhook and ran curl from it against the pod IP:

[root@gke-client-tf leilichao]# kubectl -n webhook-demo run curl --image=radial/busyboxplus:curl -i --tty
If you don't see a command prompt, try pressing enter.
[ root@curl:/ ]$ hostname -i
10.1.0.37
[ root@curl:/ ]$ curl -k 10.1.0.168:8443

[ root@curl:/ ]$ curl -k 10.1.0.168:8443

[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc
Server:    10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local

Name:      webhook-server.webhook-demo.svc
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping webhook-server.webhook-demo.svc.cluster.local
PING webhook-server.webhook-demo.svc.cluster.local (10.10.11.218): 56 data bytes
^C
--- webhook-server.webhook-demo.svc.cluster.local ping statistics ---
6 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ nslookup webhook-server.webhook-demo.svc.cluster.local
Server:    10.10.11.10
Address 1: 10.10.11.10 kube-dns.kube-system.svc.cluster.local

Name:      webhook-server.webhook-demo.svc.cluster.local
Address 1: 10.10.11.218 webhook-server.webhook-demo.svc.cluster.local
[ root@curl:/ ]$ ping 10.10.11.218
PING 10.10.11.218 (10.10.11.218): 56 data bytes
^C
--- 10.10.11.218 ping statistics ---
11 packets transmitted, 0 packets received, 100% packet loss
[ root@curl:/ ]$ ping 10.1.0.168
PING 10.1.0.168 (10.1.0.168): 56 data bytes
64 bytes from 10.1.0.168: seq=0 ttl=63 time=0.211 ms
64 bytes from 10.1.0.168: seq=1 ttl=63 time=0.126 ms
64 bytes from 10.1.0.168: seq=2 ttl=63 time=0.086 ms
64 bytes from 10.1.0.168: seq=3 ttl=63 time=0.074 ms
64 bytes from 10.1.0.168: seq=4 ttl=63 time=0.084 ms
^C
--- 10.1.0.168 ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max = 0.074/0.116/0.211 ms

A few observations:

  1. The curl pod was launched with pod IP 10.1.0.37.
  2. DNS resolves the FQDN webhook-server.webhook-demo.svc.cluster.local to the service IP.
  3. The connection to the pod IP 10.1.0.168 works.
  4. Pinging the service IP 10.10.11.218 does not work (of course).

Check the webhook pod log from another session

You can see the traffic is coming through:

[root@gke-client-tf admission-controller-webhook-demo]# k get pods -n webhook-demo
NAME                              READY   STATUS    RESTARTS   AGE
curl                              1/1     Running   0          54m
webhook-server-6696bf7b88-gnlkr   1/1     Running   0          8h
[root@gke-client-tf admission-controller-webhook-demo]# k logs -n webhook-demo webhook-server-6696bf7b88-gnlkr
2020/07/20 13:30:43 http: TLS handshake error from 10.1.0.37:36756: tls: first record does not look like a TLS handshake
2020/07/20 13:30:59 http: TLS handshake error from 10.1.0.37:36812: tls: first record does not look like a TLS handshake

This confirms that traffic from the curl pod (10.1.0.37) reaches the webhook pod.
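
The "first record does not look like a TLS handshake" lines are what a TLS server logs when a client speaks plain HTTP to it: curl defaults to http:// when the URL has no scheme, and -k only takes effect for HTTPS. A minimal sketch of a probe that does speak TLS follows. The helper names are hypothetical, and the /mutate path is taken from the error message at the top. Because it uses TCP, it also exercises the Service address, which ping cannot do, since a ClusterIP typically does not answer ICMP.

```shell
# Build an https:// URL so curl performs a TLS handshake.
# $1 = host or IP, $2 = port, $3 = path (defaults to /).
https_url() {
  printf 'https://%s:%s%s' "$1" "$2" "${3:-/}"
}

# Probe an endpoint over TLS and print only the HTTP status code.
# -k skips certificate verification, -s hides progress output,
# -o /dev/null discards the body, --max-time bounds the wait in seconds.
probe_tls() {
  curl -k -s -o /dev/null -w '%{http_code}\n' --max-time 5 "$(https_url "$@")"
}

# Usage from the curl pod (not run automatically):
#   probe_tls 10.1.0.168 8443 /mutate                      # pod IP, container port
#   probe_tls webhook-server.webhook-demo.svc 443 /mutate  # Service, via the ClusterIP
```

Any HTTP status from the second call would indicate that the Service path works from inside the cluster, which narrows the problem to the path from the API server to the pod.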

FYI I'm on GKE

Now that I've confirmed the pod IP is reachable and DNS works fine, I suspect something is wrong with kube-proxy. However, I'm on GKE, where the master node isn't accessible, so I can't check kube-proxy in detail.
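
kube-proxy runs on every worker node rather than only on the master, so it can usually be inspected with ordinary kubectl access. This is a minimal sketch under two assumptions: that kube-proxy runs as pods in kube-system, and that those pods carry the label `component=kube-proxy`. Verify both with `kubectl -n kube-system get pods --show-labels` before relying on it.

```shell
# List kube-proxy pods together with the node each one runs on.
# $1 = label selector; the default is an assumption about how the
# distribution labels kube-proxy, so confirm it with --show-labels first.
kube_proxy_pods() {
  kubectl -n kube-system get pods -l "${1:-component=kube-proxy}" -o wide
}

# Print the most recent log lines of one kube-proxy pod.
# $1 = pod name taken from the listing above, $2 = number of lines (default 50).
kube_proxy_logs() {
  kubectl -n kube-system logs "$1" --tail="${2:-50}"
}

# Usage (not run automatically):
#   kube_proxy_pods
#   kube_proxy_logs <pod-name-from-listing> 100
```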

[root@gke-client-tf leilichao]# gcloud container clusters describe my-private-cluster --zone europe-west2-c
addonsConfig:
  kubernetesDashboard:
    disabled: true
  networkPolicyConfig: {}
autoscaling: {}
binaryAuthorization: {}
clusterIpv4Cidr: 10.1.0.0/16
createTime: '2020-06-24T10:59:41+00:00'
currentMasterVersion: 1.15.11-gke.15
currentNodeCount: 3
currentNodeVersion: 1.15.11-gke.15

I'm on 1.15.11-gke.15

Could you please advise where I should look to fix this issue?

Thanks in advance!

@TzlilSwimmer123

Hello @unclerayray, a long time has passed. Did you manage to figure it out? I found a poor workaround: create the Deployment and Service first and the webhook last (not all together). Thanks!
