Creating this issue here so others don't waste time. Maybe this should be in the documentation.
Setup:
K8S
Calico
hcloud CCM
hcloud CSI
Istio
Annotated the istio-ingressgateway Service with all the information needed to use a pre-provisioned (terraformed) load balancer. Each http, https, and tcp node port on the worker nodes was connectable, yet the HCloud load balancer health checks kept reporting them as unreachable / not healthy.
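For context, this is roughly what the annotations look like (a sketch; the annotation names come from the hcloud-cloud-controller-manager docs, while the load balancer name and namespace here are made-up placeholders):

apiVersion: v1
kind: Service
metadata:
  name: istio-ingressgateway
  namespace: istio-system
  annotations:
    # attach the Service to the pre-provisioned (terraformed) load balancer by name
    load-balancer.hetzner.cloud/name: "dev1-lb"
    # health-check and forward traffic over the private network
    load-balancer.hetzner.cloud/use-private-ip: "true"
spec:
  type: LoadBalancer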
Checking tcpdump on the interface and filtering on the node port, it's clear the health check packets never get ACKed:
root@dev1-worker-2:~# tcpdump -i ens10 port 31945
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on ens10, link-type EN10MB (Ethernet), capture size 262144 bytes
22:49:07.439005 IP 10.9.8.5.46310 > dev1-worker-2.cluster.local.31945: Flags [S], seq 2304935557, win 64860, options [mss 1410,sackOK,TS val 1914248445 ecr 0,nop,wscale 7], length 0
22:49:08.461531 IP 10.9.8.5.46310 > dev1-worker-2.cluster.local.31945: Flags [S], seq 2304935557, win 64860, options [mss 1410,sackOK,TS val 1914249468 ecr 0,nop,wscale 7], length 0
22:49:10.477634 IP 10.9.8.5.46310 > dev1-worker-2.cluster.local.31945: Flags [S], seq 2304935557, win 64860, options [mss 1410,sackOK,TS val 1914251484 ecr 0,nop,wscale 7], length 0
Meanwhile, the load balancer's IP (10.9.8.5, the source of those SYNs) is bound locally to the kube-ipvs0 interface:
6: kube-ipvs0: <BROADCAST,NOARP> mtu 1500 qdisc noop state DOWN group default
link/ether 0e:83:d9:ad:a1:e3 brd ff:ff:ff:ff:ff:ff
...
inet 10.9.8.5/32 scope global kube-ipvs0
valid_lft forever preferred_lft forever
It became clear why: a route was created on each of the worker nodes for the HCloud load balancer IP, so the HCloud load balancer never received the response. I came across this useful comment explaining the mechanism (#58 (comment)):
Using hcloud-cloud-controller-manager, LoadBalancer Services get to know their external IPs. This IP gets added to the kube-ipvs0 interface to allow cluster-internal access to the LoadBalancer, and a local route pointing to it is created on all nodes (see ip route show table local). If the (Hetzner) load balancer now sends a health check packet, the cluster's reply stays within the cluster, since the route points to the kube-ipvs0 interface instead of the internal network's network card.
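You can spot the offending route on any node (a sketch; the exact output may differ, but the local entry for the load balancer IP on kube-ipvs0 is the telltale sign):

root@dev1-worker-2:~# ip route show table local | grep kube-ipvs0
local 10.9.8.5 dev kube-ipvs0 proto kernel scope host src 10.9.8.5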
There are a lot of solutions under discussion, but as far as I know, nothing helpful so far. The only workaround seems to be to use iptables instead of ipvs as the kube-proxy mode (didn't try yet with a Hetzner Load Balancer). However, this comes with a performance drawback (https://www.projectcalico.org/comparing-kube-proxy-modes-iptables-or-ipvs/).
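For reference, the mode is switched in the kube-proxy configuration (a minimal sketch of the relevant KubeProxyConfiguration field; how you apply it depends on how the cluster was bootstrapped, e.g. kubeadm keeps it in the kube-proxy ConfigMap):

apiVersion: kubeproxy.config.k8s.io/v1alpha1
kind: KubeProxyConfiguration
# "iptables" avoids the kube-ipvs0 interface (and its local routes) entirely
mode: "iptables"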
As a very dirty hack and experiment, I temporarily removed the local route (ip route del local $internal_loadbalancer_ip dev kube-ipvs0 table local) and the health checks turned green immediately. However, this ugly workaround will not survive a reboot.
Currently reading about the possibility of replacing kube-proxy/IPVS with Cilium, but I've just started trying to understand things there... For now, I guess, only iptables will "work". But I'm happy to discuss and work with you and Hetzner staff to find a solution.
And a fix: #58 (comment), which is to annotate the Service with load-balancer.hetzner.cloud/hostname: your-ingress.acme.corp
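Applied to this setup, that would be something like the command below (your-ingress.acme.corp is a placeholder; point a DNS record for it at the load balancer). As I understand it, with this annotation the CCM publishes a hostname instead of an IP in the Service status, so kube-proxy never binds the load balancer IP to kube-ipvs0 and the problematic local route is never created:

kubectl -n istio-system annotate service istio-ingressgateway \
  load-balancer.hetzner.cloud/hostname=your-ingress.acme.corp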