Exposed load balancer IP causes cluster internal traffic fail together with the proxy protocol #15

Closed
megian opened this issue Aug 20, 2024 · 14 comments


megian commented Aug 20, 2024

The cloudscale cloud controller sets the IP in the Service object's status at .status.loadBalancer.ingress.ip. As a result, the Kubernetes cluster routes traffic to that IP internally instead of sending it through the load balancer. This is normally a good thing, as it is faster and has less overhead.

However, this causes a big headache if the final system expects something that the load balancer adds in between, in this case the proxy protocol header. Internal traffic sent to the ingress controller is simply invalid, because it bypasses the load balancer and therefore arrives without the proxy protocol header.
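
To illustrate the failure mode, here is a minimal sketch in Go of the check a proxy-protocol-aware backend effectively performs on the first bytes of a connection. The function name `hasProxyHeader` and the sample payloads are hypothetical; only the v1 prefix and the v2 signature come from the PROXY protocol specification.

```go
package main

import (
	"bytes"
	"fmt"
)

// proxyV1Prefix starts every human-readable PROXY protocol v1 header line.
var proxyV1Prefix = []byte("PROXY ")

// proxyV2Signature is the fixed 12-byte signature that starts every binary v2 header.
var proxyV2Signature = []byte{0x0D, 0x0A, 0x0D, 0x0A, 0x00, 0x0D, 0x0A, 0x51, 0x55, 0x49, 0x54, 0x0A}

// hasProxyHeader (hypothetical helper) reports whether the first bytes of a
// connection look like a PROXY protocol header, either v1 or v2.
func hasProxyHeader(firstBytes []byte) bool {
	return bytes.HasPrefix(firstBytes, proxyV1Prefix) ||
		bytes.HasPrefix(firstBytes, proxyV2Signature)
}

func main() {
	// Traffic that passed through the load balancer carries the header.
	viaLoadBalancer := []byte("PROXY TCP4 192.0.2.10 192.0.2.20 51234 443\r\nGET / HTTP/1.1\r\n")
	// Traffic short-circuited inside the cluster starts with the payload itself.
	direct := []byte("GET / HTTP/1.1\r\nHost: example.ch\r\n")

	fmt.Println(hasProxyHeader(viaLoadBalancer)) // true
	fmt.Println(hasProxyHeader(direct))          // false
}
```

A backend that requires the header rejects the second kind of connection, which is what happens to cluster-internal traffic here.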

This seems to be a long-standing issue with Kubernetes. As soon as the IP is known to Kubernetes, the internal path gets enabled. A solution is planned, but it will take time until it is stable and available in production environments.

There is a workaround that AWS, for example, has implemented: set the hostname instead of the IP.

Contributor

href commented Aug 20, 2024

Thanks for reporting this, together with useful references. I'll analyze this and get back to you.

Author

megian commented Aug 20, 2024

One effect of this is that cert-manager is unable to issue certificates: it checks the challenge URL itself ahead of time, and because this verification step fails, the certificate is never issued, even though the challenge is reachable from outside by Let's Encrypt.

Contributor

href commented Aug 20, 2024

Okay, I see the problem with the current implementation. I would probably fix this via an annotation, like DigitalOcean did, together with a similar disclaimer:

https://github.com/digitalocean/digitalocean-cloud-controller-manager/blob/2b8677c5c9f5a32a1ebe7b92cbd2b9687fee6eaa/docs/controllers/services/examples/README.md#https-or-http2-load-balancer-with-hostname

The only thing is that you would have to provide a domain name that points at the right IP address iiuc. It's worth mentioning that we have a DNS entry for each customer IPv4 address, which could be used (e.g. 1-2-3-4.cust.cloudscale.ch). If you want to automate that on your end, you likely could do it that way (if you do not have a domain ready in all cases).
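
If you want to automate this, the hostname can be derived from the IP along these lines. This is only a sketch: the function name `custHostname` is hypothetical, and the naming scheme is assumed from the single example above (1.2.3.4 becomes 1-2-3-4.cust.cloudscale.ch), so please verify it against an actual DNS lookup before relying on it.

```go
package main

import (
	"fmt"
	"net/netip"
	"strings"
)

// custDomain is the suffix taken from the example in the comment above.
const custDomain = "cust.cloudscale.ch"

// custHostname (hypothetical helper) maps an IPv4 address to the per-customer
// DNS name, assuming the scheme "dots become dashes, then append the suffix".
func custHostname(ip string) (string, error) {
	addr, err := netip.ParseAddr(ip)
	if err != nil {
		return "", fmt.Errorf("parse %q: %w", ip, err)
	}
	// The comment above only mentions DNS entries for IPv4 addresses.
	if !addr.Is4() {
		return "", fmt.Errorf("%s is not an IPv4 address", addr)
	}
	return strings.ReplaceAll(addr.String(), ".", "-") + "." + custDomain, nil
}

func main() {
	host, err := custHostname("1.2.3.4")
	if err != nil {
		panic(err)
	}
	fmt.Println(host) // 1-2-3-4.cust.cloudscale.ch
}
```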

Unless I discover something I'm not seeing I would therefore add an annotation for you to trigger the workaround-behavior. Does that sound reasonable? Do you have a way to work around this issue at the moment for your cluster? How high of a priority is this for you?

Author

megian commented Aug 20, 2024

@href The approach of setting the hostname is fine. I think it can be any arbitrary name: unlike the IP, the hostname should not be used by Kubernetes for internal routing. As we currently have no clusters on v1.29, spending time on a final solution via .status.loadBalancer.ingress.ipMode wouldn't help right now.

Currently it blocks the setup of Keycloak, which we require for further dependent services. As a short-term solution we might remove the proxy protocol, which in turn means losing access to the source IP. I think we can work without it for some days, but not for weeks.

Contributor

href commented Aug 20, 2024

Got it, I'll try to tackle the problem this or next week. I'll get back to you once I have something testable. I assume you can run a test-build against a cluster to try it out, once I have the feature ready.

Author

megian commented Aug 20, 2024

> I assume you can run a test-build against a cluster to try it out, once I have the feature ready.

This should be doable.

href added a commit that referenced this issue Aug 22, 2024
This is accomplished with two new annotations:

- `k8s.cloudscale.ch/loadbalancer-force-hostname`
- `k8s.cloudscale.ch/loadbalancer-ip-mode`

The former forces a hostname to be reported for loadbalancer ingress,
the latter adds support for the new IPMode config available by default
on Kubernetes 1.30, and feature-gated on 1.29.

This is required for clusters that use the `proxy` or `proxyv2` protocol
for any of their loadbalancers, and send traffic from inside the cluster
to the loadbalancers.

In such a constellation, traffic may not be sent through the loadbalancer,
unless the hostname is set (for older clusters).

For newer clusters, the default "IP Mode" used is "Proxy", as that is the
least surprising setting.

References:

- https://kubernetes.io/blog/2023/12/18/kubernetes-1-29-feature-loadbalancer-ip-mode-alpha/
- #15
href added a commit that referenced this issue Aug 22, 2024
href added a commit that referenced this issue Aug 23, 2024
href added a commit that referenced this issue Aug 23, 2024
href added a commit that referenced this issue Aug 23, 2024
href added a commit that referenced this issue Aug 23, 2024
href added a commit that referenced this issue Aug 23, 2024
href added a commit that referenced this issue Aug 23, 2024
Contributor

href commented Aug 23, 2024

After a bit of a battle with GitHub CI this feature is ready for testing. I created a preview release that you can use on your cluster:
https://github.com/cloudscale-ch/cloudscale-cloud-controller-manager/releases/tag/1.1.0-rc.1

For your pre-1.30 clusters, you can now set the following annotation to a hostname that points at the load balancer:

k8s.cloudscale.ch/loadbalancer-force-hostname

This in turn causes status.loadBalancer.ingress[0].hostname to be set to the annotation value (no other ingress items will be added).

After 1.30, this should not be needed, as the new Proxy mode is going to be the default. If you require the old VIP mode, you would have to enforce that. See e3b8612
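
The behavior described above can be sketched roughly as follows. This is not the actual controller code: the `ingressEntry` type and the `buildIngress` function are hypothetical, and the sketch assumes the `loadbalancer-ip-mode` annotation takes the same values as the Kubernetes `ipMode` field ("Proxy" or "VIP").

```go
package main

import "fmt"

// Annotation names as introduced in the referenced commit.
const (
	annotationForceHostname = "k8s.cloudscale.ch/loadbalancer-force-hostname"
	annotationIPMode        = "k8s.cloudscale.ch/loadbalancer-ip-mode"
)

// ingressEntry (hypothetical type) mirrors the relevant fields of one
// status.loadBalancer.ingress item.
type ingressEntry struct {
	IP       string
	Hostname string
	IPMode   string
}

// buildIngress (hypothetical function) decides what is reported in the
// Service status for the given annotations and load balancer IPs.
func buildIngress(annotations map[string]string, lbIPs []string) []ingressEntry {
	// A forced hostname replaces all IP entries, so Kubernetes has no IP
	// for which it could set up a cluster-internal shortcut.
	if host := annotations[annotationForceHostname]; host != "" {
		return []ingressEntry{{Hostname: host}}
	}

	// Otherwise report the IPs, defaulting to "Proxy" mode (assumed value
	// spelling), which asks Kubernetes to send traffic via the load balancer.
	mode := annotations[annotationIPMode]
	if mode == "" {
		mode = "Proxy"
	}
	entries := make([]ingressEntry, 0, len(lbIPs))
	for _, ip := range lbIPs {
		entries = append(entries, ingressEntry{IP: ip, IPMode: mode})
	}
	return entries
}

func main() {
	ips := []string{"192.0.2.10"}

	forced := map[string]string{annotationForceHostname: "ingress-public.example.ch"}
	fmt.Printf("%+v\n", buildIngress(forced, ips))

	// No annotations: the IP is reported with the default mode.
	fmt.Printf("%+v\n", buildIngress(nil, ips))
}
```

On clusters that do not honor `ipMode` (before 1.29, or 1.29 without the feature gate), only the first branch avoids the internal shortcut, which is why the hostname annotation exists.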

@megian Can you try this out on your end and get back to me?

Author

megian commented Aug 23, 2024

@href Thanks for the fast update. I will try it at the beginning of next week!

Author

megian commented Aug 26, 2024

@href From my point of view this works as expected.

After adding the annotation k8s.cloudscale.ch/loadbalancer-force-hostname, it no longer exposes the IP in the status:

$ kubectl -n openshift-ingress get svc router-public-lb -o yaml | yq .status
loadBalancer:
  ingress:
    - hostname: ingress-public.example.ch

Cilium no longer has a cluster-internal service endpoint for the IP (which it had before):

$ k --as=cluster-admin  -n cilium exec cilium-jtnt7 -- cilium service list | grep -A1 x.x.x.x

All the connections to the proxy protocol enabled OpenShift ingress are working now. Many thanks!

I'm not sure whether the new Proxy mode is available on Kubernetes 1.30 by default, because it is still beta and, since Kubernetes 1.24, new beta APIs are not enabled by default.

Contributor

href commented Aug 26, 2024

Thanks for your feedback. @alakae is doing a code review before we make an official release, but we think we should be able to release tomorrow or Wednesday.

> Not sure whether the new Proxy mode is available on Kubernetes 1.30 by default, because it's still beta and, since Kubernetes 1.24, new beta APIs are not enabled by default.

According to the release notes it is:
https://kubernetes.io/blog/2024/04/17/kubernetes-v1-30-release/#make-kubernetes-aware-of-the-loadbalancer-behaviour-sig-network-https-github-com-kubernetes-community-tree-master-sig-network

Also, in our integration tests, where we ran Vanilla 1.30.4, the IPMode setting was successfully tested:
https://github.com/cloudscale-ch/cloudscale-cloud-controller-manager/actions/runs/10522695378/job/29155932836

See

// On newer Kubernetes releases, the defaults just work
newer, err := kubeutil.IsKubernetesReleaseOrNewer(s.k8s, 1, 30)
s.Assert().NoError(err)
if newer {
	s.ExposeDeployment("http-echo", 80, 80, map[string]string{
		"k8s.cloudscale.ch/loadbalancer-pool-protocol": "proxy",
	})
	s.T().Log("Testing PROXY protocol on newer Kubernetes releases")
	used = s.RunJob("curlimages/curl", 90*time.Second, "curl", "-s", url)
	s.Assert().Equal("true\n", used)
}

Testing bugs notwithstanding, I think this should just work with 1.30.

Author

megian commented Aug 26, 2024

@href Good to hear that it should work on Kubernetes v1.30 by default. I can't prove it yet, as the latest OpenShift is still on Kubernetes v1.29.

href added a commit that referenced this issue Aug 27, 2024
Contributor

href commented Aug 27, 2024

We have released 1.1.0: https://github.com/cloudscale-ch/cloudscale-cloud-controller-manager/releases/tag/1.1.0

Let me know if this solves the problem for you, so we can close this ticket.

Author

megian commented Aug 28, 2024

@href Many thanks. I upgraded to v1.1.0 and it seems to work as expected.

megian pushed a commit to projectsyn/component-cloudscale-cloud-controller-manager that referenced this issue Aug 28, 2024
Contributor

href commented Aug 28, 2024

Nice, thanks for confirming!

href closed this as completed Aug 28, 2024