KubePrism does not append members discovered via the kubernetes registry #8143

Closed
Tracked by #8010

vaskozl opened this issue Jan 11, 2024 · 7 comments · Fixed by #8153

vaskozl commented Jan 11, 2024

Bug Report

I decided to try out KubePrism today and enabled the kubernetes registry, as the docs state that cluster discovery is used for the endpoints.
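
For context, KubePrism itself is toggled under machine.features.kubePrism in the machine config — a minimal sketch below, where 7445 is the default port that also shows up in the talosctl output further down; the registry settings used are listed under Environment.

      machine:
        features:
          kubePrism:
            enabled: true
            port: 7445   # default KubePrism port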

The KubePrism controller iterates over the discovered members and only appends them as endpoints if their ControlPlane struct is not nil.

However, when using the kubernetes discovery registry only machineType is set, so controlPlane is always nil.

The controlPlane struct with the port is only set when using the discovery service (and not the kubernetes discovery registry).

At present the metadata in the node annotations does not specify the port, so that would need to be added.

Environment

  • Talos version: 1.6.1
  • Discovery config:
      discovery:
        enabled: true
        registries:
          kubernetes:
            disabled: false
          service:
            disabled: true

buroa commented Jan 12, 2024

Confirmed, I am also seeing this.

JJGadgets commented

Can confirm I also see this; here are some outputs (IPs censored, TS = Tailscale on Talos) that seem to indicate so:

❯ talosctl get kubeprismconfig
NODE     NAMESPACE   TYPE              ID                        VERSION   HOST        PORT   ENDPOINTS
cp1-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   3         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp1-IP","port":6443},{"host":"cp1-TS-IPv4","port":6443}]
cp2-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   3         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp2-IP","port":6443},{"host":"cp2-TS-IPv4","port":6443}]
cp3-IP   k8s         KubePrismConfig   k8s-loadbalancer-config   4         localhost   7445   [{"host":"vip.fqdn","port":6443},{"host":"localhost","port":6443},{"host":"cp3-IP","port":6443},{"host":"VIP-IP","port":6443},{"host":"cp3-TS-IPv4","port":6443}]
❯ talosctl get kubeprismendpoint
NODE     NAMESPACE   TYPE                ID            VERSION   HOSTS                                           PORTS
cp1-IP   k8s         KubePrismEndpoint   k8s-cluster   3         vip.fqdn localhost cp1-IP cp1-TS-IPv4           6443 6443 6443 6443 6443
cp2-IP   k8s         KubePrismEndpoint   k8s-cluster   3         vip.fqdn localhost cp2-IP cp2-TS-IPv4           6443 6443 6443 6443 6443
cp3-IP   k8s         KubePrismEndpoint   k8s-cluster   4         vip.fqdn localhost cp3-IP VIP-IP cp3-TS-IPv4    6443 6443 6443 6443 6443 6443

smira commented Jan 12, 2024

Kubernetes service discovery itself requires Kubernetes API access via KubePrism, so it's a loop, and we never got it implemented in the Kubernetes registry. It can be done, but I wouldn't recommend using it; just use the discovery service.
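
A minimal sketch of that recommended setup — essentially the inverse of the registries block in the report above, keeping the hosted discovery service on and the kubernetes registry off:

      cluster:
        discovery:
          enabled: true
          registries:
            kubernetes:
              disabled: true    # the registry this issue is about
            service:
              disabled: false   # hosted Talos discovery service (the default)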

vaskozl commented Jan 12, 2024

I was wondering how a loop would be handled (would it remember the last discovered endpoints if your external LB failed?), but assumed it would work fine with the VIP.

I still see the benefit when the external cluster endpoint is set to the VIP: you can still benefit from the latency-based load balancing of the go-loadbalancer, right?

That way, even if you lose internet access, you still share the load between active backends for the static host-network pods.

smira commented Jan 12, 2024

KubePrism uses the control plane endpoint (e.g. the VIP) anyway, but the problem is that once it becomes unavailable, no updates will come from discovery, so there is no way to learn other routes.

vaskozl commented Jan 12, 2024

My reasoning is:

  • The VIP can fail over, so it's better than setting the external endpoint to a single non-highly-available external LB (e.g. haproxy on one machine). If you already have an HA external LB you probably don't really need KubePrism anyway, as the biggest value it provides for me is the smart tier-based load balancing that's been implemented!
  • When the VIP fails over, workers can still discover all the control plane nodes and load-balance between them.
  • This gives both HA and load balancing between the K8s APIs for free, without needing an HA external LB.

In this way KubePrism effectively replaces a highly available external load balancer, and keeps working even if you are air-gapped or lose internet access.

Without support for the k8s registry, you lose balancing whenever you don't have access to the public discovery service, e.g. during a gateway/router failure.

In other words, it would be valuable to have KubePrism keep balancing traffic without external dependencies.

smira commented Jan 12, 2024

Makes sense, it should be relatively easy to implement.

smira self-assigned this Jan 16, 2024
smira added a commit to smira/talos that referenced this issue Jan 18, 2024
Fixes siderolabs#8143

Signed-off-by: Andrey Smirnov <andrey.smirnov@siderolabs.com>
(cherry picked from commit 9782319)
github-actions bot locked as resolved and limited conversation to collaborators Jun 7, 2024