Allow monitoring of etcd when etcd_deployment_type=host #8002

Closed
sathieu opened this issue Sep 22, 2021 · 9 comments · Fixed by #8203
Labels
kind/feature Categorizes issue or PR as related to a new feature.

Comments

@sathieu
Contributor

sathieu commented Sep 22, 2021

What would you like to be added:

Currently, monitoring etcd is very hard for the following reasons:

  • etcd is listening on https with client cert verification
  • the keys are rotated
  • etcd endpoints are not known inside the cluster

Why is this needed:

A production cluster needs etcd to be properly monitored.

See also https://gitlab.com/kubitus-project/kubitus-installer/-/issues/13.

Proposal:

A possible way forward would be to expose the etcd metrics endpoint with a static pod on each etcd node (similar to /etc/kubernetes/manifests/nginx-proxy.yml).

@sathieu sathieu added the kind/feature Categorizes issue or PR as related to a new feature. label Sep 22, 2021
@maxpain
Contributor

maxpain commented Oct 2, 2021

Same problem here. I switched container_manager from docker to containerd and can no longer monitor etcd.

@sathieu
Contributor Author

sathieu commented Oct 3, 2021

@maxpain How was monitoring working with container_manager=docker?

@maxpain
Contributor

maxpain commented Oct 3, 2021

> @maxpain How was monitoring working with container_manager=docker?

@sathieu Everything was working fine. When etcd runs in docker, Prometheus can find it via service discovery, but that is impossible when etcd runs as a host process (not in a container).

@sathieu
Contributor Author

sathieu commented Oct 4, 2021

@maxpain Reading the code, I couldn't find how this was working.

With etcd_deployment_type=docker, a raw container is created, not a static pod. A static pod is only created when etcd_kubeadm_enabled=true. A raw container is not visible through the Kubernetes API.

etcd_kubeadm_enabled is still experimental and has known bugs (#7667, #7765).

Another thing: etcd as installed by kubespray uses server and client certificates, which makes the /metrics API unreachable.

@nikitka

nikitka commented Nov 12, 2021

@sathieu You can define etcd_metrics_port and scrape metrics from it without TLS.
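
A minimal sketch of what this suggestion could look like in practice. The port value 2381, the host addresses, and the job name are assumptions chosen for illustration and do not come from this thread; only the variable name etcd_metrics_port does.

```yaml
# File 1: kubespray inventory variables (for example, group_vars for the etcd hosts).
# 2381 is an assumed value; pick any port that is free on the etcd hosts.
etcd_metrics_port: 2381
---
# File 2: Prometheus scrape configuration fragment.
# The targets are placeholder addresses; replace them with the real etcd host IPs.
scrape_configs:
  - job_name: etcd            # hypothetical job name
    scheme: http              # the dedicated metrics port is served without TLS
    metrics_path: /metrics
    static_configs:
      - targets:
          - 192.0.2.11:2381
          - 192.0.2.12:2381
          - 192.0.2.13:2381
```

Static targets like these only work when Prometheus can reach the etcd hosts directly and the host list is maintained by hand.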

@sathieu
Contributor Author

sathieu commented Nov 12, 2021

@nikitka Thanks 🙏! Now I only need a way to create endpoints or endpointslices!

Ref: etcd_metrics_port implemented in #6092
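
For illustration, one common way to make out-of-cluster hosts discoverable inside the cluster is a Service without a selector plus a manually managed Endpoints object of the same name. The sketch below assumes a metrics port of 2381 and placeholder host IPs; the object name, namespace, and labels are hypothetical, and this is not necessarily what the eventual PR generates.

```yaml
# Headless Service with no selector: Kubernetes will not manage its endpoints,
# so the Endpoints object below supplies them by hand.
apiVersion: v1
kind: Service
metadata:
  name: etcd-metrics          # hypothetical name
  namespace: kube-system      # assumed namespace
  labels:
    app: etcd-metrics         # hypothetical label for service discovery
spec:
  clusterIP: None
  ports:
    - name: http-metrics
      port: 2381              # assumed value of etcd_metrics_port
      targetPort: 2381
---
# Endpoints object; its name and namespace must match the Service above.
apiVersion: v1
kind: Endpoints
metadata:
  name: etcd-metrics
  namespace: kube-system
  labels:
    app: etcd-metrics
subsets:
  - addresses:                # placeholder etcd host IPs
      - ip: 192.0.2.11
      - ip: 192.0.2.12
      - ip: 192.0.2.13
    ports:
      - name: http-metrics    # must match the Service port name
        port: 2381
```

With objects like these in place, Prometheus can discover the etcd hosts through its usual Kubernetes endpoints discovery instead of a hand-written target list.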

@floryut
Member

floryut commented Nov 15, 2021

Closing then, thanks @nikitka

@floryut floryut closed this as completed Nov 15, 2021
@sathieu
Contributor Author

sathieu commented Nov 15, 2021

@floryut Please reopen, only half of the answer is implemented. I'll write a PR for the second half (endpoints).

@floryut floryut reopened this Nov 15, 2021
sathieu added a commit to sathieu/kubespray that referenced this issue Nov 17, 2021
Fixes: kubernetes-sigs#8002

Signed-off-by: Mathieu Parent <mathieu.parent@insee.fr>
sathieu added a commit to sathieu/kubespray that referenced this issue Nov 17, 2021
Fixes: kubernetes-sigs#8002

Signed-off-by: Mathieu Parent <math.parent@gmail.com>
sathieu added a commit to sathieu/kubespray that referenced this issue Nov 17, 2021
Fixes kubernetes-sigs#8002

Signed-off-by: Mathieu Parent <math.parent@gmail.com>
@sathieu
Contributor Author

sathieu commented Nov 17, 2021

Thanks for reopening @floryut. PR proposed in #8203.
