
"Metrics not available at the moment" on minikube , prometheus installed via Lens > Settings > Lens metrics #5052

Closed
ecerulm opened this issue Mar 21, 2022 · 13 comments · Fixed by #6679
Labels
area/metrics (All the things related to metrics) · bug (Something isn't working)

Comments


ecerulm commented Mar 21, 2022

Describe the bug

I get "metrics not available at the moment" for all pods, even though prometheus is installed using Lens itself.

To Reproduce

minikube delete
minikube start
Start Lens.app > Catalog > minikube
minikube > Settings > Lens Metrics

  • Enable bundled Prometheus metrics stack (check)
  • Enable bundled kube-state-metrics stack (check)
  • Enable bundled node-exporter stack (check)
  • Apply

minikube > Settings > Metrics > Prometheus > Lens

minikube kubectl -- -n lens-metrics get all
NAME                                     READY   STATUS    RESTARTS   AGE
pod/kube-state-metrics-95ccdf888-tkqzz   1/1     Running   0          98s
pod/node-exporter-68fc7                  1/1     Running   0          98s
pod/prometheus-0                         1/1     Running   0          98s

NAME                         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/kube-state-metrics   ClusterIP   10.97.199.201   <none>        8080/TCP   98s
service/node-exporter        ClusterIP   None            <none>        80/TCP     98s
service/prometheus           ClusterIP   10.107.181.49   <none>        80/TCP     98s

NAME                           DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/node-exporter   1         1         1       1            1           <none>          98s

NAME                                 READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kube-state-metrics   1/1     1            1           98s

NAME                                           DESIRED   CURRENT   READY   AGE
replicaset.apps/kube-state-metrics-95ccdf888   1         1         1       98s

NAME                          READY   AGE
statefulset.apps/prometheus   1/1     98s

Run a pod

 kubectl run  -ti --rm "test-$RANDOM" --image=ecerulm/ubuntu-tools:latest
root@test-1720:/# apt-get install stress
root@test-1720:/# stress --cpu 1


Expected behavior
I expect it to show CPU metrics for the pods, or a debug log somewhere that tells me why there are no metrics. As far as I know, Lens is doing PromQL queries against prometheus-server, but I don't know exactly which queries, or why they come back empty.

Screenshots
[screenshot]

Environment (please complete the following information):

  • Lens Version: 5.4.3-latest.20220317.1
  • OS: macOS Big Sur 11.6.2
  • Installation method (e.g. snap or AppImage in Linux): dmg


Additional context

 minikube kubectl -- port-forward -n lens-metrics service/prometheus 8080:80
Forwarding from 127.0.0.1:8080 -> 9090
Forwarding from [::1]:8080 -> 9090

From the browser I can see that container_cpu_usage_seconds_total{pod="test-1720"} has results, so I guess that Lens is doing some other query, but it's not clear which one.
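
The same check can be done from the command line against the Prometheus HTTP API (a sketch; it assumes the port-forward above is still running on localhost:8080 and that jq is installed):

 curl -sG 'http://localhost:8080/api/v1/query' \
   --data-urlencode 'query=container_cpu_usage_seconds_total{pod="test-1720"}' \
   | jq '.data.result | length'   # non-zero means the raw cAdvisor series exist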

ecerulm added the bug label on Mar 21, 2022

ecerulm commented Mar 21, 2022

I think that Lens.app will do the following PromQL query

case "cpuUsage":
return `sum(rate(container_cpu_usage_seconds_total{container!="POD",container!="",pod=~"${opts.pods}",namespace="${opts.namespace}"}[${this.rateAccuracy}])) by (${opts.selector})`;

So I tried that against the prometheus server

minikube kubectl -- port-forward -n lens-metrics service/prometheus 8080:80

And performed sum(rate(container_cpu_usage_seconds_total{container!="", image!="", pod=~"test-32677", namespace="default"}[1m])) by (pod), and I get an empty result from the prometheus server:


But if I try with a longer rateAccuracy, prometheus will actually return metrics:

sum(rate(container_cpu_usage_seconds_total{container!="", image!="", pod=~"test-32677", namespace="default"}[20m])) by (pod)

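A way to narrow this down (a sketch, assuming the same port-forward to localhost:8080 as above) is to count how many raw samples actually fall inside the 1m window, since rate() returns an empty result unless at least two samples are present in the range:

 curl -sG 'http://localhost:8080/api/v1/query' \
   --data-urlencode 'query=count_over_time(container_cpu_usage_seconds_total{pod=~"test-32677", namespace="default"}[1m])' \
   | jq '.data.result'   # fewer than 2 samples per series would explain the empty rate(...[1m])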

Nokel81 added the area/metrics label on Mar 21, 2022

Nokel81 commented Mar 21, 2022

It takes some time before we expect metrics to start appearing. However, we can certainly improve the UI here to make that more clear.

Nokel81 added this to the 5.5.0 milestone on Mar 21, 2022

ecerulm commented Mar 21, 2022

That pod "test-32677" has been running for 1 hour and its metrics are still not showing up in Lens. Also, if I try the same thing but install kube-prometheus (prometheus operator) instead, then metrics appear in Lens.app after 2 minutes.

minikube delete && minikube start --kubernetes-version=v1.23.0 --memory=6g --bootstrapper=kubeadm --extra-config=kubelet.authentication-token-webhook=true --extra-config=kubelet.authorization-mode=Webhook --extra-config=scheduler.bind-address=0.0.0.0 --extra-config=controller-manager.bind-address=0.0.0.0
minikube addons disable metrics-server
minikube kubectl -- apply --server-side -f manifests/setup
minikube kubectl -- apply -f manifests/

and then change the provider in Lens to Prometheus Operator with monitoring/prometheus-k8s:9090, and the CPU chart in Lens.app works almost right away. But I haven't managed to get the Lens Metrics or Helm options to work on minikube.
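
For reference, the manifests/setup and manifests/ paths in the commands above come from the kube-prometheus repository; a minimal sketch of the full sequence (assuming its default branch is compatible with Kubernetes v1.23) is:

 git clone https://github.com/prometheus-operator/kube-prometheus.git
 cd kube-prometheus
 minikube kubectl -- apply --server-side -f manifests/setup
 minikube kubectl -- apply -f manifests/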

Nokel81 modified the milestones: 5.5.0, 5.5.1 on May 11, 2022
Nokel81 modified the milestones: 5.5.1, 5.5.2, 5.5.3 on May 26, 2022
Nokel81 modified the milestones: 5.5.3, 5.5.4 on Jun 2, 2022
@tapanhalani

I am seeing the same behaviour for one of my clusters, whereas another cluster in the same workspace works perfectly fine and shows all metrics. Here are more details:

Lens Version: 5.4.6
OS: Ubuntu 20.04
Installation method : .deb package

Here are my observations with 2 of my clusters in lens:

(1.)
Cloud provider: Azure
Service: Azure kubernetes service
k8s version: 1.22
prometheus installation: Manually installed Helm chart (https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus), in namespace "metrics"

When viewing this cluster in Lens, I don't even need to set Settings -> Metrics to "Helm". With the default setting of "auto detect", all the metrics (i.e. cluster, node, pod, etc.) are visible.

(2.)
Cloud provider: AWS
Service: AWS EKS
k8s version: 1.22
prometheus installation: Manually installed Helm chart (https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus), in namespace "metrics"

When viewing this cluster in Lens, the default "auto detect" setting causes the metrics charts to try loading for 5 minutes and then show the message "metrics are not available at the moment".

Upon changing this setting to "Helm" and providing the Prometheus service address as "metrics/prometheus:80", I see the same behaviour: a 5-minute wait, then "metrics not available".

I even tried removing the Prometheus Helm release and installing the lens-metrics stack (enabling Prometheus, kube-state-metrics, and node-exporter under Settings -> Lens Metrics), but I still see the same behaviour.

Nokel81 modified the milestones: 5.5.4, 5.5.5 on Jun 13, 2022
Nokel81 modified the milestones: 5.5.5, 5.7.0 on Jun 27, 2022
Nokel81 removed this from the 6.1.0 milestone on Sep 8, 2022

nanirover commented Nov 9, 2022

(quoting @tapanhalani's comment above)

I'm experiencing the same issue after deploying the same chart in Azure AKS, even though you mentioned you are able to pull all the metrics in Lens.
https://github.com/prometheus-community/helm-charts/tree/prometheus-14.12.0/charts/prometheus
Prometheus: 14.12.0
Lens Version: 6.0
K8s Version: 1.19.11
To enable pulling metrics (CPU, memory, disk), do we need to customize anything on the Helm chart side?
Thanks in advance.


wizpresso-steve-cy-fan commented Nov 29, 2022

I observed that the query works if you remove container!="", image!="".

In my case it is

sum(rate(container_cpu_usage_seconds_total{pod=~"camel-k-operator-84c8c9d56b-s5l6k", namespace="default"}[1m])) by (pod)

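One way to confirm which labels the cAdvisor series actually carry (a sketch, assuming Prometheus is reachable on localhost:9090, e.g. via kubectl port-forward) is to list the matching series and inspect their label sets; if container or image is missing or empty, the container!="", image!="" matchers in Lens's query filter everything out:

 curl -sG 'http://localhost:9090/api/v1/series' \
   --data-urlencode 'match[]=container_cpu_usage_seconds_total{pod=~"camel-k-operator-84c8c9d56b-s5l6k"}' \
   | jq '.data[] | {container, image}'   # null or "" values here explain why the filtered query is empty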

@wizpresso-steve-cy-fan

cc #5660 (comment)


Nokel81 commented Nov 29, 2022

@wizpresso-steve-cy-fan Which provider do you have set in your cluster preferences?


Nokel81 commented Nov 29, 2022

Thanks for bringing this up.


Nokel81 commented Nov 29, 2022

Though I think you should probably be using the "Helm 14.x" provider.

@wizpresso-steve-cy-fan

@Nokel81 I used the Prometheus Operator provider, since that is the way I installed it. I used a Helm chart to install the operator.


Nokel81 commented Nov 30, 2022

I guess my question was mostly directed at @nanirover.


Nokel81 commented Nov 30, 2022

@ecerulm Did you try changing the scrape_interval for your install? We have it set to 15s, which should mean that the 1m rate interval can collect 2 or more scrapes.

I guess your problem is that the scrapes seem to be failing quite often.

NOTE: the above PR is for fixing @wizpresso-steve-cy-fan's issue
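
For anyone hitting this, one way to check whether scrapes are failing (a sketch, assuming the bundled Prometheus is port-forwarded to localhost:8080 as earlier in the thread) is to look at the success ratio of the up metric per target over the last hour; values well below 1 mean frequently failing scrapes:

 curl -sG 'http://localhost:8080/api/v1/query' \
   --data-urlencode 'query=avg_over_time(up[1h])' \
   | jq '.data.result[] | {job: .metric.job, instance: .metric.instance, ratio: .value[1]}'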
