Kube state metrics only shows metrics related to the namespace where it is running #2211

Closed
ThiagoScodeler opened this issue Sep 29, 2023 · 13 comments
Labels: kind/bug, triage/accepted

Comments

@ThiagoScodeler

What happened:
Kube state metrics only shows metrics related to the namespace where it is running.

What you expected to happen:
Kube state metrics shows metrics related to all Kubernetes cluster namespaces.

How to reproduce it (as minimally and precisely as possible):

I have an AWS EKS cluster with kube-state-metrics installed in a namespace called "monitoring". This installation uses a ServiceMonitor and other components (see YAML files below).
In this cluster, there is also a Prometheus agent running and selecting the kube-state-metrics ServiceMonitor.

kube-state-metrics is listed on the Prometheus targets page, but when I add a dashboard in Grafana to visualize these metrics, I can only see kube-state-metrics data related to the "monitoring" namespace. The EKS cluster has other namespaces, and kube-state-metrics should display metrics for all of them.

I have a similar setup for cAdvisor, and it works fine, showing metrics related to all namespaces.

Any idea why kube-state-metrics is showing only data related to the namespace it is running in?

prometheus.yaml

---
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: agent
  namespace: monitoring
spec:
  version: v2.39.1
  serviceAccountName: prometheus
  serviceMonitorSelector:
    matchLabels:
      component: prometheus-agent
  serviceMonitorNamespaceSelector:
    matchLabels:
      monitoring: prometheus-agent
  resources:
    requests:
      cpu: 500m
      memory: 1Gi
    limits:
      cpu: 500m
      memory: 1Gi
  replicas: 1
  logLevel: debug
  logFormat: logfmt
  scrapeInterval: 30s
  remoteWrite:
  - url: https://prometheus-workspace
    sigv4:
      region: us-east-1
    queueConfig:
      maxSamplesPerSend: 1000
      maxShards: 200
      capacity: 2500
  containers:
  - name: prometheus
    args:
    - --config.file=/etc/prometheus/config_out/prometheus.env.yaml
    - --storage.agent.path=/prometheus
    - --enable-feature=agent
    - --web.enable-lifecycle
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: role
            operator: In
            values:
            - monitoring
    podAntiAffinity:
      preferredDuringSchedulingIgnoredDuringExecution:
      - podAffinityTerm:
          labelSelector:
            matchExpressions:
            - key: app.kubernetes.io/name
              operator: In
              values:
              - prometheus
          topologyKey: kubernetes.io/hostname
        weight: 100
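
Note: in the spec above, serviceMonitorNamespaceSelector is a label selector over Namespace objects, so ServiceMonitors are only discovered in namespaces that carry the monitoring: prometheus-agent label. A minimal sketch of labeling the monitoring namespace accordingly (assuming it is managed declaratively; the label key/value come from the Prometheus spec above):

---
apiVersion: v1
kind: Namespace
metadata:
  name: monitoring
  labels:
    # must match spec.serviceMonitorNamespaceSelector.matchLabels above
    monitoring: prometheus-agent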

kube-state-metrics:

service-monitor.yaml

---
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kube-state-metrics
  namespace: monitoring
  labels:
    component: prometheus-agent
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  endpoints:
  - port: http-metrics

service.yaml

---
apiVersion: v1
kind: Service
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.6.0
  name: kube-state-metrics
  namespace: monitoring
spec:
  clusterIP: None
  ports:
  - name: http-metrics
    port: 8080
    targetPort: http-metrics
  - name: telemetry
    port: 8081
    targetPort: telemetry
  selector:
    app.kubernetes.io/name: kube-state-metrics
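
As a quick sanity check (a sketch, not part of the original report), you can confirm that the ServiceMonitor's selector matches this Service and that the headless Service resolves to the kube-state-metrics pod:

# should return the kube-state-metrics Service defined above
kubectl -n monitoring get service -l app.kubernetes.io/name=kube-state-metrics
# should list the pod IP(s) behind the headless Service on ports 8080/8081
kubectl -n monitoring get endpoints kube-state-metrics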

deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.6.0
  name: kube-state-metrics
  namespace: monitoring
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: kube-state-metrics
  template:
    metadata:
      labels:
        app.kubernetes.io/component: exporter
        app.kubernetes.io/name: kube-state-metrics
        app.kubernetes.io/version: 2.6.0
    spec:
      automountServiceAccountToken: true
      containers:
      - image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.6.0
        livenessProbe:
          httpGet:
            path: /healthz
            port: 8080
          initialDelaySeconds: 5
          timeoutSeconds: 5
        name: kube-state-metrics
        ports:
        - containerPort: 8080
          name: http-metrics
        - containerPort: 8081
          name: telemetry
        readinessProbe:
          httpGet:
            path: /
            port: 8081
          initialDelaySeconds: 5
          timeoutSeconds: 5
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          readOnlyRootFilesystem: true
          runAsUser: 65534
      nodeSelector:
        kubernetes.io/os: linux
      serviceAccountName: kube-state-metrics
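
For context (an observation, not from the thread): kube-state-metrics watches all namespaces by default, and the deployment above passes no flags that would change that. The single-namespace symptom described in this issue is what you would see if the container were started with a namespace restriction, as in this hypothetical args snippet:

      containers:
      - image: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.6.0
        args:
        # hypothetical restriction; NOT present in the deployment above
        - --namespaces=monitoring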

cluster-role-binding.yaml

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.6.0
  name: kube-state-metrics
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: kube-state-metrics
subjects:
- kind: ServiceAccount
  name: kube-state-metrics
  namespace: monitoring

cluster-role.yaml

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  labels:
    app.kubernetes.io/component: exporter
    app.kubernetes.io/name: kube-state-metrics
    app.kubernetes.io/version: 2.6.0
  name: kube-state-metrics
rules:
- apiGroups:
  - ""
  resources:
  - configmaps
  - secrets
  - nodes
  - pods
  - services
  - serviceaccounts
  - resourcequotas
  - replicationcontrollers
  - limitranges
  - persistentvolumeclaims
  - persistentvolumes
  - namespaces
  - endpoints
  verbs:
  - list
  - watch
- apiGroups:
  - apps
  resources:
  - statefulsets
  - daemonsets
  - deployments
  - replicasets
  verbs:
  - list
  - watch
- apiGroups:
  - batch
  resources:
  - cronjobs
  - jobs
  verbs:
  - list
  - watch
- apiGroups:
  - autoscaling
  resources:
  - horizontalpodautoscalers
  verbs:
  - list
  - watch
- apiGroups:
  - authentication.k8s.io
  resources:
  - tokenreviews
  verbs:
  - create
- apiGroups:
  - authorization.k8s.io
  resources:
  - subjectaccessreviews
  verbs:
  - create
- apiGroups:
  - policy
  resources:
  - poddisruptionbudgets
  verbs:
  - list
  - watch
- apiGroups:
  - certificates.k8s.io
  resources:
  - certificatesigningrequests
  verbs:
  - list
  - watch
- apiGroups:
  - storage.k8s.io
  resources:
  - storageclasses
  - volumeattachments
  verbs:
  - list
  - watch
- apiGroups:
  - admissionregistration.k8s.io
  resources:
  - mutatingwebhookconfigurations
  - validatingwebhookconfigurations
  verbs:
  - list
  - watch
- apiGroups:
  - networking.k8s.io
  resources:
  - networkpolicies
  - ingresses
  verbs:
  - list
  - watch
- apiGroups:
  - coordination.k8s.io
  resources:
  - leases
  verbs:
  - list
  - watch
- apiGroups:
  - rbac.authorization.k8s.io
  resources:
  - clusterrolebindings
  - clusterroles
  - rolebindings
  - roles
  verbs:
  - list
  - watch
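
Since missing list/watch permissions would also surface as missing metrics, the binding can be checked with kubectl auth can-i (a sketch; the service account name and namespace come from the manifests in this issue):

# should print "yes" if the ClusterRoleBinding grants cluster-wide pod listing
kubectl auth can-i list pods --all-namespaces \
  --as=system:serviceaccount:monitoring:kube-state-metrics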

service-account.yaml

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-state-metrics
  namespace: monitoring

Environment:
- kube-state-metrics version: registry.k8s.io/kube-state-metrics/kube-state-metrics:v2.6.0
- Kubernetes version (use kubectl version): v1.27.5-eks-43840fb
- Cloud provider or hardware configuration: AWS EKS cluster
@ThiagoScodeler added the kind/bug label Sep 29, 2023
@k8s-ci-robot added the needs-triage label Sep 29, 2023
@dashpole
Contributor

dashpole commented Oct 5, 2023

/assign @CatherineF-dev
/triage accepted

@k8s-ci-robot added the triage/accepted label and removed the needs-triage label Oct 5, 2023
@CatherineF-dev
Contributor

Hi, could you try KSM v2.7 and see whether it still has this issue?

I have a cluster running v2.7 and it doesn't have this issue.

@ThiagoScodeler
Author

Hi @CatherineF-dev, I just tried v2.7.0 and got the same issue; I only see metrics related to the "monitoring" namespace:

[screenshot: Grafana dashboard showing metrics only for the monitoring namespace]

@ThiagoScodeler
Author

@CatherineF-dev kube-state-metrics pod logs:

I1009 17:19:23.557406       1 wrapper.go:78] Starting kube-state-metrics
I1009 17:19:23.557917       1 server.go:125] "Used default resources"
I1009 17:19:23.558027       1 types.go:184] "Using all namespaces"
I1009 17:19:23.558099       1 server.go:166] "Metric allow-denylisting" allowDenyStatus="Excluding the following lists that were on denylist: "
W1009 17:19:23.558189       1 client_config.go:617] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1009 17:19:23.560029       1 server.go:311] "Tested communication with server"
I1009 17:19:23.575931       1 server.go:316] "Run with Kubernetes cluster version" major="1" minor="27+" gitVersion="v1.27.4-eks-2d98532" gitTreeState="clean" gitCommit="3d90c097c72493c2f1a9dd641e4a22d24d15be68" platform="linux/amd64"
I1009 17:19:23.576116       1 server.go:317] "Communication with server successful"
I1009 17:19:23.576396       1 server.go:263] "Started metrics server" metricsServerAddress="[::]:8080"
I1009 17:19:23.576399       1 metrics_handler.go:97] "Autosharding disabled"
I1009 17:19:23.576689       1 server.go:69] level=info msg="Listening on" address=[::]:8080
I1009 17:19:23.576770       1 server.go:69] level=info msg="TLS is disabled." http2=false address=[::]:8080
I1009 17:19:23.576812       1 server.go:252] "Started kube-state-metrics self metrics server" telemetryAddress="[::]:8081"
I1009 17:19:23.576860       1 server.go:69] level=info msg="Listening on" address=[::]:8081
I1009 17:19:23.576876       1 server.go:69] level=info msg="TLS is disabled." http2=false address=[::]:8081
I1009 17:19:23.578081       1 builder.go:257] "Active resources" activeStoreNames="certificatesigningrequests,configmaps,cronjobs,daemonsets,deployments,endpoints,horizontalpodautoscalers,ingresses,jobs,leases,limitranges,mutatingwebhookconfigurations,namespaces,networkpolicies,nodes,persistentvolumeclaims,persistentvolumes,poddisruptionbudgets,pods,replicasets,replicationcontrollers,resourcequotas,secrets,services,statefulsets,storageclasses,validatingwebhookconfigurations,volumeattachments"
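
The "Using all namespaces" line above indicates KSM is not namespace-restricted. A way to double-check the running pod's flags (a sketch, assuming the deployment name from the manifests above):

# prints the container args; an empty result means KSM runs with defaults (all namespaces)
kubectl -n monitoring get deploy kube-state-metrics \
  -o jsonpath='{.spec.template.spec.containers[0].args}'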

@CatherineF-dev
Contributor

Could you curl KSM endpoint directly to list all metrics?
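
For anyone reproducing this, a minimal sketch of that check (resource names taken from the manifests above):

# forward the metrics port from the kube-state-metrics pod
kubectl -n monitoring port-forward deploy/kube-state-metrics 8080:8080 &
# kube_namespace_created should list every namespace KSM can see
curl -s http://localhost:8080/metrics | grep '^kube_namespace_created'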

@ThiagoScodeler
Author

ThiagoScodeler commented Oct 9, 2023

@CatherineF-dev yes, I can curl KSM and visualize all metrics. I'm using this Grafana dashboard: https://grafana.com/grafana/dashboards/13332-kube-state-metrics-v2/

@CatherineF-dev
Contributor

Could you paste one KSM metric around pods?

@ThiagoScodeler
Author

@CatherineF-dev here are some metrics for a namespace other than "monitoring":

kube_pod_container_resource_limits{namespace="test-development",pod="test-deployment-475bc6cdc9-cjfr9",uid="8a5de859-f203-48fe-8113-afd002056e5648",container="test-container",node="ip-1-1-1-150.ec2.internal",resource="cpu",unit="core"} 1
::
kube_pod_container_resource_requests{namespace="test-development",pod="test-deployment-475bc6cdc9-cjfr9",uid="8a5de859-f203-48fe-8113-afd002056e5648",container="test-container",node="ip-1-1-1-150.ec2.internal",resource="cpu",unit="core"} 0.5
::
kube_pod_container_state_started{namespace="test-development",pod="test-deployment-475bc6cdc9-cjfr9",uid="8a5de859-f203-48fe-8113-afd002056e5648",container="test-container"} 1.696455268e+09
::
kube_pod_container_status_ready{namespace="test-development",pod="test-deployment-475bc6cdc9-cjfr9",uid="8a5de859-f203-48fe-8113-afd002056e5648",container="test-container"} 1

@CatherineF-dev
Contributor

It does show metrics in other namespaces, so I suspect it's an issue with the Grafana dashboard. Maybe you can contact the team that provides this dashboard.
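
One way to confirm that from Grafana (a sketch, not from the thread) is to run a query against the remote-write workspace that counts pods per namespace; results for more than one namespace mean KSM is fine and the dashboard's variables/filters are the problem:

count by (namespace) (kube_pod_info)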

Do you have other questions? If not, we will close this issue.

@CatherineF-dev
Contributor

/remove kind/bug

@ThiagoScodeler
Author

ThiagoScodeler commented Oct 10, 2023

@CatherineF-dev I'll get in contact with them. Do you have any other recommended Grafana dashboard?
Thank you for your support; no more questions from my side.

@CatherineF-dev
Contributor

I searched inside this repo and didn't find a Grafana dashboard for monitoring a cluster.
You can contribute one if you have time and would like to.

/close

@k8s-ci-robot
Contributor

@CatherineF-dev: Closing this issue.

In response to this:

I searched inside this repo and didn't find a Grafana dashboard for monitoring a cluster.
You can contribute one if you have time and would like to.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
