Fix errors printed by kubernetes-static binary #149

Closed
invidian opened this issue Jul 9, 2021 · 0 comments
Labels
bug Categorizes issue or PR as related to a bug.

invidian commented Jul 9, 2021

Right now, the kubernetes-static binary, which is used mainly for testing, prints a lot of errors, which makes it very hard to use. Example (with slightly improved formatting):

WARN[0000] Job kube-state-metrics ran with errors: populate errors:
  - error populating metric for entity ID default_cockroachdb-public: cannot fetch value for metric "selector.*": related metric not found. Metric: apiserver_kube_service_spec_selectors service:default_cockroachdb-public
  - error populating metric for entity ID default_cockroachdb: cannot fetch value for metric "selector.*": related metric not found. Metric: apiserver_kube_service_spec_selectors service:default_cockroachdb
  - error populating metric for entity ID kube-system_metrics-server: cannot fetch value for metric "selector.*": related metric not found. Metric: apiserver_kube_service_spec_selectors service:kube-system_metrics-server
  - error populating metric for entity ID default_kubernetes: cannot fetch value for metric "selector.*": related metric not found. Metric: apiserver_kube_service_spec_selectors service:default_kubernetes
  - error populating metric for entity ID kube-system_kube-dns: cannot fetch value for metric "selector.*": related metric not found. Metric: apiserver_kube_service_spec_selectors service:kube-system_kube-dns
  - error populating metric for entity ID minikube: cannot fetch value for metric "condition.*": metric "kube_node_status_condition" not found
  - error generating entity ID for kube-system_coredns-5c98db65d4-pgnvj: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_metrics-server-67fb648c5-xcx7h: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for default_sh: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kindnet-hdg22: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kube-apiserver-minikube: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for default_cockroachdb-2: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kube-controller-manager-minikube: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_storage-provisioner: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kube-state-metrics-6766c6d46b-rwzpw: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kube-scheduler-minikube: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_kube-proxy-njc6c: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_coredns-5c98db65d4-m76hk: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for kube-system_etcd-minikube: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for default_cockroachdb-0: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found
  - error generating entity ID for default_cockroachdb-1: cannot fetch label pod for metric kube_pod_status_phase, metric "kube_pod_status_phase" not found

This should be fixed to make the tool more useful.

I suspect those errors might also be an indicator of some bugs. For example, I don't see any reference to the apiserver_kube_service_spec_selectors metric.

invidian added the bug label Jul 12, 2021
invidian added a commit that referenced this issue Jul 12, 2021
Part of #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
invidian added a commit that referenced this issue Jul 12, 2021
Without this patch, the following error is printed:

"entity name and type are required when defining one"

Part of #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
invidian added a commit that referenced this issue Jul 13, 2021
Those metrics may not always be available for various reasons; marking
them as optional stops them from being reported as errors.

If I recall correctly:
- createdAt, createdKind, createdBy, deploymentName metrics won't be
  available for e.g. static pods.
- reason, message metrics will only be available for failing pods.
- cpuRequestedCores, cpuLimitCores, memoryRequestedBytes,
  memoryLimitBytes will only be calculated for pods with resource
  requests configured.
- pvc* metrics will only be available for volumes backed by an actual PVC,
  not for EmptyDir volumes etc.

Refs #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
roobre added a commit that referenced this issue Jul 16, 2021
* cmd/kubernetes-static/readme.md: remove trailing whitespace

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/readme.md: fix documentation

Right now the binary actually expects the working directory to be the root
of the repository.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* go.mod: bump Go version to 1.16

To indicate that Go 1.16 should be used for building, so we can safely
use the go:embed directive for kubernetes-static.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/data: fix location of api-server metrics

Integration expects it at "api-server" while it was at "apiserver".

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: print newline at the end of execution

So that the shell prompt does not appear on the same line as the end of
the output. That improves readability and usability, while it shouldn't
break any consumers, as trailing whitespace is usually handled properly.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/metric: automatically improve formatting

Using 'gci' formatter.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static: use go:embed for serving static data

This way, the binary is independent of the host file system, which only
matters at build time.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: simplify service list initialization

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: simplify mock client initialization

In the majority of cases there is no need to use the 'new' built-in.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
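
As a rough illustration of the simplification described above, the sketch below contrasts `new` with a composite literal. The `mockClient` type and its `Services` field are hypothetical and are not taken from the repository.

```go
package main

import "fmt"

// mockClient is a hypothetical type used only for illustration.
type mockClient struct {
	Services []string
}

// withNew allocates a zero value first and fills the fields afterwards.
func withNew() *mockClient {
	c := new(mockClient)
	c.Services = []string{"kube-dns"}

	return c
}

// withLiteral allocates and initializes in a single expression.
func withLiteral() *mockClient {
	return &mockClient{Services: []string{"kube-dns"}}
}

func main() {
	fmt.Println(withNew().Services, withLiteral().Services)
}
```

Both functions return an equivalent pointer; the composite literal just keeps allocation and initialization together.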

* cmd/kubernetes-static/main.go: simplify returned endpoint address

There is no need to use 'localhost' explicitly.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: remove unnecessary Sleep

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: improve imports naming

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static: use only single source file

So it's more intuitive to use 'go run'.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: add missing Service objects

Part of #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: fix collecting API server metrics

Without this patch, the following error is printed:

"entity name and type are required when defining one"

Part of #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: use constants for controlplane components

So they are less likely to get out of sync with other parts of the code.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/definition: improve variable names a bit

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/metric/definition.go: mark some metrics as optional

Those metrics may not always be available for various reasons; marking
them as optional stops them from being reported as errors.

If I recall correctly:
- createdAt, createdKind, createdBy, deploymentName metrics won't be
  available for e.g. static pods.
- reason, message metrics will only be available for failing pods.
- cpuRequestedCores, cpuLimitCores, memoryRequestedBytes,
  memoryLimitBytes will only be calculated for pods with resource
  requests configured.
- pvc* metrics will only be available for volumes backed by an actual PVC,
  not for EmptyDir volumes etc.

Refs #149

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
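
The idea of an optional metric can be sketched roughly as follows. This is not the code from src/metric/definition.go: the `spec` type, the `populate` function and the `isReady` and `status` metric names are assumptions made for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical sentinel error for a missing raw metric.
var errNotFound = errors.New("metric not found")

// spec is a hypothetical, simplified metric definition.
type spec struct {
	Name     string
	Optional bool
}

// populate copies the requested metrics from raw. A missing metric is
// reported as an error only when its spec is not marked as optional.
func populate(specs []spec, raw map[string]float64) (map[string]float64, []error) {
	out := make(map[string]float64, len(specs))

	var errs []error

	for _, s := range specs {
		value, ok := raw[s.Name]
		if !ok {
			if !s.Optional {
				errs = append(errs, fmt.Errorf("populating metric %q: %w", s.Name, errNotFound))
			}

			continue
		}

		out[s.Name] = value
	}

	return out, errs
}

func main() {
	specs := []spec{
		{Name: "isReady"},
		{Name: "cpuLimitCores", Optional: true}, // absent for pods without limits
		{Name: "status"},
	}

	out, errs := populate(specs, map[string]float64{"isReady": 1})
	fmt.Println(len(out), "populated,", len(errs), "errors")
}
```

Here the missing `cpuLimitCores` value is skipped silently, while the missing required `status` value still produces an error.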

* src/definition/fetch.go: improve error messages a bit

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* cmd/kubernetes-static/main.go: mock newer Kubernetes version

To at least align with the version of the latest test data we have.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/definition/populate.go: improve error message formatting

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/ksm/group.go: improve errors a bit

To use standard formatting and some minimal error annotation to make
error tracing easier while debugging.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
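
A minimal sketch of what "minimal error annotation" usually looks like in Go, using `fmt.Errorf` with the `%w` verb. The function names and the sentinel error below are hypothetical and do not come from src/ksm/group.go; only the label and metric names are borrowed from the log in this issue.

```go
package main

import (
	"errors"
	"fmt"
)

// errLabelNotFound is a hypothetical sentinel error.
var errLabelNotFound = errors.New("label not found")

// fetchLabel names both the label and the metric in its error, so it is
// clear which is which.
func fetchLabel(labels map[string]string, label, metric string) (string, error) {
	value, ok := labels[label]
	if !ok {
		return "", fmt.Errorf("label %q for metric %q: %w", label, metric, errLabelNotFound)
	}

	return value, nil
}

// entityID adds one layer of context while keeping the cause inspectable.
func entityID(labels map[string]string) (string, error) {
	pod, err := fetchLabel(labels, "pod", "kube_pod_status_phase")
	if err != nil {
		return "", fmt.Errorf("generating entity ID: %w", err)
	}

	return labels["namespace"] + "_" + pod, nil
}

func main() {
	_, err := entityID(map[string]string{"namespace": "default"})
	fmt.Println(err)
	fmt.Println(errors.Is(err, errLabelNotFound))
}
```

Each layer prefixes a short description of what it was doing, so the final message reads as a trace from the outermost operation to the root cause, and `errors.Is` still matches the wrapped sentinel.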

* src/ksm/group.go: small styling improvements

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet: small styling improvements

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/metric/metric.go: simplify loop logic

Loop labels are not really needed, as they always stop the closest loop
anyway.

The check for a nil Containers slice is also not needed, since iterating
over a nil slice results in no iterations anyway.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
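
Both points can be checked with a small, self-contained sketch. The `pod` and `container` types below are hypothetical stand-ins, not the types used in src/kubelet/metric/metric.go.

```go
package main

import "fmt"

// container and pod are hypothetical stand-ins for the kubelet types.
type container struct{ Name string }

type pod struct {
	Name       string
	Containers []container // may be nil
}

// findContainer returns the name of the first pod that has a container
// with the given name.
func findContainer(pods []pod, name string) (string, bool) {
	for _, p := range pods {
		// No nil check needed: ranging over a nil slice runs zero iterations.
		for _, c := range p.Containers {
			if c.Name == name {
				return p.Name, true
			}
		}
	}

	return "", false
}

// countBeforeStop counts, per row, the values that precede the first
// occurrence of stop.
func countBeforeStop(rows [][]int, stop int) int {
	n := 0

	for _, row := range rows {
		for _, v := range row {
			if v == stop {
				break // an unlabeled break leaves only the innermost loop
			}

			n++
		}
	}

	return n
}

func main() {
	pods := []pod{
		{Name: "static"}, // Containers is nil here
		{Name: "web", Containers: []container{{Name: "nginx"}}},
	}

	fmt.Println(findContainer(pods, "nginx"))
	fmt.Println(countBeforeStop([][]int{{1, 0, 2}, {3, 4}}, 0))
}
```

The pod with a nil `Containers` slice is simply skipped, and the unlabeled `break` ends only the inner loop, so the second row is still processed.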

* src/prometheus: small styling improvements

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/prometheus: improve error messages

So it's clear which is the label name and which is the metric name.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
Co-authored-by: Roberto Santalla <roobre@roobre.es>

* src/metric/definition.go: improve error messages in toUtilization()

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/definition: small styling and formatting improvements

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>

* src/kubelet/metric: improvements to GetMetricsData method

It now returns a pointer to a summary, which should save some memory
copying and is the more standard approach in Go for functions which may
return an error.

Additionally, to make use of the structure pointer, the
GroupStatsSummary() method is also adapted to take a pointer to the
summary.

There are also improved error messages in GetMetricsData(), which should
be more helpful while debugging.

Signed-off-by: Mateusz Gozdek <mgozdek@microsoft.com>
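
The pattern described above, returning a pointer together with an error, might look roughly like the sketch below. The lower-case `getMetricsData` and `groupStatsSummary` functions, their signatures and the `summary` type are simplified assumptions for illustration, not the actual code in src/kubelet/metric.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// podStats and summary are hypothetical, trimmed-down stand-ins for the
// kubelet summary payload.
type podStats struct {
	Namespace string `json:"namespace"`
	Name      string `json:"name"`
}

type summary struct {
	Pods []podStats `json:"pods"`
}

// getMetricsData decodes the payload returned by fetch. On failure it
// returns a nil pointer and an annotated error.
func getMetricsData(fetch func() ([]byte, error)) (*summary, error) {
	body, err := fetch()
	if err != nil {
		return nil, fmt.Errorf("fetching summary: %w", err)
	}

	s := &summary{}
	if err := json.Unmarshal(body, s); err != nil {
		return nil, fmt.Errorf("decoding summary: %w", err)
	}

	return s, nil
}

// groupStatsSummary takes the pointer as-is, so the summary is not copied.
func groupStatsSummary(s *summary) map[string][]string {
	groups := map[string][]string{}

	for _, p := range s.Pods {
		groups[p.Namespace] = append(groups[p.Namespace], p.Name)
	}

	return groups
}

func main() {
	fetch := func() ([]byte, error) {
		return []byte(`{"pods":[{"namespace":"default","name":"sh"},{"namespace":"kube-system","name":"etcd-minikube"}]}`), nil
	}

	s, err := getMetricsData(fetch)
	if err != nil {
		fmt.Println("error:", err)

		return
	}

	fmt.Println(groupStatsSummary(s))
}
```

Returning `nil` on failure avoids handing callers a zero-value struct that looks valid, and passing the pointer on avoids copying a potentially large summary.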

Co-authored-by: Roberto Santalla <roobre@roobre.es>
davidgit closed this as completed Aug 3, 2023