### Describe the bug
We are using kube-prometheus-stack version 61.3.2 with the default Prometheus rules for all of its components. Since we use https://github.com/cloudflare/pint to lint the rules and identify missing metrics, we found that many of the default Prometheus rules have linting issues.
For example:

```
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterCrashlooping",owner="",problem="Template is using `job` label but the query removes it.",reporter="alerts/template",severity="bug"} 1
```

The flagged rule:
```yaml
- alert: AlertmanagerClusterCrashlooping
  annotations:
    description: '{{ $value | humanizePercentage }} of Alertmanager instances within
      the {{$labels.job}} cluster have restarted at least 5 times in the last 10m.'
    runbook_url: https://runbooks.prometheus-operator.dev/runbooks/alertmanager/alertmanagerclustercrashlooping
    summary: Half or more of the Alertmanager instances within the same cluster
      are crashlooping.
  expr: |-
    (
      count by (namespace,service,cluster) (
        changes(process_start_time_seconds{job="prometheus-stack-alertmanager",namespace="namespace1"}[10m]) > 4
      )
    /
      count by (namespace,service,cluster) (
        up{job="prometheus-stack-alertmanager",namespace="namespace1"}
      )
    )
    >= 0.5
```
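pint's complaint here is that the description template references `{{$labels.job}}`, while `count by (namespace,service,cluster)` aggregates the `job` label away. One possible fix, shown only as a sketch (this is not the upstream patch), is to carry `job` through the aggregation so the template still resolves:

```promql
# Sketch of a possible fix: keep `job` in the grouping so the
# alert template's {{ $labels.job }} still has a value.
(
  count by (namespace, service, cluster, job) (
    changes(process_start_time_seconds{job="prometheus-stack-alertmanager",namespace="namespace1"}[10m]) > 4
  )
/
  count by (namespace, service, cluster, job) (
    up{job="prometheus-stack-alertmanager",namespace="namespace1"}
  )
)
>= 0.5
```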
Other problems reported by pint include:

```
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-etcd-test-f4c8-4208-a6b8-57da78332911.yaml",kind="alerting",name="etcdHighNumberOfLeaderChanges",owner="",problem="Template is using `job` label but `absent()` is not passing it.",reporter="alerts/template",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterCrashlooping",owner="",problem="`prom` Prometheus server at http://localhost:9090 has `process_start_time_seconds` metric with `job` label but there are no series matching `{job=\"prometheus-stack-alertmanager\"}` in the last 1w.",reporter="promql/series",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterDown",owner="",problem="Template is using `job` label but the query removes it.",reporter="alerts/template",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterFailedToSendAlerts",owner="",problem="Template is using `job` label but the query removes it.",reporter="alerts/template",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterFailedToSendAlerts",owner="",problem="Unnecessary wildcard regexp, simply use `alertmanager_notifications_failed_total{job=\"prometheus-stack-alertmanager\", namespace=\"core-stack\", integration=\"\"}` if you want to match on all time series for `alertmanager_notifications_failed_total` without the `integration` label.",reporter="promql/regexp",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterFailedToSendAlerts",owner="",problem="Unnecessary wildcard regexp, simply use `alertmanager_notifications_failed_total{job=\"prometheus-stack-alertmanager\", namespace=\"core-stack\"}` if you want to match on all `integration` values.",reporter="promql/regexp",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterFailedToSendAlerts",owner="",problem="Unnecessary wildcard regexp, simply use `alertmanager_notifications_total{job=\"prometheus-stack-alertmanager\", namespace=\"core-stack\", integration=\"\"}` if you want to match on all time series for `alertmanager_notifications_total` without the `integration` label.",reporter="promql/regexp",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerClusterFailedToSendAlerts",owner="",problem="Unnecessary wildcard regexp, simply use `alertmanager_notifications_total{job=\"prometheus-stack-alertmanager\", namespace=\"core-stack\"}` if you want to match on all `integration` values.",reporter="promql/regexp",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-alertmanager.rules-8989ea89-f65c-452c-9fd8-f8269af0e2f7.yaml",kind="alerting",name="AlertmanagerConfigInconsistent",owner="",problem="Template is using `job` label but the query removes it.",reporter="alerts/template",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-node-exporter.rules-4a1ae679-a8b5-4bbf-9bfd-ed4f42728e9f.yaml",kind="recording",name="instance:node_load1_per_cpu:ratio",owner="",problem="This query will never return anything on `prom` Prometheus server at http://localhost:9090 because results from the right and the left hand side have different labels: `[container, endpoint, instance, job, namespace, node, pod, service]` != `[container, endpoint, instance, job, namespace, node, pod, receiver_opsgenie_admins, receiver_slack_cluster, service]`. Failing query: `node_load1{job=\"node-exporter\"} / instance:node_num_cpu:sum{job=\"node-exporter\"}`.",reporter="promql/vector_matching",severity="bug"} 1
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-kube-prometheus-node-recording.rules-7235730b-029a-4598-9d86-9729c424a8e2.yaml",kind="recording",name="cluster:node_cpu:ratio",owner="",problem="This query will never return anything on `prom` Prometheus server at http://localhost:9090 because results from the right and the left hand side have different labels: `[receiver_opsgenie_admins, receiver_slack_cluster]` != `[]`. Failing query: `cluster:node_cpu:sum_rate5m / count(sum by (instance, cpu) (node_cpu_seconds_total))`.",reporter="promql/vector_matching",severity="bug"} 1
```
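The `promql/regexp` findings stem from selectors that use a `.*` regexp matcher: in PromQL, `=~".*"` also matches series where the label is absent, so the matcher filters nothing. A sketch of the pattern pint is flagging (the exact original selector is a reconstruction, not taken from the chart source):

```promql
# Hypothetical reconstruction of the flagged selector: a `.*` regexp
# matches every value, including an absent label, so it is a no-op.
alertmanager_notifications_failed_total{job="prometheus-stack-alertmanager", integration=~".*"}

# Equivalent selector without the redundant matcher, as pint suggests:
alertmanager_notifications_failed_total{job="prometheus-stack-alertmanager"}
```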
Almost all alerts under `https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/kubernetes-apps.yaml` are affected as well, for example:
```
pint_problem{filename="/etc/prometheus/rules/prometheus-prometheus-stack-prometheus-rulefiles-0/core-stack-prometheus-stack-kubernetes-apps-e679ee64-11ae-433d-820f-b5221857004e.yaml",kind="alerting",name="KubeStatefulSetUpdateNotRolledOut",owner="",problem="Unnecessary wildcard regexp, simply use `kube_statefulset_replicas{job=\"kube-state-metrics\"}` if you want to match on all `namespace` values.",reporter="promql/regexp",severity="bug"} 1
```
All of the alerts above have lint issues.
### What you expected to happen?
The default Prometheus rules need to be adjusted to get rid of the linting errors.
### How to reproduce it?
Run pint as a sidecar to Prometheus to see the linting problems reported as `pint_problem` metrics.
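A minimal pint setup for reproducing this is sketched below. The server name, URI, and rule path are assumptions; point them at your in-cluster Prometheus and mounted rule files, and note that the exact CLI arguments can vary between pint versions:

```hcl
# .pint.hcl — minimal pint configuration (sketch; server name and URI
# are assumptions, adjust them to your cluster).
prometheus "prom" {
  uri     = "http://localhost:9090"
  timeout = "60s"
}

# One-off check of the mounted rule files:
#   pint lint /etc/prometheus/rules/
# Continuous sidecar mode, which exposes pint_problem metrics like the
# ones quoted above:
#   pint watch /etc/prometheus/rules/
```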
### Enter the changed values of values.yaml?
_No response_
### Enter the command that you execute and failing/misfunctioning.
Not applicable; this is not tied to a specific command. The default Prometheus rules shipped with the chart need to be adjusted.
### Anything else we need to know?
_No response_
### What's your helm version?
61.3.2

### What's your kubectl version?
1.28.11

### Which chart?
https://github.com/prometheus-community/helm-charts/edit/kube-prometheus-stack-61.3.2/

### What's the chart version?
61.3.2

### What happened?
The default Prometheus rules listed above have linting issues.