
[kube-prometheus-stack] additionalRuleLabels shouldn't apply to recording rules #3396

Closed
defenestration opened this issue May 17, 2023 · 21 comments · Fixed by #3400
Labels
bug Something isn't working

Comments

@defenestration

Describe the bug

It looks like the recent #3351, released in kube-prometheus-stack-45.28.1, applies additionalRuleLabels to recording rules, which is a new change. We have an additionalRuleLabels label defined to help us route alert rules based on a namespace annotation (see #1231 (comment) for background on that).

In our helm chart we use something like:

additionalRuleLabels: 
  tsc_owner: '{{ with printf `kube_namespace_annotations{namespace="%s"}` .Labels.namespace | query }}{{ with (. | first).Labels.annotation_tsc_owner }}{{ . }}{{else}}none{{end}}{{else}}none{{end}}'

With the recent release, this label also gets applied to recording rules.
This caused an error for us: the built-in alert PrometheusRuleFailures fires, and the http://prometheus/rules page shows an error for this recording rule in kube-apiserver-availability.rules:

record: code_verb:apiserver_request_total:increase30d
expr: avg_over_time(code_verb:apiserver_request_total:increase1h[30d]) * 24 * 30

ERR: vector contains metrics with the same labelset after applying rule labels

It might be good to allow different additionalRuleLabels for alerts and recording rules, e.g. by adding separate additionalAlertRuleLabels and additionalRecordingRuleLabels parameters to the Helm chart.
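A hypothetical values.yaml shape for that split — the parameter names additionalAlertRuleLabels and additionalRecordingRuleLabels are only a suggestion here, not options the chart actually exposes:

```yaml
defaultRules:
  # Applied to both alerting and recording rules (current behavior)
  additionalRuleLabels:
    cluster: prod-east
  # Hypothetical: applied only to alerting rules, so templated values
  # that are evaluated per-alert stay safe
  additionalAlertRuleLabels:
    tsc_owner: '{{ with printf `kube_namespace_annotations{namespace="%s"}` .Labels.namespace | query }}{{ with (. | first).Labels.annotation_tsc_owner }}{{ . }}{{else}}none{{end}}{{else}}none{{end}}'
  # Hypothetical: applied only to recording rules
  additionalRecordingRuleLabels: {}
```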

What's your helm version?

version.BuildInfo{Version:"v3.11.2", GitCommit:"912ebc1cd10d38d340f048efaf0abda047c3468e", GitTreeState:"clean", GoVersion:"go1.18.10"}

What's your kubectl version?

1.24

Which chart?

kube-prometheus-stack

What's the chart version?

45.28.1

What happened?

The built-in alert PrometheusRuleFailures fires, and the http://prometheus/rules page shows an error for this recording rule in kube-apiserver-availability.rules:

record: code_verb:apiserver_request_total:increase30d
expr: avg_over_time(code_verb:apiserver_request_total:increase1h[30d]) * 24 * 30

ERR: vector contains metrics with the same labelset after applying rule labels

What you expected to happen?

no error

How to reproduce it?

Add a similar additionalRuleLabels:

defaultRules:
  additionalRuleLabels: 
    tsc_owner: '{{ with printf `kube_namespace_annotations{namespace="%s"}` .Labels.namespace | query }}{{ with (. | first).Labels.annotation_tsc_owner }}{{ . }}{{else}}none{{end}}{{else}}none{{end}}'

This renders correctly on alert rules.

The metric code_verb:apiserver_request_total:increase1h also contains the label template text verbatim, which is undesired.

Enter the changed values of values.yaml?

None from previous version of chart.

Enter the command that you execute and failing/misfunctioning.

N/A - Flux auto-upgrades to the latest version of the Helm chart.

Anything else we need to know?

No response

@defenestration defenestration added the bug Something isn't working label May 17, 2023
@defenestration
Author

I rolled back to 45.28.0 and the issue stopped, so it was definitely caused by 45.28.1.

@defenestration defenestration changed the title [kube-prometheus-stack] additionalRuleLabels shoudln't apply to recording rules [kube-prometheus-stack] additionalRuleLabels shouldn't apply to recording rules May 17, 2023
@defenestration
Author

I see the original issue #3340 behind the problem PR suggests what I think is the desired solution: separate additionalRuleLabels for alerting and recording rules. additionalRuleLabels could probably still apply to both, as long as we can also target alerting and recording rules separately.

@buroa

buroa commented May 17, 2023

There is something wrong with this version which I needed to roll back as well. Trying to install the Helm release gives errors like this:

failed to diff release against cluster resources: [PrometheusRule/monitoring/kube-prometheus-stack-k8s.rules dry-run failed, reason: Invalid, error: PrometheusRule.monitoring.coreos.com "kube-prometheus-stack-k8s.rules" is invalid: [spec.groups[0].rules[0].labels: Invalid value: "null"

@buroa

buroa commented May 18, 2023

@scott-grimes Any idea?

@scott-grimes
Contributor

Looks like we should split this as @defenestration suggested: keep additionalRuleLabels applying to both, and add two config options to target labels specifically for alerting and recording rules separately.

@buroa

buroa commented May 18, 2023

looks like we should split this like @defenestration has suggested, allow additionalRuleLabels to be applied to both, add an additional two config options to target labels specifically for alerts and recording rules separately

The biggest problem this PR introduced is that the labels key is now required, even if additionalRuleLabels is not specified. To get around this bug, I had to do something like buroa/k8s-gitops@ede393d just to get it to deploy correctly.
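A minimal sketch of the kind of conditional rendering the fix needs — this is an illustrative Helm template snippet, not the chart's actual code:

```yaml
# Illustrative snippet: only emit the labels key when there is something
# to put in it, so an empty map never renders as "labels: null".
{{- if .Values.defaultRules.additionalRuleLabels }}
      labels:
{{ toYaml .Values.defaultRules.additionalRuleLabels | indent 8 }}
{{- end }}
```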

@scott-grimes
Contributor

scott-grimes commented May 18, 2023

@buroa Submitted a PR to fix the issue of labels: always being required. The PR above checks whether there are any labels to apply and adds the labels: key only when necessary.

@buroa

buroa commented May 18, 2023

@scott-grimes Thanks! Looks great!

@Jellyfrog
Contributor

The PR that closes this doesn't fix the issue from the title, though ("additionalRuleLabels shouldn't apply to recording rules").
This should be reopened.

@darkobas

yes, the issue still persists

@defenestration
Author

Agreed, @buroa's issue was fixed, but the original issue still exists. I'm not sure I can reopen this issue myself, though.

@defenestration
Author

defenestration commented May 25, 2023

Looking at a diff between 45.28.0 and the current 46.4.1, I see additionalRuleGroupLabels was added. For my issue I can probably work with that, but I'll have to figure out which groups contain recording rules and which don't.
It could be trouble if recording and alerting rules are in the same group.

https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml#L79

Current plan is to look at the files changed in the initial pr for 45.28.1 and not add my relabel rules to those groups.

@defenestration
Author

So far the rule groups do divide cleanly between alerting and recording rules. Here are the current additionalRuleGroupLabels groups that contain only recording rules:

k8s
kubeApiserverAvailability
kubeApiserverBurnrate
kubeApiserverHistogram
kubelet
kubePrometheusGeneral
kubePrometheusNodeRecording
kubeSchedulerRecording
node
nodeExporterRecording

So I just added my label rule to every group but those:

      additionalRuleGroupLabels:
        alertmanager:
          owner: {{ stuff }}
        ...

So it seems the issue can stay closed after all. It's not as fine-grained a control as initially imagined, but so be it.
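A fuller sketch of that workaround, assuming the recording-only group list above — the label name owner and its value are placeholders:

```yaml
defaultRules:
  additionalRuleGroupLabels:
    # Groups containing alerting rules get the routing label
    alertmanager:
      owner: infra
    general:
      owner: infra
    # Recording-only groups are left empty to avoid
    # "same labelset after applying rule labels" errors
    k8s: {}
    kubeApiserverAvailability: {}
    kubePrometheusNodeRecording: {}
```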

@dmitriy-lukyanchikov

I believe it shouldn't work like that, because the docs say the label applies only to alerts, but it doesn't. And you can't define a label for just all alerts. It's a pretty annoying issue that breaks previous functionality; I don't think this should be closed until someone fixes either the docs or the broken functionality.

@speer
Copy link

speer commented Jun 12, 2023

I believe it shouldn't work like that, because the docs say the label applies only to alerts, but it doesn't. And you can't define a label for just all alerts. It's a pretty annoying issue that breaks previous functionality; I don't think this should be closed until someone fixes either the docs or the broken functionality.

I agree! We are hitting the same issue after upgrading.

@nickyfoster

Please reopen as we're still experiencing the same issue.

@zen

zen commented Jan 18, 2024

We are on the most recent chart version (55.11.0) and also hitting the same issue. We are using a very simple config:

defaultRules:
  additionalRuleLabels:
    team: devops

@sbrodriguez

sbrodriguez commented Jan 26, 2024

Same problem in chart version 56.1.0. We have to label groups individually, excluding kubeApiserverAvailability.

defaultRules:
  additionalRuleGroupLabels:
...
    k8sPodOwner:
      group_dest: infra
    kubeApiserverAvailability: {}
    kubeApiserverBurnrate:
      group_dest: infra
    kubeApiserverHistogram:
...

If we add the label there, or in the global additionalRuleLabels config, we always get this error on that rule:

Vector contains metrics with the same labelset after applying rule labels.

@sfrolich

sfrolich commented Feb 7, 2024

Same here on 56.6.2

@Aaron-ML

Still seeing this even on 57.0.2. Does not work as documented.

@ashuec90

We are also seeing this error in chart version 62.7.0, for kube-apiserver-availability.rules.
