Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[kube-prometheus-stack] Add downward compat for Prom CRD #4818

Merged

Conversation

schnatterer
Copy link
Contributor

@schnatterer schnatterer commented Aug 26, 2024

What this PR does / why we need it

Running the latest prom-stack versions on legacy OpenShift clusters with no influence on the preinstalled CRDs results in errors such as this:

failed to create typed patch object (..): .spec.scrapeConfigNamespaceSelector: field not declared in schema

This PR provides a workaround using this values.yaml:

prometheus:
  prometheusSpec:
    scrapeConfigNamespaceSelector: null
    scrapeConfigSelectorNilUsesHelmValues: null
    scrapeConfigSelector: null

Yes, this is no ideal solution, but that seems to be what the enterprise-world requires 😐️

At runtime, the operator yields some warnings about the missing fields, but apparently everything else works.

Special notes for your reviewer

Checklist

  • DCO signed
  • Chart Version bumped
  • Title of the PR starts with chart name (e.g. [prometheus-couchdb-exporter])

… CRD fields

Running the latest prom-stack versions on legacy OpenShift clusters with no influence on the preinstalled CRDs results in errors such as this:

failed to create typed patch object (..): .spec.scrapeConfigNamespaceSelector: field not declared in schema

This patch provides a workaround using this values.yaml:

prometheus:
  prometheusSpec:
    scrapeConfigNamespaceSelector: null
    scrapeConfigSelectorNilUsesHelmValues: null
    scrapeConfigSelector: null
Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
@schnatterer schnatterer force-pushed the feature/downward-compat-prom-crd branch from 25a3828 to 7157047 Compare August 28, 2024 09:17
@jkroepke
Copy link
Member

jkroepke commented Aug 28, 2024

Hi @schnatterer

in general, I understand the use-case and I'm happy to assist here.

However, I also like constancy. In conclusion, all selectors should gain this capability.

@GMartinez-Sisti What did you think here? The selector having some hacks here. If you would ask me, I would favorite a cleanup.

Instead:

{{- if .Values.prometheus.prometheusSpec.scrapeConfigSelector }}
  scrapeConfigSelector:
{{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector | indent 4) . }}
{{ else if .Values.prometheus.prometheusSpec.scrapeConfigSelectorNilUsesHelmValues  }}
  scrapeConfigSelector:
    matchLabels:
      release: {{ $.Release.Name | quote }}
{{ else if not (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector nil) }}
  scrapeConfigSelector: {}
{{- end }}

I would replace the whole logic to:

{{- if not (or (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector nil) (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector "")) }}
  scrapeConfigSelector: {{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector | indent 4) . }}
{{- end }}

and I would like to move the default value to the helm values file, since we support tpl.

  scrapeConfigSelector:
    matchLabels:
      release: '{{ $.Release.Name | quote }}'

In conclusion SelectorNilUsesHelmValues could be removed and the template might be less complex.

If scrapeConfigSelector is set to {}, it will be propagated to the template.

Such similar config can be applied to the following selectors:

  • ruleSelector
  • ruleNamespaceSelector
  • scrapeConfigSelector
  • scrapeConfigNamespaceSelector
  • probeSelector
  • probeNamespaceSelector
  • serviceMonitorSelector
  • serviceMonitorNamespaceSelector
  • podMonitorSelector
  • podMonitorNamespaceSelector

@GMartinez-Sisti
Copy link
Member

@jkroepke I like the idea, it does make it cleaner!

Regarding:

SelectorNilUsesHelmValues could be removed

We need to test if this is considered a breaking change.

@jkroepke
Copy link
Member

It must be considered as breaking change.

If end user set SelectorNilUsesHelmValues and doesn't change the default value, while we changing the default at values.yaml, there result would be a breaking change.

However, I feel thats fine by bumping major version + documentation.

…n in prometheus-community#4818

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
@schnatterer
Copy link
Contributor Author

schnatterer commented Aug 28, 2024

Hi @jkroepke and @GMartinez-Sisti,
Thanks for picking up on the PR.
I'm also a fan of consistency and happy to improve the chart as a whole.

SelectorNilUsesHelmValues always has been a bit confusing to me, so getting rid of it would be nice.

I tried your suggestion and it seems to fix my problem.

The default values fail, though.

$ helm template .
Error: template: kube-prometheus-stack/templates/prometheus/prometheus.yaml:249:80: executing "kube-prometheus-stack/templates/prometheus/prometheus.yaml" at <eq .Values.prometheus.prometheusSpec.scrapeConfigSelector "">: error calling eq: incompatible types for comparison

This can be fixed by dropping the string comparison

{{- if not (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector nil)}}
# instead of
{{- if not (or (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector nil) (eq .Values.prometheus.prometheusSpec.scrapeConfigSelector "")) }}

Next is this:

$ helm template .
Error: YAML parse error on kube-prometheus-stack/templates/prometheus/prometheus.yaml: error converting YAML to JSON: yaml: line 68: mapping values are not allowed in this context

Which could be fixed like so

  scrapeConfigSelector:
      {{- tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector | indent 4) . | nindent 2 }}
# instead of
  scrapeConfigSelector: {{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector | indent 4) . }}

Now templating succeeds but we have redundant quotes:

$ helm template .   |  grep scrapeConfig -A2
  scrapeConfigSelector:
      matchLabels:
        release: '"release-name"'
        
$  helm template prometheus-community/kube-prometheus-stack --version 58.2.1 | grep scrapeConfig -A2
  scrapeConfigSelector:
    matchLabels:
      release: "release-name"

This could be fixed by removing the quote function from the helm-values.

    scrapeConfigSelector:
      matchLabels:
        release: "{{ $.Release.Name }}"

Then, however, the templated YAML contains single quotes instead of the double quotes it contained before. Would that matter?

 helm template .   |  grep scrapeConfig -A2
  scrapeConfigSelector:
      matchLabels:
        release: 'release-name'

To make sure we're all on the same boat, I pushed the current state of the POC, see 739fe4c . Please have a look at it and tell me what you think. If we're all happy with the concept, I can implement the other selectors.

I also would consider this a breaking change and see the downsides of it.

What would that mean for me as as contributor? Would you accept a PR by me that document the change and bump the major version?
Or would a change like that delay the delivery of my fix?

@schnatterer
Copy link
Contributor Author

schnatterer commented Aug 28, 2024

Ah, I missed the test cases for setting other options, which do not seem to work as expected 😬

$ helm template . --values - <<EOF | grep 'scrapeConfigSelector' -A2
prometheus:
  prometheusSpec:
    scrapeConfigSelector: {}
EOF

  scrapeConfigSelector:
      matchLabels:
        release: 'release-name'


$  helm template . --values - <<EOF | grep 'scrapeConfigSelector' -A3
prometheus:
  prometheusSpec:
    scrapeConfigSelector:
       a: b
EOF

  scrapeConfigSelector:
      a: b
      matchLabels:
        release: 'release-name'

I'm a bit a loss here with my templating skills.

@schnatterer
Copy link
Contributor Author

As an alternative, could we agree on a less invasive fix, as originally proposed, but for all selectors?

This way we would at least have consistency, and no breaking change.

WDYT @jkroepke ?

@jkroepke
Copy link
Member

jkroepke commented Aug 28, 2024

Coud you try to implement the nil check based on Masterminds/sprig#53 (comment)

Edit: I see, in helm it not possible to clear a map.


After a quick test, a possible implementation could be this:

Template:

spec:
{{- if not (kindIs "invalid" .Values.prometheus.prometheusSpec.scrapeConfigSelector) }}
  scrapeConfigSelector: {{ tpl (toYaml .Values.prometheus.prometheusSpec.scrapeConfigSelector) . | nindent 4 }}
{{- end }}

Values:

prometheus:
  prometheusSpec:
    scrapeConfigSelector:
      matchLabels:
        release: "{{ $.Release.Name }}"

Case 1: Default Values

Results into

spec:
  scrapeConfigSelector: 
    matchLabels:
      release: 'release-name'

Case 2: Override with nil

$ helm template . --set 'prometheus.prometheusSpec.scrapeConfigSelector=null'
$ helm template . --values - <<EOF
prometheus:
  prometheusSpec:
    scrapeConfigSelector: ~
EOF

Results into

spec:

Case 2: Empty array.

For prometheus operator, the value nil and the value empty array are different and have different behaviors.

Ref: https://github.com/prometheus-operator/prometheus-operator/blob/e291bd7aec65d0867977fdd2eb464e9db800b667/pkg/apis/monitoring/v1/prometheus_types.go#L127C1-L128C67

Due helm limitations, override with an empty array will be ignored. The only known helm native solution would be overriding the subkey with an nil value.

$ helm template . --set 'prometheus.prometheusSpec.scrapeConfigSelector.matchLabels=null'
$ helm template . --values - <<EOF
prometheus:
  prometheusSpec:
    scrapeConfigSelector: 
      matchLabels: ~
EOF

Results into

spec:
  scrapeConfigSelector: 
    {}

…stion in prometheus-community#4818

prometheus-community#4818 (comment)
Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
@schnatterer
Copy link
Contributor Author

Interesting solution, @jkroepke 💡

Let me sum up the breaking change:

If you used to set scrapeConfigSelectorNilUsesHelmValues to false.
You now have to set scrapeConfigSelector.matchLabels=null.

In other words:

#  What used to be:
$ helm template prometheus-community/kube-prometheus-stack --version 62.3.0 --values - <<EOF | grep scrapeConfigSelector
prometheus:
  prometheusSpec:
    scrapeConfigSelectorNilUsesHelmValues: false
EOF
  scrapeConfigSelector: {}

# Is now:
helm template . --values - <<EOF | grep 'scrapeConfigSelector' -A2
prometheus:
  prometheusSpec:
    scrapeConfigSelector:
      matchLabels: null
EOF
  scrapeConfigSelector:
    {}

And we can live with the additional line break after scrapeConfigSelector here 👆️, right?

We can also live with the single quotes in the default values 👇️, right?

# What used to be:
$ helm template prometheus-community/kube-prometheus-stack --version 62.3.0 | grep scrapeConfigSelector -A3
  scrapeConfigSelector:
    matchLabels:
      release: "release-name"
# Is now:
$  helm template .  | grep 'scrapeConfigSelector' -A2
  scrapeConfigSelector:
    matchLabels:
      release: 'release-name'

The same is true for all *NilUsesHelmValues of the selectors you mentioned above, right?

Plus, I can now set the selector to null to remove it altogether for downward compatibility (my original use case).

So what I will do is

  • implement this for all selectors mentioned above
  • Remove all *NilUsesHelmValues from values.yaml
  • Adapt the comment over the selectors to reflect matchLabels: null as new NilUsesHelmValues, and mention the option of setting it to null (please review this in my suggestion in b83ccf1)
  • Add a section in the README ### From 62.x to 63.x that describes the breaking change
  • bump the major version

And after some polishing during review you will merge the PR?

I'm asking because the number of changes will make this quite an invest for me, and I want to make sure we all agree on this path, so my time is not wasted.

@jkroepke
Copy link
Member

jkroepke commented Aug 29, 2024

And we can live with the additional line break after scrapeConfigSelector here 👆️, right?

It is at least valid YAML. That what i checked. If it causes problems, I have already an idea to fix that.

We can also live with the single quotes in the default values 👇️, right?

Kubernetes don't care if single or double quotes are used. The YAML document will be convered to an internal structure anyways. That why kubectl get yaml looks always different than the input. If you work with helm template --release-name foo, the quotes should be omit anyways.

I'm asking because the number of changes will make this quite an invest for me, and I want to make sure we all agree on this path, so my time is not wasted.

I would agree here, however I also may a different opinion. If there is an breaking change which simplifies the maintenance, I would go of it. We are not writing enterprise software.

That why I feel important to hear an different opinion an other maintainer, e.g. @GMartinez-Sisti or @QuentinBisson

implement this for all selectors mentioned above

Keep in mind, the selectors exists for Alertmanager and Thanos Ruler as well. I know, its ton of work here.

And after some polishing during review you will merge the PR?

Yes. I got already feedback quickly here. But hold the work until we have an different opinion.

@schnatterer
Copy link
Contributor Author

@GMartinez-Sisti @QuentinBisson Could you state your opinion, if we should go forward with the breaking changes? This comment sums up what we plan, and jkroepke states his opinion in this comment.

If you have doubts about a breaking change, we could go with my original approach in 7157047, which is minimally invasive but harder to maintain. I'd still be happy with this approach, because I only suggested a small fix and would be happy to get it faster and with less effort 🙂

@QuentinBisson
Copy link
Member

I'm fine with the breaking change and this is most likely a better way forward :)

@QuentinBisson
Copy link
Member

Thanks for the great work @schnatterer and @jkroepke 🚀

schnatterer and others added 3 commits September 10, 2024 13:43
Was removed by accident.

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
…-community#4818

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
@schnatterer
Copy link
Contributor Author

Thanks for your review, @jkroepke. Good catches. Resolved in 39d577a.
Should I go ahead with squashing or do we want another approval first?

@jkroepke
Copy link
Member

Based on your doc changes, I remember that we have only prometheus in scope. We can also this logic to AM and Ruler.

  • {{- if .Values.alertmanager.alertmanagerSpec.alertmanagerConfigSelector }}
    alertmanagerConfigSelector:
    {{ tpl (toYaml .Values.alertmanager.alertmanagerSpec.alertmanagerConfigSelector | indent 4) . }}
    {{ else }}
    alertmanagerConfigSelector: {}
    {{- end }}
    {{- if .Values.alertmanager.alertmanagerSpec.alertmanagerConfigNamespaceSelector }}
    alertmanagerConfigNamespaceSelector:
    {{ tpl (toYaml .Values.alertmanager.alertmanagerSpec.alertmanagerConfigNamespaceSelector | indent 4) . }}
    {{ else }}
    alertmanagerConfigNamespaceSelector: {}
    {{- end }}
  • {{- if .Values.thanosRuler.thanosRulerSpec.ruleNamespaceSelector }}
    ruleNamespaceSelector:
    {{ tpl (toYaml .Values.thanosRuler.thanosRulerSpec.ruleNamespaceSelector | indent 4) . }}
    {{ else }}
    ruleNamespaceSelector: {}
    {{- end }}
    {{- if .Values.thanosRuler.thanosRulerSpec.ruleSelector }}
    ruleSelector:
    {{ tpl (toYaml .Values.thanosRuler.thanosRulerSpec.ruleSelector | indent 4) .}}
    {{- else if .Values.thanosRuler.thanosRulerSpec.ruleSelectorNilUsesHelmValues }}
    ruleSelector:
    matchLabels:
    release: {{ $.Release.Name | quote }}
    {{ else }}
    ruleSelector: {}
    {{- end }}

I can do this on evening, then it should be complete.

I feel confident enough to approve and merge it by myself. Just for concepts, I also like to hear a second opinion.

jkroepke and others added 5 commits September 23, 2024 22:40
Signed-off-by: Jan-Otto Kröpke <joe@cloudeteer.de>
Signed-off-by: Jan-Otto Kröpke <github@jkroepke.de>
As found by linter

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
As found by linter

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
As found by next linter

Signed-off-by: Johannes Schnatterer <johannes.schnatterer@cloudogu.com>
@schnatterer
Copy link
Contributor Author

Wow, that process of linting is dragging on. For me as a contributor it would make things a lot faster, if the issue template would contain a pointer to a doc that tells me how to run the different linters locally, e.g. using docker.

@jkroepke
Copy link
Member

Wow, that process of linting is dragging on. For me as a contributor it would make things a lot faster, if the issue template would contain a pointer to a doc that tells me how to run the different linters locally, e.g. using docker.

Good point, could you create an issue for that?

Copy link
Member

@jkroepke jkroepke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM to me. Tested locally.

@jkroepke jkroepke merged commit 25f32d6 into prometheus-community:main Sep 26, 2024
4 checks passed
@jkroepke
Copy link
Member

@schnatterer Thanks for you contribution!

@QuentinBisson
Copy link
Member

Yes thank you both for the great work

@schnatterer
Copy link
Contributor Author

🎉 Thanks for accepting my proposal and for the many discussions @jkroepke, @GMartinez-Sisti and @QuentinBisson !

Great result!

rouke-broersma referenced this pull request in broersma-forslund/homelab Sep 26, 2024
This PR contains the following updates:

| Package | Update | Change |
|---|---|---|
|
[kube-prometheus-stack](https://redirect.github.com/prometheus-operator/kube-prometheus)
([source](https://redirect.github.com/prometheus-community/helm-charts))
| major | `62.7.0` -> `63.0.0` |

---

### Release Notes

<details>
<summary>prometheus-community/helm-charts
(kube-prometheus-stack)</summary>

###
[`v63.0.0`](https://redirect.github.com/prometheus-community/helm-charts/releases/tag/kube-prometheus-stack-63.0.0)

[Compare
Source](https://redirect.github.com/prometheus-community/helm-charts/compare/kube-prometheus-stack-62.7.0...kube-prometheus-stack-63.0.0)

kube-prometheus-stack collects Kubernetes manifests, Grafana dashboards,
and Prometheus rules combined with documentation and scripts to provide
easy to operate end-to-end Kubernetes cluster monitoring with Prometheus
using the Prometheus Operator.

#### What's Changed

- \[kube-prometheus-stack] Add downward compat for Prom CRD by
[@&#8203;schnatterer](https://redirect.github.com/schnatterer) in
[https://github.com/prometheus-community/helm-charts/pull/4818](https://redirect.github.com/prometheus-community/helm-charts/pull/4818)

#### New Contributors

- [@&#8203;schnatterer](https://redirect.github.com/schnatterer) made
their first contribution in
[https://github.com/prometheus-community/helm-charts/pull/4818](https://redirect.github.com/prometheus-community/helm-charts/pull/4818)

**Full Changelog**:
prometheus-community/helm-charts@prometheus-conntrack-stats-exporter-0.5.11...kube-prometheus-stack-63.0.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you
are satisfied.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR was generated by [Mend Renovate](https://mend.io/renovate/).
View the [repository job
log](https://developer.mend.io/github/broersma-forslund/homelab).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzOC44MC4wIiwidXBkYXRlZEluVmVyIjoiMzguODAuMCIsInRhcmdldEJyYW5jaCI6Im1haW4iLCJsYWJlbHMiOltdfQ==-->

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Rouke Broersma <mobrockers@gmail.com>
rbnis added a commit to rbnis/gitops that referenced this pull request Sep 26, 2024
RafalKorepta added a commit to redpanda-data/helm-charts that referenced this pull request Sep 27, 2024
For chart testing the prometheus operator started to require (from 63.0.0
version) label release that equal to `prometheus` in the default deployment done
in our CI.

The change from the upstream:
prometheus-community/helm-charts#4818
RafalKorepta added a commit to redpanda-data/helm-charts that referenced this pull request Sep 27, 2024
For chart testing the prometheus operator started to require (from 63.0.0
version) label release that equal to `prometheus` in the default deployment done
in our CI.

The change from the upstream:
prometheus-community/helm-charts#4818
@Pionerd
Copy link

Pionerd commented Sep 27, 2024

We use kube-prometheus-stack as a sub chart. We are unable to set the serviceMonitorSelector back to null due to helm/helm#12879

Any suggestions? For us this is a breaking change.

@jkroepke
Copy link
Member

Allowing false as alternative would be sufficient?

@Pionerd
Copy link

Pionerd commented Sep 27, 2024

Perfect

RafalKorepta added a commit to redpanda-data/helm-charts that referenced this pull request Sep 27, 2024
For chart testing the prometheus operator started to require (from 63.0.0
version) label release that equal to `prometheus` in the default deployment done
in our CI.

The change from the upstream:
prometheus-community/helm-charts#4818
@KyriosGN0
Copy link

@Pionerd @jkroepke i have the same issue, the fix is to use the old key or set matchLabels to be false ?

@jkroepke
Copy link
Member

There is no known fix yet, only a proposal.

@jkroepke
Copy link
Member

jkroepke commented Sep 28, 2024

Not sure, if this would work right now:

kube-prometheus-stack:
  prometheus:
    prometheusSpec:
      additionalConfig:
        serviceMonitorNamespaceSelector: ~

I feel the best solution would be convert all Selector values to simple string values. It seems like map values are too magic for modern helm.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants