
[kube-prometheus-stack] Allow specifying Additional Labels per Alert #3014

Open
Daniel-Vaz opened this issue Feb 9, 2023 · 6 comments
Labels
enhancement New feature or request lifecycle/stale

Comments

@Daniel-Vaz

Is your feature request related to a problem?

Issue:

Currently we are only able to add additional labels to the default rules using the defaultRules.additionalRuleLabels value. This value is applied to every single alert under helm-charts/charts/kube-prometheus-stack/templates/prometheus/rules-1.14/.

In our setup we rely heavily on an instance label that identifies the target in the alert receivers such as Slack, PagerDuty, etc.

Since each alert addresses different objects/targets, and we want to set the instance label to identify those targets, we cannot do this with a single common defaultRules.additionalRuleLabels value: the labels available naturally change depending on each alert's expression.

For example:

  • Setting instance: '{{ $labels.namespace }}/{{ $labels.deployment }}' will correctly identify alerts that carry the "namespace" and "deployment" labels, but alerts regarding, for example, CronJobs or StatefulSets would break and/or render an empty value.

Describe the solution you'd like.

We would need the possibility to set the additionalRuleLabels value per alert, so that we have fine-grained control over which labels are added to which alerts.

Something similar to the following values.yaml structure:

  defaultRules:
    create: true
    rules:
      alertmanager:
        AlertmanagerFailedReload:
          enabled: true
          additionalRuleLabels: {}
          additionalRuleAnnotations: {}
        AlertmanagerMembersInconsistent:
          enabled: true
          additionalRuleLabels: {}
          additionalRuleAnnotations: {}
# (etc...)

Describe alternatives you've considered.

Alternatively, we could disable all the default rules and manage them ourselves, but this would defeat the purpose of using kube-prometheus-stack: automatically getting community-approved patches, recommendations and updates.

Additional context.

We are currently migrating from a "legacy" Prometheus/Alertmanager/Grafana stack to kube-prometheus-stack, and this change to the Helm templates would really make a difference in that migration.

@Daniel-Vaz Daniel-Vaz added the enhancement New feature or request label Feb 9, 2023
@Daniel-Vaz
Author

@zeritti Can you help me with this?

I know how to change the values file and rule templates manually, but it seems that all the rule template files are generated from this script, and I'm horrible at Python.

I've been trying to change the add_custom_labels function so that it appends a custom rule_condition template snippet per alert, but I can't seem to make it work.

So far I reached this point:

import re

# condition_map (defined elsewhere in the generator script) maps a rule group
# name to its .Values.defaultRules.rules.<group> path.

def add_custom_labels(rules, indent=4):
    """Add if wrapper for additional per-alert rule labels"""
    rules_group = re.findall(r'(?<=name: ).*', rules)
    alerts_names = re.findall(r'(?<=- alert: ).*', rules)

    separator = " " * indent + "- alert:.*"
    alerts_positions = re.finditer(separator, rules)
    alert = -1
    for alert_position in alerts_positions:
        rule_condition = f'{{- if {condition_map[rules_group[0]]}.{alerts_names[alert]}.additionalRuleLabels }}\n{{ toYaml {condition_map[rules_group[0]]}.{alerts_names[alert]}.additionalRuleLabels | indent 8 }}\n{{- end }}'
        rule_condition_len = len(rule_condition) + 1

        # add rule_condition at the end of the previous alert block
        if alert >= 0:
            index = alert_position.start() + rule_condition_len * alert - 1
            rules = rules[:index] + "\n" + rule_condition + rules[index:]
        alert += 1

    # add rule_condition at the end of the last alert
    if alert >= 0:
        index = len(rules) - 1
        rules = rules[:index] + "\n" + rule_condition + rules[index:]
    return rules

But this always generates broken templates: alerts_positions = re.finditer(separator, rules) is evaluated only once, before the loop, so after the first insertion the character offsets returned by alert_position.start() no longer match the modified string. The rule_condition_len * alert compensation assumes every inserted snippet has the same length, but rule_condition_len differs per alert name, so the computed index is wrong for everything except the first and last iterations of the loop.
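One direction I have sketched but not fully tested (it still relies on the condition_map lookup defined elsewhere in the generator script, and on the same regexes as above) is to collect all the alert offsets up front and insert the snippets from the last alert backwards, so the earlier offsets stay valid:

import re

# condition_map is assumed to be the group-name -> .Values.defaultRules.rules.<group>
# lookup already defined in the generator script.

def add_custom_labels(rules, indent=4):
    """Sketch: add per-alert additionalRuleLabels wrappers, inserting back-to-front"""
    rules_group = re.findall(r'(?<=name: ).*', rules)
    alerts_names = re.findall(r'(?<=- alert: ).*', rules)

    separator = " " * indent + "- alert:.*"
    # Offsets are computed once against the unmodified string; inserting from
    # the last alert backwards keeps all earlier offsets valid.
    positions = [m.start() for m in re.finditer(separator, rules)]
    # The snippet for alert i goes just before the start of alert i + 1;
    # the snippet for the last alert goes at the very end of the group.
    insert_points = positions[1:] + [len(rules)]

    for i in reversed(range(len(alerts_names))):
        rule_condition = (
            f'{{- if {condition_map[rules_group[0]]}.{alerts_names[i]}.additionalRuleLabels }}\n'
            f'{{ toYaml {condition_map[rules_group[0]]}.{alerts_names[i]}.additionalRuleLabels | indent 8 }}\n'
            f'{{- end }}'
        )
        index = insert_points[i] - 1  # land on the newline that closes the previous line
        rules = rules[:index] + "\n" + rule_condition + rules[index:]
    return rules

This keeps the same per-alert rule_condition idea and just avoids doing offset arithmetic on a string that has already been modified.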

Any Python expert help here would be great.
Thank you again for all the awesome work!

@Daniel-Vaz
Author

Hello Community :D
I'm still a bit stuck on this; anyone willing to give me a hand with the Python script mentioned above would really be my hero!
Thank you in advance.

@stale

stale bot commented Apr 2, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.

@stale stale bot added the lifecycle/stale label Apr 2, 2023
@Daniel-Vaz
Author

remove stale

@stale stale bot removed the lifecycle/stale label Apr 3, 2023
@defenestration

You may want to look at #1231 (comment), which runs a query on the namespace hosting the service. I didn't realize you could do that either, but it solved a similar problem for me.

@stale

stale bot commented Jun 17, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Any further update will cause the issue/pull request to no longer be considered stale. Thank you for your contributions.
