feat: disallow any unknown fields to prometheus and add new flags to support loki, thanos or mimir #77

Merged
merged 3 commits into from
Jul 17, 2024
15 changes: 9 additions & 6 deletions CHANGELOG.md
@@ -7,16 +7,19 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0

## [Unreleased]

- Updated: Prometheus and other dependencies
- CI: Updated Github actions for golangcilint and goreleaser
- Fixed: :warning: **Unmarshalling of the rule files is strict again**; the lenient behavior was unintentionally introduced when adding support for YAML comments.
- Changed: :warning: **Renamed `hasValidPartialStrategy` to `hasValidPartialResponseStrategy`** to match the documentation, so this is effectively a fix
- Changed: :warning: **Disallow special rule file fields of Thanos, Mimir or Loki by default**
To enable them, you need to set some of the new flags described below
- Added: New flags `--support-thanos`, `--support-mimir`, `--support-loki` to enable special rule file fields of Thanos, Mimir or Loki
- Added: :tada: **Support for validation of Loki rules!** Now you can validate Loki rules as well. First two validators are:
- `expressionIsValidLogQL` to check if the expression is a valid LogQL query
- `logQlExpressionUsesRangeAggregation` to check if the LogQL expression uses range aggregation
- Added: support for alert field `keep_firing_for`
- Added: support for the `query_offset` field in the rule group
- Added: new validator `expressionIsValidPromQL` to check if the expression is a valid PromQL query
- Added: :tada: **Support for Loki rules!** Now you can validate Loki rules as well. First two validators are:
- `expressionIsValidLogQL` to check if the expression is a valid LogQL query
- `logQlExpressionUsesRangeAggregation` to check if the LogQL expression uses range aggregation
- Changed: :warning: **Renamed `hasValidPartialStrategy` to `hasValidPartialResponseStrategy` as it was documented so it is actually a fix**
- Updated: Prometheus and other dependencies
- CI: Updated Github actions for golangcilint and goreleaser

## [2.14.1]
- Fixed: error message in the `hasSourceTenantsForMetrics` validator
11 changes: 8 additions & 3 deletions Makefile
@@ -34,18 +34,23 @@ build:
GOOS=linux GOARCH=amd64 CGO_ENABLED=0 go build -o $(PROMRUVAL_BIN)

E2E_TESTS_VALIDATIONS_FILE := examples/validation.yaml
E2E_TESTS_LOKI_VALIDATIONS_FILE := examples/validation_loki.yaml
E2E_TESTS_ADDITIONAL_VALIDATIONS_FILE := examples/additional-validation.yaml
E2E_TESTS_LOKI_RULES_FILES := examples/loki_rules/*.yaml
E2E_TESTS_RULES_FILES := examples/rules/*.yaml
E2E_TESTS_DOCS_FILE_MD := examples/human_readable.md
E2E_TESTS_DOCS_FILE_HTML := examples/human_readable.html

E2E_TESTS_LOKI_DIR := examples/loki/
E2E_TESTS_MIMIR_DIR := examples/mimir/
E2E_TESTS_THANOS_DIR := examples/thanos/
e2e-test: build
$(PROMRUVAL_BIN) validate --config-file $(E2E_TESTS_VALIDATIONS_FILE) --config-file $(E2E_TESTS_ADDITIONAL_VALIDATIONS_FILE) $(E2E_TESTS_RULES_FILES)
$(PROMRUVAL_BIN) validate --config-file $(E2E_TESTS_LOKI_VALIDATIONS_FILE) $(E2E_TESTS_LOKI_RULES_FILES)
$(PROMRUVAL_BIN) validation-docs --config-file $(E2E_TESTS_VALIDATIONS_FILE) --config-file $(E2E_TESTS_ADDITIONAL_VALIDATIONS_FILE) > $(E2E_TESTS_DOCS_FILE_MD)
$(PROMRUVAL_BIN) validation-docs --config-file $(E2E_TESTS_VALIDATIONS_FILE) --config-file $(E2E_TESTS_ADDITIONAL_VALIDATIONS_FILE) --output=html > $(E2E_TESTS_DOCS_FILE_HTML)

$(PROMRUVAL_BIN) validate --support-loki --config-file $(E2E_TESTS_LOKI_DIR)/validation.yaml $(E2E_TESTS_LOKI_DIR)/rules.yaml
$(PROMRUVAL_BIN) validate --support-thanos --config-file $(E2E_TESTS_THANOS_DIR)/validation.yaml $(E2E_TESTS_THANOS_DIR)/rules.yaml
$(PROMRUVAL_BIN) validate --support-mimir --config-file $(E2E_TESTS_MIMIR_DIR)/validation.yaml $(E2E_TESTS_MIMIR_DIR)/rules.yaml

docker: build
docker build -t fusakla/promruval .

35 changes: 21 additions & 14 deletions README.md
@@ -59,36 +59,39 @@ make build

```bash
$ ./promruval --help-long
usage: promruval --config-file=CONFIG-FILE [<flags>] <command> [<args> ...]
usage: promruval [<flags>] <command> [<args> ...]

Prometheus rules validation tool.

Flags:
--help Show context-sensitive help (also try --help-long and --help-man).
--[no-]help Show context-sensitive help (also try --help-long and --help-man).
-c, --config-file=CONFIG-FILE ...
Path to validation config file. Can be passed multiple times, only validationRules will be reflected from the additional configs.
--debug Enable debug logging.
Path to validation config file. Can be passed multiple times, only validationRules will be reflected from the additional configs.
--[no-]debug Enable debug logging.

Commands:
help [<command>...]
help [<command>...]
Show help.


version
version
Print version and build information.


validate [<flags>] <path>...
validate [<flags>] <path>...
Validate Prometheus rule files using validation rules from config file.

-d, --disable-rule=DISABLE-RULE ...
Allows to disable any validation rules by it's name. Can be passed multiple times.
-e, --enable-rule=ENABLE-RULE ...
Only enable these validation rules. Can be passed multiple times.
-o, --output=[text,json,yaml] Format of the output.
--color Use color output.
--[no-]color Use color output.
--[no-]support-loki Support Loki rules format.
--[no-]support-mimir Support Mimir rules format.
--[no-]support-thanos Support Thanos rules format.

validation-docs [<flags>]
validation-docs [<flags>]
Print human readable form of the validation rules from config file.

-o, --output=[text,markdown,html]
@@ -299,14 +302,18 @@ groups:
### Other monitoring solutions support

#### Thanos
Thanos has only one special case which is the `partial_response_strategy` setting on the group level which is tolerated
in the config and can ve validated using the [`hasValidPartialResponseStrategy`](./docs/validations.md#hasvalidpartialresponsestrategy) validation.
If you want to validate Thanos rules, use the `promruval validate --support-thanos` flag, otherwise you might get errors on unknown fields such as `partial_response_strategy`.

#### Mimir/Cortex
Mimir/Cortex has only one special case which is the `source_tenants` setting on the group level which is tolerated
and can ve validated using the [`hasSourceTenantsForMetrics`](./docs/validations.md#hassourcetenantsformetrics) or [`hasAllowedSourceTenants`](./docs/validations.md#hasallowedsourcetenants) validations for example.
You can validate it using the [`hasValidPartialResponseStrategy`](./docs/validations.md#hasvalidpartialresponsestrategy) validation.

#### Mimir
If you want to validate Mimir rules, use the `promruval validate --support-mimir` flag, otherwise you might get errors on unknown fields such as `source_tenants`.

The `source_tenants` can be validated using the [`hasSourceTenantsForMetrics`](./docs/validations.md#hassourcetenantsformetrics) or [`hasAllowedSourceTenants`](./docs/validations.md#hasallowedsourcetenants) validations for example.

#### Loki
If you want to validate Loki rules, use the `promruval validate --support-loki` flag, otherwise you might get errors on unknown fields such as `namespace` or `remote_write`.

Since Loki has almost identical rule config as Prometheus, you can use the same validations for Loki rules.
Loki has special validations for its expressions since it uses different query language [LogQL](https://grafana.com/docs/loki/latest/query/).
For the LogQL-specific validations, see [here](./docs/validations.md#logql-expression-validators).
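
The README text above says that vendor-specific fields are rejected unless the matching `--support-*` flag is passed. A minimal sketch of that gating, using only the standard library, might look like the following. It is an illustration, not promruval's actual code: the `Support` type, `allowedGroupFields`, `unknownFields` and the base field list are all hypothetical names chosen for this example.

```go
package main

import "slices"

// Support mirrors the three CLI flags from the diff; the type itself is hypothetical.
type Support struct {
	Loki, Mimir, Thanos bool
}

// allowedGroupFields returns the group-level keys accepted for the given flags.
// The base list is an illustrative subset, not the full Prometheus group schema.
func allowedGroupFields(s Support) []string {
	fields := []string{"name", "interval", "limit", "rules"}
	if s.Thanos {
		fields = append(fields, "partial_response_strategy")
	}
	if s.Mimir {
		fields = append(fields, "source_tenants")
	}
	if s.Loki {
		fields = append(fields, "remote_write")
	}
	return fields
}

// unknownFields returns the keys of group that are not in allowed, sorted
// so that the result is deterministic despite random map iteration order.
func unknownFields(group map[string]any, allowed []string) []string {
	unknown := []string{}
	for key := range group {
		if !slices.Contains(allowed, key) {
			unknown = append(unknown, key)
		}
	}
	slices.Sort(unknown)
	return unknown
}
```

With no flags set, a group carrying `source_tenants` would be reported as having an unknown field; calling `allowedGroupFields(Support{Mimir: true})` instead would let the same group pass.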
10 changes: 0 additions & 10 deletions examples/human_readable.html
@@ -36,14 +36,6 @@ <h2><a href="#check-prometheus-limitations">check-prometheus-limitations</a></h2
<li>All rules does not use any of the <code>cluster</code>,<code>locality</code>,<code>prometheus-type</code>,<code>replica</code> labels is in its expression</li>
</ul>

<h2><a href="#check-source-tenants">check-source-tenants</a></h2>
<ul>
<li>All rules rule group, the rule belongs to, has the required <code>source_tenants</code> configured, according to the mapping of metric names to tenants:
<br/> <code>k8s</code>: <code>^container_.*$</code> (Metrics from cAdvisor)
<br/> <code>k8s</code>: <code>^kube_.*$</code> (Metrics from KSM)
<br/> <code>mysql</code>: <code>^mysql_.*$</code> (MySQL metrics from the MySQL team)</li>
</ul>

<h2><a href="#check-metric-name">check-metric-name</a></h2>
<ul>
<li>Alert expression uses metric name in selectors</li>
@@ -53,9 +45,7 @@ <h2><a href="#check-metric-name">check-metric-name</a></h2>

<h2><a href="#check-groups">check-groups</a></h2>
<ul>
<li>Group does not have other <code>source_tenants</code> than: <code>tenant1</code>, <code>tenant2</code>, <code>k8s</code></li>
<li>Group evaluation interval is between <code>20s</code> and <code>106751d23h47m16s854ms</code> if set</li>
<li>Group has valid partial_response_strategy (one of <code>warn</code> or <code>abort</code>) if set</li>
<li>Group has at most 10 rules</li>
<li>Group does not have higher <code>limit</code> configured then 100</li>
</ul>
8 changes: 0 additions & 8 deletions examples/human_readable.md
@@ -26,21 +26,13 @@ Validation rules:
- All rules expression does not use data older than `6h0m0s`
- All rules does not use any of the `cluster`,`locality`,`prometheus-type`,`replica` labels is in its expression

check-source-tenants
- All rules rule group, the rule belongs to, has the required `source_tenants` configured, according to the mapping of metric names to tenants:
`k8s`: `^container_.*$` (Metrics from cAdvisor)
`k8s`: `^kube_.*$` (Metrics from KSM)
`mysql`: `^mysql_.*$` (MySQL metrics from the MySQL team)

check-metric-name
- Alert expression uses metric name in selectors
- Alert labels are valid templates
- Alert `keep_firing_for` is not longer than `1h`

check-groups
- Group does not have other `source_tenants` than: `tenant1`, `tenant2`, `k8s`
- Group evaluation interval is between `20s` and `106751d23h47m16s854ms` if set
- Group has valid partial_response_strategy (one of `warn` or `abort`) if set
- Group has at most 10 rules
- Group does not have higher `limit` configured then 100

6 changes: 6 additions & 0 deletions examples/loki/rules.yaml
@@ -0,0 +1,6 @@
# ignore_validations: hasAllowedLimit
namespace: foo
groups:
- name: group1
remote_write:
- url: http://localhost:1234
File renamed without changes.
14 changes: 0 additions & 14 deletions examples/loki_rules/rules.yaml

This file was deleted.

11 changes: 11 additions & 0 deletions examples/mimir/rules.yaml
@@ -0,0 +1,11 @@
groups:
- name: group1
source_tenants: ["k8s", "bar"]
rules:
- alert: test
expr: avg_over_time(max_over_time(container_cpu_seconds_total{job="prometheus"}[10h] offset 10d)[10m:10m])
for: 4w
keep_firing_for: 5m
labels:
severity: critica
team: foo
16 changes: 16 additions & 0 deletions examples/mimir/validation.yaml
@@ -0,0 +1,16 @@
validationRules:
- name: check-mimir
scope: Group
validations:
- type: hasAllowedSourceTenants
params:
allowedSourceTenants: ["k8s", "bar"]
- name: check-source-tenants
scope: All rules
validations:
- type: hasSourceTenantsForMetrics
params:
sourceTenants:
"k8s":
- regexp: "container_.*"
description: "Metrics from cAdvisor"
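
The `hasAllowedSourceTenants` check configured above is described in the generated docs as "Group does not have other `source_tenants` than" the allowed list. A rough sketch of that comparison, assuming plain string slices and a hypothetical helper name `disallowedSourceTenants` (this is not the validator's real implementation), could be:

```go
package main

import "slices"

// disallowedSourceTenants returns the tenants configured on a group that are
// not present in the allowed list, preserving their original order.
// Hypothetical helper for illustration only.
func disallowedSourceTenants(groupTenants, allowed []string) []string {
	invalid := []string{}
	for _, tenant := range groupTenants {
		if !slices.Contains(allowed, tenant) {
			invalid = append(invalid, tenant)
		}
	}
	return invalid
}
```

For the example rule file, `disallowedSourceTenants([]string{"k8s", "bar"}, []string{"k8s", "bar"})` yields an empty slice, so the group would pass.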
4 changes: 0 additions & 4 deletions examples/rules/rules.yaml
@@ -1,7 +1,6 @@
# ignore_validations: hasAllowedLimit
groups:
- name: group1
partial_response_strategy: abort
interval: 1m
limit: 10
rules:
@@ -13,8 +12,6 @@ groups:

# ignore_validations: labelHasAllowedValue
- name: testGroup
partial_response_strategy: "warn"
source_tenants: ["tenant1", "tenant2"]
limit: 1000
rules:
# Comment before.
@@ -35,7 +32,6 @@ groups:
disabled_validation_rules: check-team-label,check-prometheus-limitations

- name: testIgnoreValidationsInExpr
source_tenants: ["k8s"]
limit: 10
rules:
- alert: test
3 changes: 3 additions & 0 deletions examples/thanos/rules.yaml
@@ -0,0 +1,3 @@
groups:
- name: group1
partial_response_strategy: "warn"
5 changes: 5 additions & 0 deletions examples/thanos/validation.yaml
@@ -0,0 +1,5 @@
validationRules:
- name: check-thanos-rules
scope: Group
validations:
- type: hasValidPartialResponseStrategy
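
The generated docs elsewhere in this diff describe `hasValidPartialResponseStrategy` as "Group has valid partial_response_strategy (one of `warn` or `abort`) if set". As a hedged sketch of that rule (the function name `validatePartialResponseStrategy` and the error wording are assumptions, not the validator's real code):

```go
package main

import "fmt"

// validatePartialResponseStrategy accepts an unset (empty) value or one of
// "warn" and "abort"; anything else is reported as an error.
// Hypothetical helper for illustration only.
func validatePartialResponseStrategy(value string) error {
	switch value {
	case "", "warn", "abort":
		return nil
	default:
		return fmt.Errorf("invalid partial_response_strategy %q, expected one of warn or abort", value)
	}
}
```

The `examples/thanos/rules.yaml` group uses `"warn"`, which this sketch accepts.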
19 changes: 0 additions & 19 deletions examples/validation.yaml
@@ -71,21 +71,6 @@ validationRules:
params:
labels: ["cluster", "locality", "prometheus-type", "replica"]

- name: check-source-tenants
scope: All rules
validations:
- type: hasSourceTenantsForMetrics
params:
sourceTenants:
"k8s":
- regexp: "container_.*"
description: "Metrics from cAdvisor"
- regexp: "kube_.*"
description: "Metrics from KSM"
"mysql":
- regexp: "mysql_.*"
description: "MySQL metrics from the MySQL team"

- name: check-metric-name
scope: Alert
validations:
@@ -98,14 +83,10 @@ validationRules:
- name: check-groups
scope: Group
validations:
- type: hasAllowedSourceTenants
params:
allowedSourceTenants: ["tenant1", "tenant2", "k8s"]
- type: hasAllowedEvaluationInterval
params:
minimum: "20s"
intervalMustBeSet: false
- type: hasValidPartialResponseStrategy
- type: maxRulesPerGroup
params:
limit: 10
2 changes: 1 addition & 1 deletion go.mod
@@ -126,7 +126,7 @@ require (
github.com/go-logr/logr v1.4.2 // indirect
github.com/go-logr/stdr v1.2.2 // indirect
github.com/gogo/protobuf v1.3.2 // indirect
github.com/google/go-cmp v0.6.0 // indirect
github.com/google/go-cmp v0.6.0
github.com/grafana/loki/v3 v3.1.0
github.com/grafana/regexp v0.0.0-20240518133315-a468a5bfb3bc // indirect
github.com/json-iterator/go v1.1.12 // indirect
16 changes: 16 additions & 0 deletions main.go
@@ -10,6 +10,7 @@ import (
"github.com/fusakla/promruval/v2/pkg/config"
"github.com/fusakla/promruval/v2/pkg/prometheus"
"github.com/fusakla/promruval/v2/pkg/report"
"github.com/fusakla/promruval/v2/pkg/unmarshaler"
"github.com/fusakla/promruval/v2/pkg/validate"
"github.com/fusakla/promruval/v2/pkg/validationrule"
"github.com/fusakla/promruval/v2/pkg/validator"
@@ -35,6 +36,9 @@ var (
enabledRules = validateCmd.Flag("enable-rule", "Only enable these validation rules. Can be passed multiple times.").Short('e').Strings()
validationOutputFormat = validateCmd.Flag("output", "Format of the output.").Short('o').PlaceHolder("[text,json,yaml]").Default("text").Enum("text", "json", "yaml")
color = validateCmd.Flag("color", "Use color output.").Bool()
supportLoki = validateCmd.Flag("support-loki", "Support Loki rules format.").Bool()
supportMimir = validateCmd.Flag("support-mimir", "Support Mimir rules format.").Bool()
supportThanos = validateCmd.Flag("support-thanos", "Support Thanos rules format.").Bool()

docsCmd = app.Command("validation-docs", "Print human readable form of the validation rules from config file.")
docsOutputFormat = docsCmd.Flag("output", "Format of the output.").Short('o').PlaceHolder("[text,markdown,html]").Default("text").Enum("text", "markdown", "html")
@@ -147,6 +151,18 @@ func main() {
filesToBeValidated = append(filesToBeValidated, paths...)
}

if *supportLoki {
unmarshaler.SupportLoki(true)
}

if *supportMimir {
unmarshaler.SupportMimir(true)
}

if *supportThanos {
unmarshaler.SupportThanos(true)
}

var prometheusClient *prometheus.Client
if mainValidationConfig.Prometheus.URL != "" {
prometheusClient, err = prometheus.NewClient(mainValidationConfig.Prometheus)
7 changes: 5 additions & 2 deletions pkg/unmarshaler/helpers.go
@@ -77,7 +77,7 @@ func unmarshalToNodeAndStruct(value, dstNode *yaml.Node, dstStruct interface{},
}

// mustListStructYamlFieldNames returns a list of yaml field names for the given struct.
func mustListStructYamlFieldNames(s interface{}) []string {
func mustListStructYamlFieldNames(s interface{}, ignoreFields []string) []string {
y, err := yaml.Marshal(s)
if err != nil {
fmt.Println("failed to marshal", err)
@@ -88,8 +88,11 @@ func mustListStructYamlFieldNames(s interface{}) []string {
fmt.Println("failed to marshal", err)
panic(err)
}
names := make([]string, 0, len(m))
names := []string{}
for k := range m {
if slices.Contains(ignoreFields, k) {
continue
}
names = append(names, k)
}
return names
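
The helper changed above appears to collect a struct's YAML field names by marshalling it and reading the resulting map keys, skipping any names listed in `ignoreFields`. For readers who want to experiment without a YAML dependency, here is a standard-library sketch that reaches a similar result by a different route, namely reflection over `yaml` struct tags. The names `yamlFieldNames` and `exampleGroup` are hypothetical, and this is not the code from the PR.

```go
package main

import (
	"reflect"
	"slices"
	"strings"
)

// yamlFieldNames returns the yaml tag names of the fields of a struct (or a
// pointer to a struct), skipping untagged fields, "-" and anything listed in
// ignoreFields. Non-struct input yields an empty slice.
func yamlFieldNames(s any, ignoreFields []string) []string {
	names := []string{}
	t := reflect.TypeOf(s)
	if t != nil && t.Kind() == reflect.Pointer {
		t = t.Elem()
	}
	if t == nil || t.Kind() != reflect.Struct {
		return names
	}
	for i := 0; i < t.NumField(); i++ {
		// The tag looks like `yaml:"name,omitempty"`; keep only the name part.
		name, _, _ := strings.Cut(t.Field(i).Tag.Get("yaml"), ",")
		if name == "" || name == "-" || slices.Contains(ignoreFields, name) {
			continue
		}
		names = append(names, name)
	}
	return names
}

// exampleGroup is a hypothetical struct used only to demonstrate the helper.
type exampleGroup struct {
	Name                    string   `yaml:"name"`
	SourceTenants           []string `yaml:"source_tenants,omitempty"`
	PartialResponseStrategy string   `yaml:"partial_response_strategy,omitempty"`
}
```

Calling `yamlFieldNames(exampleGroup{}, []string{"source_tenants"})` returns the remaining tag names in declaration order. Unlike the marshalling approach, this variant does not depend on which fields survive `omitempty`.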