A simple service to accept webhook payloads from Prometheus Alertmanager and to push those alerts on to GlobalNOC's Alertmon.
The gnocpush input JSON format is the same as the Prometheus Alertmanager webhook format.
Each alert must carry labels or annotations whose names match GlobalNOC's required parameters, listed below.
Note that the top-level common/group annotations and labels are ignored.
- description: A description of the alert.
- node_name: The name of the node that the alert is associated with.
- service_name: The name of the service that the alert is associated with.
- severity: The severity of the alert. One of: Critical, Major, Minor, Unknown, OK.
- device: The subcomponent of the node that is alarming.
- start_time: The time that the alert started.
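The parameter list above can be checked against a single alert with a short sketch like the following. It is only an illustration, not gnocpush's actual code: the names `REQUIRED_PARAMS`, `VALID_SEVERITIES`, `missing_params`, and `severity_is_valid` are hypothetical, and the sketch assumes that a parameter may appear in either the alert's labels or its annotations and that severity values are compared case-insensitively.

```python
# Hypothetical helpers, not part of gnocpush: check one Alertmanager alert
# against the GlobalNOC parameter names listed in this README.
REQUIRED_PARAMS = (
    "description",
    "node_name",
    "service_name",
    "severity",
    "device",
    "start_time",
)
VALID_SEVERITIES = {"critical", "major", "minor", "unknown", "ok"}


def _fields(alert):
    # Assumption: labels and annotations are searched together,
    # with labels taking precedence when a name appears in both.
    return {**alert.get("annotations", {}), **alert.get("labels", {})}


def missing_params(alert):
    """Return the required parameter names that the alert does not provide."""
    fields = _fields(alert)
    return [name for name in REQUIRED_PARAMS if not fields.get(name)]


def severity_is_valid(alert):
    """Return True if the alert's severity is one of the listed values."""
    # Assumption: comparison ignores case ("major" and "Major" both pass).
    return str(_fields(alert).get("severity", "")).lower() in VALID_SEVERITIES
```

Only the per-alert `labels` and `annotations` are inspected here, which mirrors the note that the top-level common/group values are ignored.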
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  labels:
    lsst.io/rule: "true"
  name: net
spec:
  groups:
    - name: net.rules
      rules:
        - alert: lhn_interface_up
          annotations:
            description: '{{ $labels.instance }} - {{ $labels.ifName }}|{{ $labels.ifAlias }} is down'
          expr: ifOperStatus{ifAlias=~".*LHN.*"} != 1
          for: 30s
          labels:
            severity: critical
            node_name: '{{ $labels.instance }}'
            device: '{{ $labels.ifName }}'
            service_name: ifInErrors-{{ $labels.ifName }}
            gnoc: "true"
{
"receiver": "gnocpush",
"status": "firing",
"alerts": [
{
"status": "firing",
"labels": {
"alertname": "ifInErrors",
"device": "Ethernet17/1",
"gnoc": "true",
"ifAlias": "rubinobs-br01 Et17/1 <--SCIENCE #1--> LS-DWDM Linecard001-Port2",
"ifDescr": "Ethernet17/1",
"ifIndex": "17001",
"ifName": "Ethernet17/1",
"instance": "new-rubinobs-br01",
"job": "snmp-network",
"node_name": "new-rubinobs-br01",
"prom": "dev/ruka",
"prometheus": "kube-prometheus-stack/kube-prometheus-stack-prometheus",
"service_name": "ifInErrors-Ethernet17/1",
"severity": "major",
"site": "dev"
},
"annotations": {
"description": "new-rubinobs-br01 - Ethernet17/1|rubinobs-br01 Et17/1 <--SCIENCE #1--> LS-DWDM Linecard001-Port2 has 12.2k input errors"
},
"startsAt": "2024-04-26T20:46:34.933Z",
"endsAt": "0001-01-01T00:00:00Z",
"generatorURL": "https://prometheus.example.org/graph?g0.expr=ifInErrors+%3E+1000&g0.tab=1",
"fingerprint": "46df8c14dbab758c"
}
],
"groupLabels": {
"gnoc": "true"
},
"commonLabels": {
"gnoc": "true",
"job": "snmp-network",
"prom": "dev/ruka",
"prometheus": "kube-prometheus-stack/kube-prometheus-stack-prometheus",
"site": "dev"
},
"commonAnnotations": {},
"externalURL": "https://alertmanager.example.org",
"version": "4",
"groupKey": "{}/{gnoc=\"true\"}:{gnoc=\"true\"}",
"truncatedAlerts": 0
}
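For local testing, a smaller payload with the same overall shape as the example above can be generated programmatically. The sketch below is an assumption-laden illustration: `build_payload` is a hypothetical helper, and which fields gnocpush actually reads is not confirmed here; the field names are simply copied from the example payload.

```python
import json
from datetime import datetime, timezone


def build_payload(node_name, device, service_name, severity, description,
                  starts_at=None):
    """Build a minimal Alertmanager-style webhook payload with one firing alert."""
    # Pass a timezone-aware datetime; it is converted to UTC before formatting.
    starts_at = (starts_at or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Format with millisecond precision and a trailing "Z", as in the example.
    stamp = starts_at.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"
    alert = {
        "status": "firing",
        "labels": {
            "gnoc": "true",
            "node_name": node_name,
            "device": device,
            "service_name": service_name,
            "severity": severity,
        },
        "annotations": {"description": description},
        "startsAt": stamp,
        "endsAt": "0001-01-01T00:00:00Z",
    }
    return {
        "receiver": "gnocpush",
        "status": "firing",
        "alerts": [alert],
        "groupLabels": {"gnoc": "true"},
        "commonLabels": {"gnoc": "true"},
        "commonAnnotations": {},
        "version": "4",
    }


def write_payload(payload, path="alerts.json"):
    """Write the payload to a JSON file."""
    with open(path, "w", encoding="utf-8") as fh:
        json.dump(payload, fh, indent=2)
```

Calling `write_payload(build_payload(...))` produces an `alerts.json` file that can be posted to a running instance.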
Note that gnocpush does not impose any alert grouping constraints.
config:
  routes:
    - receiver: gnocpush
      continue: true
      repeat_interval: 30s
      group_interval: 30s
      group_wait: 30s
      group_by:
        - gnoc
      matchers:
        - gnoc = "true"
  receivers:
    - name: gnocpush
      webhook_configs:
        - url: http://gnocpush.gnocpush:8080/alerts
OCI images are available at ghcr.io/lsst-it/gnocpush.
A helm chart is available in the charts directory and in a helm repository at https://lsst-it.github.io/gnocpush.
Installing with helm via the helm repository:
helm repo add gnocpush https://lsst-it.github.io/gnocpush
helm repo update
helm upgrade --install \
gnocpush gnocpush/gnocpush \
--create-namespace --namespace gnocpush \
-f ./values.yaml
Installing with helm from a git repo clone:
helm upgrade --install \
gnocpush ./charts/gnocpush \
--create-namespace --namespace gnocpush \
-f ./values.yaml
kubectl logs alertmanager-kube-prometheus-stack-alertmanager-0 --tail=100 -f
kubectl logs -l app.kubernetes.io/instance=gnocpush -f
kubectl -n gnocpush port-forward gnocpush-dc4d94d8-mqvqq 8080
curl localhost:8080/metrics
virtualenv venv
. venv/bin/activate
pip install --editable .
docker run \
-e GNOC_USERNAME=$GNOC_USERNAME \
-e GNOC_PASSWORD=$GNOC_PASSWORD \
-e GNOC_SERVER=$GNOC_SERVER \
-e GNOC_REALM=$GNOC_REALM \
--network=host ghcr.io/lsst-it/gnocpush
curl http://localhost:8080/alerts -v --json @- < alerts.json
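The same request can be sent from Python, which may be convenient in a test script. This is a minimal sketch using only the standard library; `build_alerts_request` and `post_alerts` are hypothetical names, and the URL and `alerts.json` file name are taken from the curl command above. The headers mirror what curl's `--json` option sets.

```python
import json
import urllib.request


def build_alerts_request(payload, url="http://localhost:8080/alerts"):
    """Build a POST request carrying the payload as a JSON body."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


def post_alerts(path="alerts.json", url="http://localhost:8080/alerts"):
    """Read a webhook payload from a file and POST it; return (status, body)."""
    with open(path, encoding="utf-8") as fh:
        payload = json.load(fh)
    request = build_alerts_request(payload, url)
    # urlopen raises urllib.error.HTTPError for 4xx/5xx responses.
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status, response.read().decode("utf-8", "replace")
```

With the container from the previous step running, `post_alerts()` sends `alerts.json` to the local instance and returns the HTTP status together with the response body.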