Implement SNMPtrap delivery controls (#404)
* Implement SNMPtrap delivery controls

Implement ability to override the default values for the SNMPtrap
alertmanager receiver via prometheus-webhook-snmp component.

Closes: STF-559

* Run operator-sdk generate bundle

Run the following command to update the bundle artifacts:

operator-sdk-0.19.4 generate bundle   --metadata   --manifests   --channels unstable   --default-channel unstable

* Build out the remaining SNMP options

Build out the remaining options for prometheus-webhook-snmp to allow for
finer-grained controls and delivery of SNMP traps via alertmanager
alerts.

* Generate bundle contents with operator-sdk
leifmadsen authored Feb 28, 2023
1 parent 7687cd7 commit 16324bc
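
As a rough illustration of what the commit message describes, a ServiceTelemetry object could override the SNMP trap receiver defaults along the following lines. This is a sketch only: the `apiVersion`, `kind`, and `spec.alerting` nesting are inferred from the file names and templates touched by this commit, and the `metadata` values, target address, and community string are hypothetical placeholders.

```yaml
# Sketch of a ServiceTelemetry object overriding the snmpTraps defaults.
# apiVersion/kind are inferred from infra.watch_v1beta1_servicetelemetry_cr.yaml.
apiVersion: infra.watch/v1beta1
kind: ServiceTelemetry
metadata:
  name: default                  # hypothetical object name
  namespace: service-telemetry   # hypothetical namespace
spec:
  alerting:
    alertmanager:
      receivers:
        snmpTraps:
          enabled: true
          target: 10.10.10.10            # hypothetical trap receiver address
          community: example-community   # hypothetical community string
          port: 162
          retries: 5
          timeout: 1
          alertOidLabel: oid
          trapOidPrefix: "1.3.6.1.4.1.50495.15"
          trapDefaultOid: "1.3.6.1.4.1.50495.15.1.2.1"
          trapDefaultSeverity: ""
```

Any field left out would presumably fall back to the role defaults listed in `roles/servicetelemetry/defaults/main.yml`.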
Showing 9 changed files with 130 additions and 11 deletions.
20 changes: 15 additions & 5 deletions build/stf-run-ci/README.md
@@ -21,10 +21,10 @@ choose to override:
| Parameter name | Values | Default | Description |
| ------------------------------ | ------------ | --------- | ------------------------------------ |
| `__deploy_stf` | {true,false} | true | Whether to deploy an instance of STF |
| `__local_build_enabled` | {true,false} | true | Whether to deploy STF from local built artifacts. Also see `working_branch`, `sg_branch`, `sgo_branch` |
| `__deploy_from_bundles_enabled` | {true,false} | false | Whether to deploy STF from OLM bundles (TODO: compat with __local_build_enabled) |
| `__service_telemetry_bundle_image_path` | <image_path> | <none> | Image path to Service Telemetry Operator bundle |
| `__smart_gateway_bundle_image_path` | <image_path> | <none> | Image path to Smart Gateway Operator bundle |
| `__local_build_enabled` | {true,false} | true | Whether to deploy STF from local built artifacts. Also see `working_branch`, `sg_branch`, `sgo_branch` |
| `__deploy_from_bundles_enabled` | {true,false} | false | Whether to deploy STF from OLM bundles (TODO: compat with `__local_build_enabled`) |
| `__service_telemetry_bundle_image_path` | <image_path> | <none> | Image path to Service Telemetry Operator bundle |
| `__smart_gateway_bundle_image_path` | <image_path> | <none> | Image path to Smart Gateway Operator bundle |
| `prometheus_webhook_snmp_branch` | <git_branch> | master | Which Prometheus Webhook SNMP git branch to checkout |
| `sgo_branch` | <git_branch> | master | Which Smart Gateway Operator git branch to checkout |
| `sg_core_branch` | <git_branch> | master | Which Smart Gateway Core git branch to checkout |
@@ -41,7 +41,17 @@ choose to override:
| `__service_telemetry_storage_ephemeral_enabled` | {true,false} | false | Whether to enable ephemeral storage support in ServiceTelemetry |
| `__service_telemetry_storage_persistent_storage_class` | <storage_class> | <undefined> | Set a custom storageClass to override the default provided by OpenShift platform |
| `__service_telemetry_snmptraps_enabled` | {true,false} | true | Whether to enable snmptraps delivery via Alertmanager receiver (prometheus-webhook-snmp) |
| `__service_telemetry_observability_strategy` | <observability_strategy> | use_community | Which observability strategy to use for deployment. Default deployment is 'use_community'. Also supported is 'none' |
| `__service_telemetry_snmptraps_community` | <snmptrap_community> | `public` | Set the SNMP community to send traps to. Defaults to public |
| `__service_telemetry_snmptraps_target` | <snmptrap_target> | `192.168.24.254` | Set the SNMP target to send traps to. Defaults to 192.168.24.254 |
| `__service_telemetry_snmptraps_retries` | <snmptrap_retry_count> | 5 | Set the SNMP retry count for traps. Defaults to 5 |
| `__service_telemetry_snmptraps_port` | <snmptrap_port> | 162 | Set the SNMP target port for traps. Defaults to 162 |
| `__service_telemetry_snmptraps_timeout` | <snmptrap_timeout> | 1 | Set the SNMP retry timeout (in seconds). Defaults to 1 |
| `__service_telemetry_snmptraps_alert_oid_label` | <alert_label> | oid | The alert label name in which to look for the OID value. Defaults to oid |
| `__service_telemetry_snmptraps_trap_oid_prefix` | <oid_prefix> | 1.3.6.1.4.1.50495.15 | The OID prefix for trap variable bindings. |
| `__service_telemetry_snmptraps_trap_default_oid` | <default_oid> | 1.3.6.1.4.1.50495.15.1.2.1 | The trap OID if none is found in the Prometheus alert labels. |
| `__service_telemetry_snmptraps_trap_default_severity` | <default_severity> | `""` | The trap severity if none is found in the Prometheus alert labels. |
| `__service_telemetry_logs_enabled` | {true,false} | false | Whether to enable logs support in ServiceTelemetry |
| `__service_telemetry_observability_strategy` | <observability_strategy> | `use_community` | Which observability strategy to use for deployment. Default deployment is 'use_community'. Also supported is 'none' |
| `__internal_registry_path` | <registry_path> | image-registry.openshift-image-registry.svc:5000 | Path to internal registry for image path |
| `__deploy_loki_enabled` | {true,false} | false | Whether to deploy loki-operator and other systems for logging development purposes |
| `__golang_image_path` | <image_path> | quay.io/infrawatch/golang:1.16 | Golang image path for building the loki-operator image |
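
A minimal sketch of overriding the SNMP trap parameters for a CI run might look like the vars file below. The variable names follow `build/stf-run-ci/defaults/main.yml` as changed in this commit; the file name, the target address, and the community string are hypothetical, and how the file is passed to the role is left to the caller.

```yaml
# snmp-overrides.yml (hypothetical file name)
# Variable names follow build/stf-run-ci/defaults/main.yml in this commit.
__service_telemetry_snmptraps_enabled: true
__service_telemetry_snmptraps_target: "10.10.10.10"            # hypothetical receiver address
__service_telemetry_snmptraps_community: "example-community"   # hypothetical community string
__service_telemetry_snmptraps_port: 162
__service_telemetry_snmptraps_retries: 3
__service_telemetry_snmptraps_timeout: 2
__service_telemetry_snmptraps_alert_oid_label: "oid"
__service_telemetry_snmptraps_trap_oid_prefix: "1.3.6.1.4.1.50495.15"
__service_telemetry_snmptraps_trap_default_oid: "1.3.6.1.4.1.50495.15.1.2.1"
__service_telemetry_snmptraps_trap_default_severity: ""
```

Variables that are not set keep the defaults shown in the table.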
10 changes: 10 additions & 0 deletions build/stf-run-ci/defaults/main.yml
@@ -14,6 +14,16 @@ __service_telemetry_high_availability_enabled: false
__service_telemetry_metrics_enabled: true
__service_telemetry_storage_ephemeral_enabled: false
__service_telemetry_snmptraps_enabled: true
__service_telemetry_snmptraps_target: "192.168.24.254"
__service_telemetry_snmptraps_community: "public"
__service_telemetry_snmptraps_retries: 5
__service_telemetry_snmptraps_timeout: 1
__service_telemetry_snmptraps_port: 162
__service_telemetry_snmptraps_alert_oid_label: "oid"
__service_telemetry_snmptraps_trap_oid_prefix: "1.3.6.1.4.1.50495.15"
__service_telemetry_snmptraps_trap_default_oid: "1.3.6.1.4.1.50495.15.1.2.1"
__service_telemetry_snmptraps_trap_default_severity: ""
__service_telemetry_logs_enabled: false
__service_telemetry_observability_strategy: use_community
__internal_registry_path: image-registry.openshift-image-registry.svc:5000
__service_telemetry_bundle_image_path:
9 changes: 9 additions & 0 deletions build/stf-run-ci/tasks/deploy_stf.yml
@@ -21,6 +21,15 @@
receivers:
snmpTraps:
enabled: {{ __service_telemetry_snmptraps_enabled }}
target: "{{ __service_telemetry_snmptraps_target }}"
community: "{{ __service_telemetry_snmptraps_community }}"
retries: {{ __service_telemetry_snmptraps_retries }}
port: {{ __service_telemetry_snmptraps_port }}
timeout: {{ __service_telemetry_snmptraps_timeout }}
alertOidLabel: "{{ __service_telemetry_snmptraps_alert_oid_label }}"
trapOidPrefix: "{{ __service_telemetry_snmptraps_trap_oid_prefix }}"
trapDefaultOid: "{{ __service_telemetry_snmptraps_trap_default_oid }}"
trapDefaultSeverity: "{{ __service_telemetry_snmptraps_trap_default_severity }}"
backends:
events:
elasticsearch:
26 changes: 25 additions & 1 deletion deploy/crds/infra.watch_servicetelemetrys_crd.yaml
@@ -56,8 +56,32 @@ spec:
enabled:
description: Deploy container to send snmp traps
type: boolean
community:
description: 'Target community for SNMP traps. Default is "public"'
type: string
target:
description: Target address for SNMP traps to send to
description: 'Target address for SNMP traps to send to.'
type: string
retries:
description: 'SNMP trap delivery retry limit. Default is 5'
type: integer
timeout:
description: 'Response timeout, in seconds. Default is 1'
type: integer
port:
description: 'SNMP trap delivery port. Default is 162'
type: integer
alertOidLabel:
description: 'Label for finding the OID. Default is "oid"'
type: string
trapOidPrefix:
description: 'OID prefix for the trap variable bindings. Default is "1.3.6.1.4.1.50495.15"'
type: string
trapDefaultOid:
description: 'The trap OID if none is found in the Prometheus alert labels. Default is "1.3.6.1.4.1.50495.15.1.2.1"'
type: string
trapDefaultSeverity:
description: 'The trap severity if none is found in the Prometheus alert labels. Default is empty.'
type: string
type: object
type: object
8 changes: 8 additions & 0 deletions deploy/crds/infra.watch_v1beta1_servicetelemetry_cr.yaml
@@ -10,7 +10,15 @@ spec:
receivers:
snmpTraps:
enabled: false
community: public
target: 192.168.24.254
retries: 5
port: 162
timeout: 1
alertOidLabel: oid
trapOidPrefix: "1.3.6.1.4.1.50495.15"
trapDefaultOid: "1.3.6.1.4.1.50495.15.1.2.1"
trapDefaultSeverity: ""
storage:
strategy: persistent
persistent:
@@ -46,12 +46,44 @@ spec:
properties:
snmpTraps:
properties:
alertOidLabel:
description: Label for finding the OID. Default is
"oid"
type: string
community:
description: Target community for SNMP traps. Default
is "public"
type: string
enabled:
description: Deploy container to send snmp traps
type: boolean
port:
description: SNMP trap delivery port. Default is
162
type: integer
retries:
description: SNMP trap delivery retry limit. Default
is 5
type: integer
target:
description: Target address for SNMP traps to send
to
to.
type: string
timeout:
description: Response timeout, in seconds. Default
is 1
type: integer
trapDefaultOid:
description: The trap OID if none is found in the
Prometheus alert labels. Default is "1.3.6.1.4.1.50495.15.1.2.1"
type: string
trapDefaultSeverity:
description: The trap severity if none is found in
the Prometheus alert labels. Default is empty.
type: string
trapOidPrefix:
description: OID prefix for the trap variable bindings.
Default is "1.3.6.1.4.1.50495.15"
type: string
type: object
type: object
@@ -15,8 +15,16 @@ metadata:
"alertmanager": {
"receivers": {
"snmpTraps": {
"alertOidLabel": "oid",
"community": "public",
"enabled": false,
"target": "192.168.24.254"
"port": 162,
"retries": 5,
"target": "192.168.24.254",
"timeout": 1,
"trapDefaultOid": "1.3.6.1.4.1.50495.15.1.2.1",
"trapDefaultSeverity": "",
"trapOidPrefix": "1.3.6.1.4.1.50495.15"
}
},
"storage": {
8 changes: 8 additions & 0 deletions roles/servicetelemetry/defaults/main.yml
@@ -24,7 +24,15 @@ servicetelemetry_defaults:
receivers:
snmp_traps:
enabled: false
community: public
target: 192.168.24.254
retries: 5
timeout: 1
port: 162
alert_oid_label: "oid"
trap_oid_prefix: "1.3.6.1.4.1.50495.15"
trap_default_oid: "1.3.6.1.4.1.50495.15.1.2.1"
trap_default_severity: ""

backends:
metrics:
16 changes: 13 additions & 3 deletions roles/servicetelemetry/templates/manifest_snmp_traps.j2
@@ -20,10 +20,20 @@ spec:
- containerPort: 9099
env:
- name: SNMP_COMMUNITY
value: public
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.community }}"
- name: SNMP_RETRIES
value: "1"
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.retries }}"
- name: SNMP_HOST
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.target }}"
- name: SNMP_PORT
value: "162"
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.port }}"
- name: SNMP_TIMEOUT
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.timeout }}"
- name: ALERT_OID_LABEL
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.alert_oid_label }}"
- name: TRAP_OID_PREFIX
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_oid_prefix }}"
- name: TRAP_DEFAULT_OID
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_default_oid }}"
- name: TRAP_DEFAULT_SEVERITY
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_default_severity }}"
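
For orientation, with the defaults from `roles/servicetelemetry/defaults/main.yml` the `env` section of this template would be expected to render roughly as follows. This is a hand-written sketch rather than captured output, and it assumes `SNMP_TIMEOUT` is populated from the `timeout` setting.

```yaml
# Expected rendering with the role defaults (hand-written sketch, not captured output).
env:
  - name: SNMP_COMMUNITY
    value: "public"
  - name: SNMP_RETRIES
    value: "5"
  - name: SNMP_HOST
    value: "192.168.24.254"
  - name: SNMP_PORT
    value: "162"
  - name: SNMP_TIMEOUT
    value: "1"          # assumes the template reads snmp_traps.timeout
  - name: ALERT_OID_LABEL
    value: "oid"
  - name: TRAP_OID_PREFIX
    value: "1.3.6.1.4.1.50495.15"
  - name: TRAP_DEFAULT_OID
    value: "1.3.6.1.4.1.50495.15.1.2.1"
  - name: TRAP_DEFAULT_SEVERITY
    value: ""
```

Quoting every value keeps the numeric settings as strings, which container environment variables require.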
