Implement SNMPtrap delivery controls #404

Merged 5 commits on Feb 28, 2023
20 changes: 15 additions & 5 deletions build/stf-run-ci/README.md
@@ -21,10 +21,10 @@ choose to override:
| Parameter name | Values | Default | Description |
| ------------------------------ | ------------ | --------- | ------------------------------------ |
| `__deploy_stf` | {true,false} | true | Whether to deploy an instance of STF |
| `__local_build_enabled` | {true,false} | true | Whether to deploy STF from local built artifacts. Also see `working_branch`, `sg_branch`, `sgo_branch` |
| `__deploy_from_bundles_enabled` | {true,false} | false | Whether to deploy STF from OLM bundles (TODO: compat with __local_build_enabled) |
| `__service_telemetry_bundle_image_path` | <image_path> | <none> | Image path to Service Telemetry Operator bundle |
| `__smart_gateway_bundle_image_path` | <image_path> | <none> | Image path to Smart Gateway Operator bundle |
| `__local_build_enabled` | {true,false} | true | Whether to deploy STF from local built artifacts. Also see `working_branch`, `sg_branch`, `sgo_branch` |
| `__deploy_from_bundles_enabled` | {true,false} | false | Whether to deploy STF from OLM bundles (TODO: compat with `__local_build_enabled`) |
| `__service_telemetry_bundle_image_path` | <image_path> | <none> | Image path to Service Telemetry Operator bundle |
| `__smart_gateway_bundle_image_path` | <image_path> | <none> | Image path to Smart Gateway Operator bundle |
| `prometheus_webhook_snmp_branch` | <git_branch> | master | Which Prometheus Webhook SNMP git branch to checkout |
| `sgo_branch` | <git_branch> | master | Which Smart Gateway Operator git branch to checkout |
| `sg_core_branch` | <git_branch> | master | Which Smart Gateway Core git branch to checkout |
@@ -41,7 +41,17 @@ choose to override:
| `__service_telemetry_storage_ephemeral_enabled` | {true,false} | false | Whether to enable ephemeral storage support in ServiceTelemetry |
| `__service_telemetry_storage_persistent_storage_class` | <storage_class> | <undefined> | Set a custom storageClass to override the default provided by OpenShift platform |
| `__service_telemetry_snmptraps_enabled` | {true,false} | true | Whether to enable snmptraps delivery via Alertmanager receiver (prometheus-webhook-snmp) |
| `__service_telemetry_observability_strategy` | <observability_strategy> | use_community | Which observability strategy to use for deployment. Default deployment is 'use_community'. Also supported is 'none' |
| `__service_telemetry_snmptraps_community` | <snmptrap_community> | `public` | Set the SNMP community to send traps to. Defaults to public |
| `__service_telemetry_snmptraps_target` | <snmptrap_target> | `192.168.24.254` | Set the SNMP target to send traps to. Defaults to 192.168.24.254 |
| `__service_telemetry_snmptraps_retries` | <snmptrap_retry_count> | 5 | Set the SNMP retry count for traps. Defaults to 5 |
| `__service_telemetry_snmptraps_port` | <snmptrap_port> | 162 | Set the SNMP target port for traps. Defaults to 162 |
| `__service_telemetry_snmptraps_timeout` | <snmptrap_timeout> | 1 | Set the SNMP retry timeout (in seconds). Defaults to 1 |
| `__service_telemetry_snmptraps_alert_oid_label` | <alert_label> | oid | The alert label name that holds the OID value. Defaults to oid |
| `__service_telemetry_snmptraps_trap_oid_prefix` | <oid_prefix> | 1.3.6.1.4.1.50495.15 | The OID prefix for trap variable bindings |
| `__service_telemetry_snmptraps_trap_default_oid` | <default_oid> | 1.3.6.1.4.1.50495.15.1.2.1 | The trap OID to use if none is found in the Prometheus alert labels |
| `__service_telemetry_snmptraps_trap_default_severity` | <default_severity> | `""` | The trap severity to use if none is found in the Prometheus alert labels. Defaults to an empty string |
| `__service_telemetry_logs_enabled` | {true,false} | false | Whether to enable logs support in ServiceTelemetry |
| `__service_telemetry_observability_strategy` | <observability_strategy> | `use_community` | Which observability strategy to use for deployment. Default deployment is 'use_community'. Also supported is 'none' |
| `__internal_registry_path` | <registry_path> | image-registry.openshift-image-registry.svc:5000 | Path to internal registry for image path |
| `__deploy_loki_enabled` | {true,false} | false | Whether to deploy loki-operator and other systems for logging development purposes |
| `__golang_image_path` | <image_path> | quay.io/infrawatch/golang:1.16 | Golang image path for building the loki-operator image |
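The new SNMP trap parameters can be overridden like any other `stf-run-ci` variable. The following is a minimal sketch, assuming a vars file named `snmp-overrides.yml` (a hypothetical name) and illustrative values; the variable names follow `build/stf-run-ci/defaults/main.yml` as changed in this PR.

```yaml
# snmp-overrides.yml (hypothetical file name)
# Variable names come from build/stf-run-ci/defaults/main.yml;
# every value below is an illustrative assumption, not a recommendation.
__service_telemetry_snmptraps_enabled: true
__service_telemetry_snmptraps_target: "10.0.0.10"     # example trap receiver address
__service_telemetry_snmptraps_community: "telemetry"  # example community string
__service_telemetry_snmptraps_retries: 3              # retry count
__service_telemetry_snmptraps_timeout: 2              # seconds
__service_telemetry_snmptraps_port: 162               # standard SNMP trap port
```

Such a file could be passed with Ansible's `-e @snmp-overrides.yml` option to whichever playbook applies the role. Variables left out keep the defaults listed in the table.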
10 changes: 10 additions & 0 deletions build/stf-run-ci/defaults/main.yml
@@ -14,6 +14,16 @@ __service_telemetry_high_availability_enabled: false
__service_telemetry_metrics_enabled: true
__service_telemetry_storage_ephemeral_enabled: false
__service_telemetry_snmptraps_enabled: true
__service_telemetry_snmptraps_target: "192.168.24.254"
__service_telemetry_snmptraps_community: "public"
__service_telemetry_snmptraps_retries: 5
__service_telemetry_snmptraps_timeout: 1
__service_telemetry_snmptraps_port: 162
__service_telemetry_snmptraps_alert_oid_label: "oid"
__service_telemetry_snmptraps_trap_oid_prefix: "1.3.6.1.4.1.50495.15"
__service_telemetry_snmptraps_trap_default_oid: "1.3.6.1.4.1.50495.15.1.2.1"
__service_telemetry_snmptraps_trap_default_severity: ""
__service_telemetry_logs_enabled: false
__service_telemetry_observability_strategy: use_community
__internal_registry_path: image-registry.openshift-image-registry.svc:5000
__service_telemetry_bundle_image_path:
9 changes: 9 additions & 0 deletions build/stf-run-ci/tasks/deploy_stf.yml
@@ -21,6 +21,15 @@
receivers:
snmpTraps:
enabled: {{ __service_telemetry_snmptraps_enabled }}
target: "{{ __service_telemetry_snmptraps_target }}"
community: "{{ __service_telemetry_snmptraps_community }}"
retries: {{ __service_telemetry_snmptraps_retries }}
port: {{ __service_telemetry_snmptraps_port }}
timeout: {{ __service_telemetry_snmptraps_timeout }}
alertOidLabel: "{{ __service_telemetry_snmptraps_alert_oid_label }}"
trapOidPrefix: "{{ __service_telemetry_snmptraps_trap_oid_prefix }}"
trapDefaultOid: "{{ __service_telemetry_snmptraps_trap_default_oid }}"
trapDefaultSeverity: "{{ __service_telemetry_snmptraps_trap_default_severity }}"
backends:
events:
elasticsearch:
26 changes: 25 additions & 1 deletion deploy/crds/infra.watch_servicetelemetrys_crd.yaml
@@ -56,8 +56,32 @@ spec:
enabled:
description: Deploy container to send snmp traps
type: boolean
community:
description: 'Target community for SNMP traps. Default is "public"'
type: string
target:
description: Target address for SNMP traps to send to
description: 'Target address for SNMP traps to send to.'
type: string
retries:
description: 'SNMP trap delivery retry limit. Default is 5'
type: integer
timeout:
description: 'Response timeout, in seconds. Default is 1'
type: integer
port:
description: 'SNMP trap delivery port. Default is 162'
type: integer
alertOidLabel:
description: 'Label for finding the OID. Default is "oid"'
type: string
trapOidPrefix:
description: 'OID prefix for the trap variable bindings. Default is "1.3.6.1.4.1.50495.15"'
type: string
trapDefaultOid:
description: 'The trap OID if none is found in the Prometheus alert labels. Default is "1.3.6.1.4.1.50495.15.1.2.1"'
type: string
trapDefaultSeverity:
description: 'The trap severity if none is found in the Prometheus alert labels. Default is empty.'
type: string
type: object
type: object
8 changes: 8 additions & 0 deletions deploy/crds/infra.watch_v1beta1_servicetelemetry_cr.yaml
@@ -10,7 +10,15 @@ spec:
receivers:
snmpTraps:
enabled: false
community: public
target: 192.168.24.254
retries: 5
port: 162
timeout: 1
alertOidLabel: oid
trapOidPrefix: "1.3.6.1.4.1.50495.15"
trapDefaultOid: "1.3.6.1.4.1.50495.15.1.2.1"
trapDefaultSeverity: ""
storage:
strategy: persistent
persistent:
@@ -46,12 +46,44 @@ spec:
properties:
snmpTraps:
properties:
alertOidLabel:
description: Label for finding the OID. Default is
"oid"
type: string
community:
description: Target community for SNMP traps. Default
is "public"
type: string
enabled:
description: Deploy container to send snmp traps
type: boolean
port:
description: SNMP trap delivery port. Default is
162
type: integer
retries:
description: SNMP trap delivery retry limit. Default
is 5
type: integer
target:
description: Target address for SNMP traps to send
to
to.
type: string
timeout:
description: Response timeout, in seconds. Default
is 1
type: integer
trapDefaultOid:
description: The trap OID if none is found in the
Prometheus alert labels. Default is "1.3.6.1.4.1.50495.15.1.2.1"
type: string
trapDefaultSeverity:
description: The trap severity if none is found in
the Prometheus alert labels. Default is empty.
type: string
trapOidPrefix:
description: OID prefix for the trap variable bindings.
Default is "1.3.6.1.4.1.50495.15"
type: string
type: object
type: object
@@ -15,8 +15,16 @@ metadata:
"alertmanager": {
"receivers": {
"snmpTraps": {
"alertOidLabel": "oid",
"community": "public",
"enabled": false,
"target": "192.168.24.254"
"port": 162,
"retries": 5,
"target": "192.168.24.254",
"timeout": 1,
"trapDefaultOid": "1.3.6.1.4.1.50495.15.1.2.1",
"trapDefaultSeverity": "",
"trapOidPrefix": "1.3.6.1.4.1.50495.15"
}
},
"storage": {
8 changes: 8 additions & 0 deletions roles/servicetelemetry/defaults/main.yml
@@ -24,7 +24,15 @@ servicetelemetry_defaults:
receivers:
snmp_traps:
enabled: false
community: public
target: 192.168.24.254
retries: 5
timeout: 1
port: 162
alert_oid_label: "oid"
trap_oid_prefix: "1.3.6.1.4.1.50495.15"
trap_default_oid: "1.3.6.1.4.1.50495.15.1.2.1"
trap_default_severity: ""

backends:
metrics:
16 changes: 13 additions & 3 deletions roles/servicetelemetry/templates/manifest_snmp_traps.j2
@@ -20,10 +20,20 @@ spec:
- containerPort: 9099
env:
- name: SNMP_COMMUNITY
value: public
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.community }}"
- name: SNMP_RETRIES
value: "1"
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.retries }}"
- name: SNMP_HOST
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.target }}"
- name: SNMP_PORT
value: "162"
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.port }}"
- name: SNMP_TIMEOUT
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.timeout }}"
- name: ALERT_OID_LABEL
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.alert_oid_label }}"
- name: TRAP_OID_PREFIX
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_oid_prefix }}"
- name: TRAP_DEFAULT_OID
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_default_oid }}"
- name: TRAP_DEFAULT_SEVERITY
value: "{{ servicetelemetry_vars.alerting.alertmanager.receivers.snmp_traps.trap_default_severity }}"
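
Taken together, the CRD fields added in this PR could be set on a ServiceTelemetry object roughly as follows. This is a sketch under stated assumptions: the `metadata` values are placeholders, the `apiVersion` is inferred from the sample CR file name, and the `alerting` parent key is inferred from the variable path used in `manifest_snmp_traps.j2`. Field names and values mirror the sample CR and the documented defaults.

```yaml
# Hypothetical ServiceTelemetry object; metadata values are placeholders.
apiVersion: infra.watch/v1beta1      # inferred from infra.watch_v1beta1_servicetelemetry_cr.yaml
kind: ServiceTelemetry
metadata:
  name: default                      # assumed name
  namespace: service-telemetry       # assumed namespace
spec:
  alerting:                          # parent key inferred from the role's variable path
    alertmanager:
      receivers:
        snmpTraps:
          enabled: true              # the sample CR ships with false
          community: public
          target: 192.168.24.254
          retries: 5
          port: 162
          timeout: 1                 # seconds
          alertOidLabel: oid
          trapOidPrefix: "1.3.6.1.4.1.50495.15"
          trapDefaultOid: "1.3.6.1.4.1.50495.15.1.2.1"
          trapDefaultSeverity: ""
```

Each `snmpTraps` field maps to one environment variable in the template above, for example `retries` to `SNMP_RETRIES` and `trapOidPrefix` to `TRAP_OID_PREFIX`.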