Datadog Integration (#3407)
* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation, deployment override failsafes

* datadog-integration: updated consul-server agent telemetry-config.json with dd specific items as well as additional missing VM based options, unit tests, dd unix socket integration, dd agent acl token generation | final initial-push

* changelog entry update

* datadog-integration: updated consul-server agent server.config (enable_debug) and telemetry.config update | enable_debug to server.config

* curt pr review changes (minus extraConfig templating verification changes)

* global.metrics.AgentMetrics -> global.metrics.enableAgentMetrics

* dogstatsd and otlp mutually exclusive verification checks

* breaking changes now incorporated into consul.validateExtraConfig helper template function as precheck

* extraConfig hash updates post merge conflict update

* fix helpers.tpl consul.extraConfig from merge --> /consul/tmp/extra-config/extra-from-values.json | add labels to rolebinding for datadog secrets

* update changelog .txt to match new PR number

* updated server-statefulset.yaml to correct ad.datadoghq.com/consul.logs annotation to valid single quote string

* update UDP dogstatsdPort behavior to exclude including a port value if using a kube service address (as determined by user overrides)

* update _helpers.tpl consul.ValidateDatadogConfiguration func to account for using 'https' as protocol => should fail

* update server-statefulset.yaml to exclude prometheus.io annotations if enabling datadog openmetrics method for consul server metrics scrape. conflict present with http vs https that breaks openmetrics scrape on consul

* correct otlp protocol helpers.tpl check to lower-case the protocol to match the open-telemetry-deployment.yaml behavior

* fix server-acl-init command_test.go for datadog token policy - datacenter should have been dc1

* add in server-statefulset bats test for extraConfig validation testing
natemollica-nm authored Feb 12, 2024
1 parent 1501856 commit 997f2e8
Showing 17 changed files with 1,309 additions and 15 deletions.
13 changes: 13 additions & 0 deletions .changelog/3407.txt
@@ -0,0 +1,13 @@
```release-note:feature
helm: introduces `global.metrics.datadog` overrides to streamline consul-k8s datadog integration.
helm: introduces `server.enableAgentDebug` to expose agent [`enable_debug`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#enable_debug) configuration.
helm: introduces `global.metrics.disableAgentHostName` to expose agent [`telemetry.disable_hostname`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-disable_hostname) configuration.
helm: introduces `global.metrics.enableHostMetrics` to expose agent [`telemetry.enable_host_metrics`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-enable_host_metrics) configuration.
helm: introduces `global.metrics.prefixFilter` to expose agent [`telemetry.prefix_filter`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-prefix_filter) configuration.
helm: introduces `global.metrics.datadog.dogstatsd.dogstatsdAddr` to expose agent [`telemetry.dogstatsd_addr`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-dogstatsd_addr) configuration.
helm: introduces `global.metrics.datadog.dogstatsd.dogstatsdTags` to expose agent [`telemetry.dogstatsd_tags`](https://developer.hashicorp.com/consul/docs/agent/config/config-files#telemetry-dogstatsd_tags) configuration.
helm: introduces required `ad.datadoghq.com/` annotations and `tags.datadoghq.com/` labels for integration with [Datadog Autodiscovery](https://docs.datadoghq.com/integrations/consul/?tab=containerized) and [Datadog Unified Service Tagging](https://docs.datadoghq.com/getting_started/tagging/unified_service_tagging/?tab=kubernetes#serverless-environment) for Consul.
helm: introduces automated unix domain socket hostPath mounting for containerized integration with datadog within consul-server statefulset.
helm: introduces `global.metrics.datadog.otlp` override options to allow OTLP metrics forwarding to Datadog Agent.
control-plane: adds `server-acl-init` datadog agent token creation for datadog integration.
```
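As a rough illustration of how the new overrides fit together, a values file enabling the DogStatsD path over a Unix domain socket might look like the following. The key names come from the changelog entries and helper templates in this commit; the socket path is the example used in the helper comments, and the tag strings are purely illustrative. This is a sketch, and it assumes chart defaults cover every key not shown.

```yaml
# Sketch only: values are illustrative, not recommended settings.
global:
  metrics:
    enabled: true              # required by consul.validateDatadogConfiguration
    enableAgentMetrics: true   # required by consul.validateDatadogConfiguration
    datadog:
      enabled: true
      dogstatsd:
        enabled: true
        socketTransportType: "UDS"
        dogstatsdAddr: "/var/run/datadog/dsd.socket"
        dogstatsdTags:         # hypothetical tags, rendered with toJson
          - "source:consul"
          - "env:dev"
```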
171 changes: 169 additions & 2 deletions charts/consul/templates/_helpers.tpl
@@ -151,6 +151,29 @@ is passed to consul as a -config-file param on command line.
[ -n "${HOSTNAME}" ] && sed -Ei "s|HOSTNAME|${HOSTNAME?}|g" /consul/extra-config/extra-from-values.json
{{- end -}}

{{/*
Cleanup server.extraConfig entries to avoid conflicting entries:
- server.enableAgentDebug:
- `enable_debug` should not exist in extraConfig
- metrics.disableAgentHostName:
- if global.metrics.enabled and global.metrics.enableAgentMetrics are enabled, `disable_hostname` should not exist in extraConfig
- metrics.enableHostMetrics:
- if global.metrics.enabled and global.metrics.enableAgentMetrics are enabled, `enable_host_metrics` should not exist in extraConfig
- metrics.prefixFilter
- if global.metrics.enabled and global.metrics.enableAgentMetrics are enabled, `prefix_filter` should not exist in extraConfig
- metrics.datadog.enabled:
- if global.metrics.datadog.enabled and global.metrics.datadog.dogstatsd.enabled, `dogstatsd_tags` and `dogstatsd_addr` should not exist in extraConfig
Usage: {{ template "consul.validateExtraConfig" . }}
*/}}
{{- define "consul.validateExtraConfig" -}}
{{- if (contains "enable_debug" .Values.server.extraConfig) }}{{ fail "The enable_debug key is present in extra-from-values.json. Use server.enableAgentDebug to set this value." }}{{- end }}
{{- if (contains "disable_hostname" .Values.server.extraConfig) }}{{ fail "The disable_hostname key is present in extra-from-values.json. Use global.metrics.disableAgentHostName to set this value." }}{{- end }}
{{- if (contains "enable_host_metrics" .Values.server.extraConfig) }}{{ fail "The enable_host_metrics key is present in extra-from-values.json. Use global.metrics.enableHostMetrics to set this value." }}{{- end }}
{{- if (contains "prefix_filter" .Values.server.extraConfig) }}{{ fail "The prefix_filter key is present in extra-from-values.json. Use global.metrics.prefixFilter to set this value." }}{{- end }}
{{- if (and .Values.global.metrics.enabled .Values.global.metrics.enableAgentMetrics) }}{{- if (and .Values.global.metrics.datadog.dogstatsd.enabled) }}{{- if (contains "dogstatsd_tags" .Values.server.extraConfig) }}{{ fail "The dogstatsd_tags key is present in extra-from-values.json. Use global.metrics.datadog.dogstatsd.dogstatsdTags to set this value." }}{{- end }}{{- end }}{{- if (and .Values.global.metrics.datadog.dogstatsd.enabled) }}{{- if (contains "dogstatsd_addr" .Values.server.extraConfig) }}{{ fail "The dogstatsd_addr key is present in extra-from-values.json. Use global.metrics.datadog.dogstatsd.dogstatsdAddr to set this value." }}{{- end }}{{- end }}{{- end }}
{{- end -}}
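
To make the precheck concrete, here is a hedged sketch of a values fragment that `consul.validateExtraConfig` would reject, with the replacement shown in comments. Because the helper uses a plain substring match (`contains`) on `server.extraConfig`, any occurrence of the key name in that string triggers the failure.

```yaml
server:
  # Rejected: enable_debug appears in extraConfig, so rendering fails.
  extraConfig: |
    {"enable_debug": true}
  # Accepted alternative: remove the key above and set the dedicated value.
  # enableAgentDebug: true
```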

{{/*
Create chart name and version as used by the chart label.
*/}}
@@ -428,10 +451,10 @@ Usage: {{ template "consul.validateTelemetryCollectorCloud" . }}
*/}}
{{- define "consul.validateTelemetryCollectorCloud" -}}
{{- if (and .Values.telemetryCollector.cloud.clientId.secretName (and (not .Values.global.cloud.clientSecret.secretName) (not .Values.telemetryCollector.cloud.clientSecret.secretName))) }}
{{fail "When telemetryCollector.cloud.clientId.secretName is set, telemetryCollector.cloud.clientSecret.secretName must also be set."}}
{{fail "When telemetryCollector.cloud.clientId.secretName is set, telemetryCollector.cloud.clientSecret.secretName must also be set." }}
{{- end }}
{{- if (and .Values.telemetryCollector.cloud.clientSecret.secretName (and (not .Values.global.cloud.clientId.secretName) (not .Values.telemetryCollector.cloud.clientId.secretName))) }}
{{fail "When telemetryCollector.cloud.clientSecret.secretName is set, telemetryCollector.cloud.clientId.secretName must also be set."}}
{{fail "When telemetryCollector.cloud.clientSecret.secretName is set, telemetryCollector.cloud.clientId.secretName must also be set." }}
{{- end }}
{{- end }}
@@ -519,3 +542,147 @@ Usage: {{ template "consul.validateResourceAPIs" . }}
{{fail "When the value global.experiments.resourceAPIs is set, apiGateway.enabled is currently unsupported."}}
{{- end }}
{{- end }}

{{/*
Validation for Consul Metrics configuration:
Fail if metrics.enabled=true and metrics.disableAgentHostName=true, but metrics.enableAgentMetrics=false
- metrics.enabled = true
- metrics.enableAgentMetrics = false
- metrics.disableAgentHostName = true
Fail if metrics.enableAgentMetrics=true and metrics.disableAgentHostName=true, but metrics.enabled=false
- metrics.enabled = false
- metrics.enableAgentMetrics = true
- metrics.disableAgentHostName = true
Fail if metrics.enabled=true and metrics.enableHostMetrics=true, but metrics.enableAgentMetrics=false
- metrics.enabled = true
- metrics.enableAgentMetrics = false
- metrics.enableHostMetrics = true
Fail if metrics.enableAgentMetrics=true and metrics.enableHostMetrics=true, but metrics.enabled=false
- metrics.enabled = false
- metrics.enableAgentMetrics = true
- metrics.enableHostMetrics = true
Usage: {{ template "consul.validateMetricsConfig" . }}

*/}}

{{- define "consul.validateMetricsConfig" -}}
{{- if and (not .Values.global.metrics.enableAgentMetrics) (and .Values.global.metrics.disableAgentHostName .Values.global.metrics.enabled )}}
{{fail "When enabling metrics (global.metrics.enabled) and disabling hostname emission from metrics (global.metrics.disableAgentHostName), global.metrics.enableAgentMetrics must be set to true"}}
{{- end }}
{{- if and (not .Values.global.metrics.enabled) (and .Values.global.metrics.disableAgentHostName .Values.global.metrics.enableAgentMetrics )}}
{{fail "When enabling Consul agent metrics (global.metrics.enableAgentMetrics) and disabling hostname emission from metrics (global.metrics.disableAgentHostName), global metrics enablement (global.metrics.enabled) must be set to true"}}
{{- end }}
{{- if and (not .Values.global.metrics.enableAgentMetrics) (and .Values.global.metrics.enableHostMetrics .Values.global.metrics.enabled )}}
{{fail "When enabling host metrics (global.metrics.enableHostMetrics) and enabling global metrics (global.metrics.enabled), Consul agent metrics must be enabled (global.metrics.enableAgentMetrics=true)"}}
{{- end }}
{{- if and (not .Values.global.metrics.enabled) (and .Values.global.metrics.enableHostMetrics .Values.global.metrics.enableAgentMetrics)}}
{{fail "When enabling Consul agent metrics (global.metrics.enableAgentMetrics) and enabling host metrics (global.metrics.enableHostMetrics), global metrics must be enabled (global.metrics.enabled)."}}
{{- end }}
{{- end -}}
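
For illustration, the first failure case listed in the comment for `consul.validateMetricsConfig` corresponds to a values fragment along these lines. This is a sketch of a rejected combination, not a recommended configuration.

```yaml
global:
  metrics:
    enabled: true
    enableAgentMetrics: false    # must be true when disableAgentHostName is set
    disableAgentHostName: true   # this combination fails validation
```

Setting `enableAgentMetrics: true` (with `enabled: true`) is the combination the helper accepts.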

{{/*
Validation for Consul Datadog Integration deployment:
Fail if Datadog integration enabled and Consul server agent telemetry is not enabled.
- global.metrics.datadog.enabled=true
- global.metrics.enableAgentMetrics=false || global.metrics.enabled=false
Fail if Consul OpenMetrics (Prometheus) and DogStatsD metrics are both enabled and configured.
- global.metrics.datadog.dogstatsd.enabled (scrapes `/v1/agent/metrics?format=prometheus` via the `use_prometheus_endpoint` option)
- global.metrics.datadog.openMetricsPrometheus.enabled (scrapes `/v1/agent/metrics?format=prometheus`)
- see https://docs.datadoghq.com/integrations/consul/?tab=host#host for recommendation to not have both
Fail if Datadog OTLP forwarding is enabled and Consul Telemetry Collection is not enabled.
- global.metrics.datadog.otlp.enabled=true
- telemetryCollector.enabled=false
Fail if Consul Open Telemetry collector forwarding protocol is not one of either "http" or "grpc"
- global.metrics.datadog.otlp.protocol!="http" && global.metrics.datadog.otlp.protocol!="grpc"
Usage: {{ template "consul.validateDatadogConfiguration" . }}

*/}}

{{- define "consul.validateDatadogConfiguration" -}}
{{- if and .Values.global.metrics.datadog.enabled (or (not .Values.global.metrics.enableAgentMetrics) (not .Values.global.metrics.enabled) )}}
{{fail "When enabling datadog metrics collection, the /v1/agent/metrics endpoint must be accessible; therefore global.metrics.enableAgentMetrics and global.metrics.enabled must also be enabled."}}
{{- end }}
{{- if and .Values.global.metrics.datadog.dogstatsd.enabled .Values.global.metrics.datadog.openMetricsPrometheus.enabled }}
{{fail "Enable only one of DogStatsD (global.metrics.datadog.dogstatsd.enabled) or OpenMetrics (global.metrics.datadog.openMetricsPrometheus.enabled); enabling both is an unsupported configuration." }}
{{- end }}
{{- if and .Values.global.metrics.datadog.otlp.enabled (not .Values.telemetryCollector.enabled) }}
{{fail "Cannot enable Datadog OTLP metrics collection (global.metrics.datadog.otlp.enabled) without consul-telemetry-collector. Ensure Consul OTLP collection is enabled (telemetryCollector.enabled) and configured." }}
{{- end }}
{{- if and (ne ( lower .Values.global.metrics.datadog.otlp.protocol) "http") (ne ( lower .Values.global.metrics.datadog.otlp.protocol) "grpc") }}
{{fail "global.metrics.datadog.otlp.protocol must be either \"http\" or \"grpc\"." }}
{{- end }}
{{- end -}}
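
A hedged sketch of values that should satisfy the OTLP-related checks in `consul.validateDatadogConfiguration` follows. It assumes `dogstatsd.enabled` and `openMetricsPrometheus.enabled` are not both true elsewhere in the values; all key names are taken from the helper above.

```yaml
global:
  metrics:
    enabled: true
    enableAgentMetrics: true
    datadog:
      enabled: true
      otlp:
        enabled: true
        protocol: "http"   # lower-cased before comparison; must be "http" or "grpc"
telemetryCollector:
  enabled: true            # required whenever otlp.enabled is true
```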

{{/*
Sets the dogstatsd_addr field of the agent configuration dependent on the
socket transport type being used:
- "UDS" (Unix Domain Socket): prefixes "unix://" to URL and appends path to socket (e.g., unix:///var/run/datadog/dsd.socket)
- "UDP" (User Datagram Protocol): adds no prefix and appends the dogstatsd port number to the hostname/IP (e.g., 172.20.180.10:8125); the port is omitted when dogstatsdPort is 0
- configured under global.metrics.datadog.dogstatsd
Usage: {{ template "consul.dogstatsdAaddressInfo" . }}
*/}}

{{- define "consul.dogstatsdAaddressInfo" -}}
{{- if (and .Values.global.metrics.datadog.enabled .Values.global.metrics.datadog.dogstatsd.enabled) }}
"dogstatsd_addr": "{{- if eq .Values.global.metrics.datadog.dogstatsd.socketTransportType "UDS" }}unix://{{ .Values.global.metrics.datadog.dogstatsd.dogstatsdAddr }}{{- else }}{{ .Values.global.metrics.datadog.dogstatsd.dogstatsdAddr | trimAll "\"" }}{{- if ne ( .Values.global.metrics.datadog.dogstatsd.dogstatsdPort | int ) 0 }}:{{ .Values.global.metrics.datadog.dogstatsd.dogstatsdPort | toString }}{{- end }}{{- end }}",{{- end }}
{{- end -}}
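
As a sketch of the UDP branch, the values below reuse the example address from the helper comment. The rendered line in the trailing comment follows from reading the template and has not been verified against a rendered chart.

```yaml
global:
  metrics:
    datadog:
      enabled: true
      dogstatsd:
        enabled: true
        socketTransportType: "UDP"
        dogstatsdAddr: "172.20.180.10"
        dogstatsdPort: 8125
# Per the helper above, this should render:
#   "dogstatsd_addr": "172.20.180.10:8125",
# With dogstatsdPort: 0 (for example, a kube service address) the ":<port>" suffix is omitted.
```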

{{/*
Configures the metrics prefixing that's required to either allow or disallow certain RPC or gRPC server calls:
Usage: {{ template "consul.prefixFilter" . }}
*/}}
{{- define "consul.prefixFilter" -}}
{{- $allowList := .Values.global.metrics.prefixFilter.allowList }}
{{- $blockList := .Values.global.metrics.prefixFilter.blockList }}
{{- if and (not (empty $allowList)) (not (empty $blockList)) }}
"prefix_filter": [{{- range $index, $value := concat $allowList $blockList -}}
"{{- if (has $value $allowList) }}{{ printf "+%s" ($value | trimAll "\"") }}{{- else }}{{ printf "-%s" ($value | trimAll "\"") }}{{- end }}"{{- if lt $index (sub (len (concat $allowList $blockList)) 1) -}},{{- end -}}
{{- end -}}],
{{- else if not (empty $allowList) }}
"prefix_filter": [{{- range $index, $value := $allowList -}}
"{{ printf "+%s" ($value | trimAll "\"") }}"{{- if lt $index (sub (len $allowList) 1) -}},{{- end -}}
{{- end -}}],
{{- else if not (empty $blockList) }}
"prefix_filter": [{{- range $index, $value := $blockList -}}
"{{ printf "-%s" ($value | trimAll "\"") }}"{{- if lt $index (sub (len $blockList) 1) -}},{{- end -}}
{{- end -}}],
{{- end }}
{{- end -}}
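
To show what `consul.prefixFilter` produces, here is a minimal sketch with one allowed and one blocked prefix. The metric prefixes are placeholders chosen for illustration, and the rendered shape in the comment follows from reading the template.

```yaml
global:
  metrics:
    enabled: true
    enableAgentMetrics: true
    prefixFilter:
      allowList: ["consul.rpc.server.call"]   # illustrative prefix
      blockList: ["consul.example.noisy"]     # hypothetical prefix
# Expected shape of the rendered entry:
#   "prefix_filter": ["+consul.rpc.server.call","-consul.example.noisy"],
```

Allowed entries are emitted first with a `+` prefix, followed by blocked entries with a `-` prefix.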
{{/*
Retrieves the global consul/consul-enterprise version string for use with labels or tags.
Requirements for valid labels:
- a valid label must be an empty string or consist of
=> alphanumeric characters
=> '-', '_' or '.'
=> must start and end with an alphanumeric character
(e.g. 'MyValue', or 'my_value', or '12345', regex used for validation is
'(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?')
Usage: {{ template "consul.versionInfo" . }}
*/}}
{{- define "consul.versionInfo" -}}
{{- $imageVersion := regexSplit ":" .Values.global.image -1 }}
{{- $versionInfo := printf "%s" (index $imageVersion 1 ) | trimSuffix "\"" }}
{{- $sanitizedVersion := "" }}
{{- $pattern := "^([A-Za-z0-9][-A-Za-z0-9_.]*[A-Za-z0-9])?$" }}
{{- if not (regexMatch $pattern $versionInfo) -}}
{{- $sanitizedVersion = regexReplaceAll "[^A-Za-z0-9-_.]|sha256" $versionInfo "" }}
{{- $sanitizedVersion = printf "%s" (trimSuffix "-" (trimPrefix "-" $sanitizedVersion)) -}}
{{- else }}
{{- $sanitizedVersion = $versionInfo }}
{{- end -}}
{{- printf "%s" $sanitizedVersion | quote }}
{{- end -}}
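
A small sketch of the input `consul.versionInfo` reads; the image tag is illustrative. The helper splits `global.image` on `:` and reads the second element, so it presumably expects an image reference that includes a tag.

```yaml
global:
  image: "hashicorp/consul:1.17.0"   # illustrative tag
# consul.versionInfo would yield the quoted string "1.17.0",
# which already satisfies the label pattern and is passed through unchanged.
```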
38 changes: 38 additions & 0 deletions charts/consul/templates/datadog-agent-role.yaml
@@ -0,0 +1,38 @@
{{- if .Values.global.metrics.datadog.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: {{ template "consul.fullname" . }}-datadog-metrics
  namespace: {{ .Release.Namespace }}
  labels:
    app: datadog
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
    component: agent
{{- if (or (and .Values.global.openshift.enabled .Values.server.exposeGossipAndRPCPorts) .Values.global.enablePodSecurityPolicies) }}
{{- if .Values.global.enablePodSecurityPolicies }}
rules:
  - apiGroups: ["policy"]
    resources: ["podsecuritypolicies"]
    resourceNames:
      - {{ template "consul.fullname" . }}-datadog-metrics
    verbs:
      - use
{{- end }}
{{- if (and .Values.global.openshift.enabled .Values.server.exposeGossipAndRPCPorts ) }}
  - apiGroups: ["security.openshift.io"]
    resources: ["securitycontextconstraints"]
    resourceNames:
      - {{ template "consul.fullname" . }}-datadog-metrics
    verbs:
      - use
{{- end }}
{{- else}}
rules:
  - apiGroups: [ "" ]
    resources: [ "secrets" ]
    resourceNames:
      - {{ .Release.Namespace }}-datadog-agent-metrics-acl-token
    verbs: [ "get", "watch", "list" ]
{{- end }}
{{- end }}
26 changes: 26 additions & 0 deletions charts/consul/templates/datadog-agent-rolebinding.yaml
@@ -0,0 +1,26 @@
{{- if .Values.global.metrics.datadog.enabled }}
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: {{ template "consul.fullname" . }}-datadog-metrics
  namespace: {{ .Release.Namespace }}
  labels:
    app: {{ template "consul.name" . }}
    chart: {{ template "consul.chart" . }}
    heritage: {{ .Release.Service }}
    release: {{ .Release.Name }}
    component: agent
subjects:
  - kind: ServiceAccount
    apiGroup: ""
    name: datadog-agent
    namespace: datadog
  - kind: ServiceAccount
    apiGroup: ""
    name: datadog-cluster-agent
    namespace: datadog
roleRef:
  kind: Role
  name: {{ template "consul.fullname" . }}-datadog-metrics
  apiGroup: ""
{{- end }}
4 changes: 4 additions & 0 deletions charts/consul/templates/server-acl-init-job.yaml
@@ -273,6 +273,10 @@ spec:
-create-enterprise-license-token=true \
{{- end }}
{{- if (and (not .Values.global.metrics.datadog.dogstatsd.enabled) .Values.global.metrics.datadog.enabled .Values.global.acls.manageSystemACLs) }}
-create-dd-agent-token=true \
{{- end }}
{{- if .Values.server.snapshotAgent.enabled }}
-snapshot-agent=true \
{{- end }}
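
Reading the condition in this hunk, `-create-dd-agent-token=true` is passed only when the Datadog integration is enabled, system ACLs are managed by the chart, and DogStatsD is not enabled. A hedged sketch of values matching that condition (the OpenMetrics choice is an assumption made for the example):

```yaml
global:
  acls:
    manageSystemACLs: true
  metrics:
    enabled: true
    enableAgentMetrics: true
    datadog:
      enabled: true
      dogstatsd:
        enabled: false     # the flag is skipped when DogStatsD is enabled
      openMetricsPrometheus:
        enabled: true      # assumed collection method for this example
```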
9 changes: 8 additions & 1 deletion charts/consul/templates/server-config-configmap.yaml
@@ -30,6 +30,7 @@ data:
{{- if .Values.server.logLevel }}
"log_level": "{{ .Values.server.logLevel | upper }}",
{{- end }}
"enable_debug": {{ .Values.server.enableAgentDebug }},
"domain": "{{ .Values.global.domain }}",
"limits": {
"request_limits": {
@@ -192,7 +193,13 @@ data:
telemetry-config.json: |-
{
"telemetry": {
"prometheus_retention_time": "{{ .Values.global.metrics.agentMetricsRetentionTime }}"
"prometheus_retention_time": "{{ .Values.global.metrics.agentMetricsRetentionTime }}",
"disable_hostname": {{ .Values.global.metrics.disableAgentHostName }},{{ template "consul.prefixFilter" . }}
"enable_host_metrics": {{ .Values.global.metrics.enableHostMetrics }}{{- if .Values.global.metrics.datadog.dogstatsd.enabled }},{{ template "consul.dogstatsdAaddressInfo" . }}
{{- if .Values.global.metrics.datadog.dogstatsd.enabled }}
"dogstatsd_tags": {{ .Values.global.metrics.datadog.dogstatsd.dogstatsdTags | toJson }}
{{- end }}
{{- end }}
}
}
{{- end }}