Commit
Add ClusterChecks deployment template (helm#12582)
* Add ClusterChecks deployment template

Add the possibility to run dedicated agent(s) for handling the
Cluster Checks process. It allows having dedicated `resources`
request/limit for the Cluster Checks.

Signed-off-by: cedric lamoriniere <cedric.lamoriniere@datadoghq.com>

* Some small update

Signed-off-by: cedric lamoriniere <cedric.lamoriniere@datadoghq.com>
clamoriniere authored and goshlanguage committed May 17, 2019
1 parent 16cd2b0 commit 4c7c030
Showing 6 changed files with 171 additions and 5 deletions.
2 changes: 1 addition & 1 deletion stable/datadog/Chart.yaml
@@ -1,5 +1,5 @@
name: datadog
version: 1.24.0
version: 1.25.0
appVersion: 6.10.1
description: DataDog Agent
keywords:
13 changes: 11 additions & 2 deletions stable/datadog/README.md
@@ -281,12 +281,21 @@ helm install --name <RELEASE_NAME> \
| `clusterAgent.image.pullPolicy` | Image pull policy | `IfNotPresent` |
| `clusterAgent.image.pullSecrets` | Image pull secrets | `nil` |
| `clusterAgent.metricsProvider.enabled` | Enable Datadog metrics as a source for HPA scaling | `false` |
| `clusterAgent.clusterChecks.enabled` | Enable Cluster Checks on both the Cluster Agent and the Agent daemonset | `false` |
| `clusterAgent.confd` | Additional check configurations (static and Autodiscovery) | `nil` |
| `clusterAgent.clusterChecks.enabled` | Enable Cluster Checks on both the Cluster Agent and the Agent daemonset | `false` |
| `clusterAgent.confd` | Additional check configurations (static and Autodiscovery) | `nil` |
| `clusterAgent.resources.requests.cpu` | CPU resource requests | `200m` |
| `clusterAgent.resources.limits.cpu` | CPU resource limits | `200m` |
| `clusterAgent.resources.requests.memory` | Memory resource requests | `256Mi` |
| `clusterAgent.resources.limits.memory` | Memory resource limits | `256Mi` |
| `clusterAgent.tolerations` | List of node taints to tolerate | `[]` |
| `clusterAgent.livenessProbe` | Overrides the default liveness probe | http port 443 if external metrics enabled |
| `clusterAgent.readinessProbe` | Overrides the default readiness probe | http port 443 if external metrics enabled |
| `clusterchecksDeployment.enabled` | Deploy Datadog agents dedicated to running Cluster Checks, so that the Cluster Checks agent pods can have their own resource requests and limits. Requires `clusterAgent.clusterChecks.enabled`. | `false` |
| `clusterchecksDeployment.env` | Additional Datadog environment variables for Cluster Checks Deployment | `nil` |
| `clusterchecksDeployment.resources.requests.cpu` | CPU resource requests | `200m` |
| `clusterchecksDeployment.resources.limits.cpu` | CPU resource limits | `200m` |
| `clusterchecksDeployment.resources.requests.memory` | Memory resource requests | `500Mi` |
| `clusterchecksDeployment.resources.limits.memory` | Memory resource limits | `500Mi` |
| `clusterchecksDeployment.nodeSelector` | Node selectors | `nil` |
| `clusterchecksDeployment.affinity` | Node affinities | avoid running pods on the same node |
| `clusterchecksDeployment.livenessProbe` | Overrides the default liveness probe | http port 5555 |
1 change: 1 addition & 0 deletions stable/datadog/templates/_helpers.tpl
@@ -9,6 +9,7 @@ Expand the name of the chart.
{{/*
Create a default fully qualified app name.
We truncate at 63 chars because some Kubernetes name fields are limited to this (by the DNS naming spec).
Depending on the resource, a suffix is appended to the name.
If release name contains chart name it will be used as a full name.
*/}}
{{- define "datadog.fullname" -}}
95 changes: 95 additions & 0 deletions stable/datadog/templates/clusterchecks-deployment.yaml
@@ -0,0 +1,95 @@
{{- if and .Values.clusterAgent.clusterChecks.enabled .Values.clusterchecksDeployment.enabled }}
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: {{ template "datadog.fullname" . }}-clusterchecks
  labels:
    app: "{{ template "datadog.fullname" . }}"
    chart: "{{ .Chart.Name }}-{{ .Chart.Version | replace "+" "_" }}"
    release: {{ .Release.Name | quote }}
    heritage: {{ .Release.Service | quote }}
spec:
  replicas: {{ .Values.clusterchecksDeployment.replicas }}
  template:
    metadata:
      labels:
        app: {{ template "datadog.fullname" . }}-clusterchecks
      name: {{ template "datadog.fullname" . }}-clusterchecks
    spec:
      serviceAccountName: {{ if .Values.rbac.create }}{{ template "datadog.fullname" . }}{{ else }}"{{ .Values.rbac.serviceAccountName }}"{{ end }}
      containers:
      - name: {{ default .Chart.Name .Values.datadog.name }}
        image: "{{ .Values.image.repository }}:{{ .Values.image.tag }}"
        imagePullPolicy: {{ .Values.image.pullPolicy }}
        env:
          - name: DD_API_KEY
            valueFrom:
              secretKeyRef:
                name: {{ template "datadog.apiSecretName" . }}
                key: api-key
          - name: DD_EXTRA_CONFIG_PROVIDERS
            value: "clusterchecks"
          - {name: DD_HEALTH_PORT, value: "5555"}
          # Cluster checks
          - name: DD_CLUSTER_AGENT_KUBERNETES_SERVICE_NAME
            value: {{ template "datadog.fullname" . }}-cluster-agent
          - name: DD_CLUSTER_AGENT_AUTH_TOKEN
            valueFrom:
              secretKeyRef:
                name: {{ template "datadog.fullname" . }}-cluster-agent
                key: token
          - name: DD_CLUSTER_AGENT_ENABLED
            value: {{ .Values.clusterAgent.enabled | quote }}
          - {name: DD_EXTRA_CONFIG_PROVIDERS, value: "clusterchecks"}
          # Remove unused features
          - {name: DD_APM_ENABLED, value: "false"}
          - {name: DD_PROCESS_AGENT_ENABLED, value: "false"}
          - {name: DD_LOGS_ENABLED, value: "false"}
          # Safely run alongside the daemonset
          - {name: DD_ENABLE_METADATA_COLLECTION, value: "false"}
          - name: DD_HOSTNAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
{{- if .Values.clusterchecksDeployment.env }}
{{ toYaml .Values.clusterchecksDeployment.env | indent 10 }}
{{- end }}
        resources:
{{ toYaml .Values.clusterchecksDeployment.resources | indent 12 }}
        volumeMounts:
          - {name: s6-run, mountPath: /var/run/s6}
          - {name: remove-corechecks, mountPath: /etc/datadog-agent/conf.d}
{{- if .Values.clusterchecksDeployment.livenessProbe }}
        livenessProbe:
{{ toYaml .Values.clusterchecksDeployment.livenessProbe | indent 10 }}
{{- else }}
        livenessProbe:
          httpGet:
            path: /health
            port: 5555
          initialDelaySeconds: 15
          periodSeconds: 15
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 6
{{- end }}
      volumes:
        - {name: s6-run, emptyDir: {}}
        - {name: remove-corechecks, emptyDir: {}}
      affinity:
{{- if .Values.clusterchecksDeployment.affinity }}
{{ toYaml .Values.clusterchecksDeployment.affinity | indent 8 }}
{{- else }}
        # Ensure we only run one worker per node, to avoid name collisions
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchLabels:
                app: {{ template "datadog.fullname" . }}-clusterchecks
            topologyKey: kubernetes.io/hostname
{{- end }}
{{- if .Values.clusterchecksDeployment.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.clusterchecksDeployment.nodeSelector | indent 8 }}
{{- end }}
{{ end }}
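
For illustration, this is roughly what the default `affinity` branch of the template above renders to when `clusterchecksDeployment.affinity` is unset. It is a sketch, not output captured from `helm template`: the name `my-release-datadog` is a hypothetical value for `datadog.fullname` and depends on the release name.

```yaml
# Sketch of the rendered pod spec fragment (default branch only).
# "my-release-datadog" is an assumed fullname, not taken from the commit.
affinity:
  podAntiAffinity:
    # Hard rule: two clusterchecks pods are never scheduled on the same node,
    # because DD_HOSTNAME is set from spec.nodeName and would otherwise collide.
    requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app: my-release-datadog-clusterchecks
      topologyKey: kubernetes.io/hostname
```

Because the rule is a hard requirement, a replica stays `Pending` when there are fewer schedulable nodes than `clusterchecksDeployment.replicas`.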
2 changes: 1 addition & 1 deletion stable/datadog/templates/daemonset.yaml
@@ -176,7 +176,7 @@ spec:
- name: DD_DOGSTATSD_SOCKET
value: "/var/run/datadog/dsd.socket"
{{- end }}
{{- if .Values.clusterAgent.clusterChecks.enabled }}
{{- if and .Values.clusterAgent.clusterChecks.enabled (not .Values.clusterchecksDeployment.enabled) }}
- name: DD_EXTRA_CONFIG_PROVIDERS
value: "clusterchecks"
{{- end }}
63 changes: 62 additions & 1 deletion stable/datadog/values.yaml
@@ -438,7 +438,7 @@ daemonset:
# nodeSelector: {}

## @param affinity - object - optional
## Allow the DaemonSet to schedule ussing affinity rules
## Allow the DaemonSet to schedule using affinity rules
## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
#
# affinity: {}
@@ -499,3 +499,64 @@ deployment:
  ## Sets PriorityClassName if defined.
  #
  # priorityClassName:

clusterchecksDeployment:

  ## @param enabled - boolean - required
  ## If true, deploys agents dedicated to running the Cluster Checks instead of running them in the DaemonSet's agents.
  ## ref: https://docs.datadoghq.com/agent/autodiscovery/clusterchecks/
  #
  enabled: false

  ## @param replicas - integer - required
  ## To deploy the clusterchecks agent in HA, keep clusterchecksDeployment.replicas set to at least 2.
  ## Increase clusterchecksDeployment.replicas according to the number of Cluster Checks.
  #
  replicas: 2

  ## @param resources - object - required
  ## Datadog clusterchecks-agent resource requests and limits.
  #
  resources:
    requests:
      cpu: 200m
      memory: 500Mi
    limits:
      cpu: 200m
      memory: 500Mi

  ## @param affinity - object - optional
  ## Allow the ClusterChecks Deployment to schedule using affinity rules.
  ## By default, ClusterChecks Deployment Pods are forced to run on different Nodes.
  ## Ref: https://kubernetes.io/docs/concepts/configuration/assign-pod-node/#affinity-and-anti-affinity
  #
  # affinity:

  ## @param nodeSelector - object - optional
  ## Allow the ClusterChecks Deployment to schedule on selected nodes
  ## Ref: https://kubernetes.io/docs/user-guide/node-selection/
  #
  # nodeSelector: {}

  ## @param tolerations - array - required
  ## Tolerations for pod assignment
  ## Ref: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
  #
  tolerations: []

  ## @param livenessProbe - object - optional
  ## Override the agent's default liveness probe logic.
  ## In case of issues with the probe, you can disable it with the
  ## following values to make investigation easier:
  #
  # livenessProbe:
  #   exec:
  #     command: ["/bin/true"]

  ## @param env - list of object - optional
  ## The dd-agent supports many environment variables
  ## ref: https://github.com/DataDog/datadog-agent/tree/master/Dockerfiles/agent#environment-variables
  #
  # env:
  #   - name: <ENV_VAR_NAME>
  #     value: <ENV_VAR_VALUE>
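
As a usage sketch, a values override along these lines should turn the dedicated deployment on. The template is gated on both `clusterAgent.clusterChecks.enabled` and `clusterchecksDeployment.enabled`, so both are set here. The file name `my-values.yaml` is hypothetical, and the sketch assumes the Cluster Agent's other prerequisites (such as its auth token secret) are already configured.

```yaml
# my-values.yaml (hypothetical file name)
clusterAgent:
  enabled: true            # the clusterchecks agents talk to the Cluster Agent
  clusterChecks:
    enabled: true          # first half of the template's `and` condition

clusterchecksDeployment:
  enabled: true            # second half; also stops the DaemonSet agents from
                           # receiving DD_EXTRA_CONFIG_PROVIDERS=clusterchecks
  replicas: 2              # default anti-affinity needs at least 2 schedulable nodes
  resources:               # same as the chart defaults, repeated to show the override point
    requests:
      cpu: 200m
      memory: 500Mi
    limits:
      cpu: 200m
      memory: 500Mi
```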
