Merge pull request #228 from sighupio/feature-prometheus-agent
feat(monitoring): add remoteWrite to Prometheus and Prometheus Agent mode
nutellinoit authored Jul 31, 2024
2 parents 02c22fa + 94714bf commit 907c80a
Showing 28 changed files with 793 additions and 13,975 deletions.
1 change: 1 addition & 0 deletions defaults/ekscluster-kfd-v1alpha2.yaml
Original file line number Diff line number Diff line change
@@ -146,6 +146,7 @@ data:
retentionTime: 30d
retentionSize: 120GB
storageSize: 150Gi
prometheusAgent: {}
alertmanager:
installDefaultRules: true
deadManSwitchWebhookUrl: ""
1 change: 1 addition & 0 deletions defaults/kfddistribution-kfd-v1alpha2.yaml
@@ -139,6 +139,7 @@ data:
retentionTime: 30d
retentionSize: 120GB
storageSize: 150Gi
prometheusAgent: {}
alertmanager:
installDefaultRules: true
deadManSwitchWebhookUrl: ""
1 change: 1 addition & 0 deletions defaults/onpremises-kfd-v1alpha2.yaml
@@ -139,6 +139,7 @@ data:
retentionTime: 30d
retentionSize: 120GB
storageSize: 150Gi
prometheusAgent: {}
alertmanager:
installDefaultRules: true
deadManSwitchWebhookUrl: ""
118 changes: 109 additions & 9 deletions docs/schemas/ekscluster-kfd-v1alpha2.md
@@ -2658,9 +2658,14 @@ The type of the logging, must be ***none***, ***opensearch*** or ***loki***
| [minio](#specdistributionmodulesmonitoringminio) | `object` | Optional |
| [overrides](#specdistributionmodulesmonitoringoverrides) | `object` | Optional |
| [prometheus](#specdistributionmodulesmonitoringprometheus) | `object` | Optional |
| [prometheusAgent](#specdistributionmodulesmonitoringprometheusagent) | `object` | Optional |
| [type](#specdistributionmodulesmonitoringtype) | `string` | Required |
| [x509Exporter](#specdistributionmodulesmonitoringx509exporter) | `object` | Optional |

### Description

Configuration for the Monitoring module components.

## .spec.distribution.modules.monitoring.alertmanager

### Properties
@@ -3223,11 +3228,20 @@ The value of the toleration

| Property | Type | Required |
|:---------------------------------------------------------------------------|:---------|:---------|
| [remoteWrite](#specdistributionmodulesmonitoringprometheusremotewrite) | `array` | Optional |
| [resources](#specdistributionmodulesmonitoringprometheusresources) | `object` | Optional |
| [retentionSize](#specdistributionmodulesmonitoringprometheusretentionsize) | `string` | Optional |
| [retentionTime](#specdistributionmodulesmonitoringprometheusretentiontime) | `string` | Optional |
| [storageSize](#specdistributionmodulesmonitoringprometheusstoragesize) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheus.remoteWrite

### Description

Set this option to ship the collected metrics to a remote Prometheus receiver.

`remoteWrite` is an array of objects that allows configuring the [remoteWrite](https://prometheus.io/docs/specs/remote_write_spec/) options for Prometheus. The objects in the array follow [the same schema as in the prometheus operator](https://prometheus-operator.dev/docs/operator/api/#monitoring.coreos.com/v1.RemoteWriteSpec).
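
As a rough illustration of the option described above, a cluster configuration fragment might look like the following. This is a minimal sketch, not taken from the commit: the endpoint URL and the `name` value are placeholders, and `url` and `name` are fields of the Prometheus Operator `RemoteWriteSpec` that the description links to.

```yaml
# Hypothetical fragment of a cluster configuration file.
spec:
  distribution:
    modules:
      monitoring:
        type: prometheus
        prometheus:
          retentionTime: 30d          # default shown in the defaults files
          remoteWrite:
            # Each item follows the Prometheus Operator RemoteWriteSpec schema.
            - name: central-metrics   # placeholder name
              url: https://metrics.example.com/api/v1/write   # placeholder endpoint
```

With a fragment like this, the local Prometheus instance keeps its own storage and retention and additionally forwards samples to the remote receiver.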

## .spec.distribution.modules.monitoring.prometheus.resources

### Properties
@@ -3283,35 +3297,109 @@ The memory request for the opensearch pods

### Description

The retention size for the prometheus pods
The retention size for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheus.retentionTime

### Description

The retention time for the prometheus pods
The retention time for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheus.storageSize

### Description

The storage size for the prometheus pods
The storage size for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheusAgent

### Properties

| Property | Type | Required |
|:----------------------------------------------------------------------------|:---------|:---------|
| [remoteWrite](#specdistributionmodulesmonitoringprometheusagentremotewrite) | `array` | Optional |
| [resources](#specdistributionmodulesmonitoringprometheusagentresources) | `object` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.remoteWrite

### Description

Set this option to ship the collected metrics to a remote Prometheus receiver.

`remoteWrite` is an array of objects that allows configuring the [remoteWrite](https://prometheus.io/docs/specs/remote_write_spec/) options for Prometheus. The objects in the array follow [the same schema as in the prometheus operator](https://prometheus-operator.dev/docs/operator/api/#monitoring.coreos.com/v1.RemoteWriteSpec).

## .spec.distribution.modules.monitoring.prometheusAgent.resources

### Properties

| Property | Type | Required |
|:-------------------------------------------------------------------------------|:---------|:---------|
| [limits](#specdistributionmodulesmonitoringprometheusagentresourceslimits) | `object` | Optional |
| [requests](#specdistributionmodulesmonitoringprometheusagentresourcesrequests) | `object` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits

### Properties

| Property | Type | Required |
|:---------------------------------------------------------------------------------|:---------|:---------|
| [cpu](#specdistributionmodulesmonitoringprometheusagentresourceslimitscpu) | `string` | Optional |
| [memory](#specdistributionmodulesmonitoringprometheusagentresourceslimitsmemory) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits.cpu

### Description

The CPU limit for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits.memory

### Description

The memory limit for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests

### Properties

| Property | Type | Required |
|:-----------------------------------------------------------------------------------|:---------|:---------|
| [cpu](#specdistributionmodulesmonitoringprometheusagentresourcesrequestscpu) | `string` | Optional |
| [memory](#specdistributionmodulesmonitoringprometheusagentresourcesrequestsmemory) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests.cpu

### Description

The cpu request for the prometheus pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests.memory

### Description

The memory request for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.type

### Description

The type of the monitoring, must be ***none***, ***prometheus*** or ***mimir***
The type of the monitoring, must be ***none***, ***prometheus***, ***prometheusAgent*** or ***mimir***.

- `none`: will disable the whole monitoring stack.
- `prometheus`: will install Prometheus Operator and a preconfigured Prometheus instance, Alertmanager, a set of alert rules, exporters needed to monitor all the components of the cluster, Grafana and a series of dashboards to view the collected metrics, and more.
- `prometheusAgent`: will install Prometheus Operator, an instance of Prometheus in Agent mode (no alerting, no queries, no storage), and all the exporters needed to get metrics for the status of the cluster and the workloads. Useful when you have a centralized (remote) Prometheus to ship the metrics to and do not want to store them locally in the cluster.
- `mimir`: will install the same as the `prometheus` option, and in addition Grafana Mimir that allows for longer retention of metrics and the usage of Object Storage.
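
To make the `prometheusAgent` option above more concrete, here is a hedged sketch of how it could be combined with the `prometheusAgent` block documented earlier. The endpoint URL and the resource values are illustrative assumptions, not values from the commit; only the key names come from the schema.

```yaml
# Hypothetical fragment: Agent mode shipping metrics to a central receiver.
spec:
  distribution:
    modules:
      monitoring:
        type: prometheusAgent
        prometheusAgent:
          remoteWrite:
            - url: https://metrics.example.com/api/v1/write   # placeholder endpoint
          resources:
            # The schema types these values as strings; the numbers are illustrative only.
            requests:
              cpu: "500m"
              memory: "1Gi"
            limits:
              cpu: "1"
              memory: "2Gi"
```

Because Agent mode keeps no local storage and serves no queries, a `remoteWrite` target is what makes the collected metrics usable elsewhere.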

### Constraints

**enum**: the value of this property must be equal to one of the following values:

| Value |
|:-------------|
|`"none"` |
|`"prometheus"`|
|`"mimir"` |
| Value |
|:------------------|
|`"none"` |
|`"prometheus"` |
|`"prometheusAgent"`|
|`"mimir"` |

## .spec.distribution.modules.monitoring.x509Exporter

@@ -4305,6 +4393,10 @@ The size of the disk in GB

## .spec.infrastructure.vpn.iamUserNameOverride

### Description

Overrides the default IAM user name for the VPN

### Constraints

**pattern**: the string must match the following regular expression:
@@ -4565,6 +4657,10 @@ This optional array defines additional IAM users that will be added to the aws-a

## .spec.kubernetes.clusterIAMRoleNamePrefixOverride

### Description

Overrides the default IAM role name prefix for the EKS cluster

### Constraints

**pattern**: the string must match the following regular expression:
@@ -5108,6 +5204,10 @@ This value defines the VPC ID where the EKS cluster will be created, required on

## .spec.kubernetes.workersIAMRoleNamePrefixOverride

### Description

Overrides the default IAM role name prefix for the EKS workers

### Constraints

**pattern**: the string must match the following regular expression:
106 changes: 97 additions & 9 deletions docs/schemas/kfddistribution-kfd-v1alpha2.md
@@ -2133,9 +2133,14 @@ The type of the logging, must be ***none***, ***opensearch*** or ***loki***
| [minio](#specdistributionmodulesmonitoringminio) | `object` | Optional |
| [overrides](#specdistributionmodulesmonitoringoverrides) | `object` | Optional |
| [prometheus](#specdistributionmodulesmonitoringprometheus) | `object` | Optional |
| [prometheusAgent](#specdistributionmodulesmonitoringprometheusagent) | `object` | Optional |
| [type](#specdistributionmodulesmonitoringtype) | `string` | Required |
| [x509Exporter](#specdistributionmodulesmonitoringx509exporter) | `object` | Optional |

### Description

Configuration for the Monitoring module components.

## .spec.distribution.modules.monitoring.alertmanager

### Properties
@@ -2698,11 +2703,20 @@ The value of the toleration

| Property | Type | Required |
|:---------------------------------------------------------------------------|:---------|:---------|
| [remoteWrite](#specdistributionmodulesmonitoringprometheusremotewrite) | `array` | Optional |
| [resources](#specdistributionmodulesmonitoringprometheusresources) | `object` | Optional |
| [retentionSize](#specdistributionmodulesmonitoringprometheusretentionsize) | `string` | Optional |
| [retentionTime](#specdistributionmodulesmonitoringprometheusretentiontime) | `string` | Optional |
| [storageSize](#specdistributionmodulesmonitoringprometheusstoragesize) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheus.remoteWrite

### Description

Set this option to ship the collected metrics to a remote Prometheus receiver.

`remoteWrite` is an array of objects that allows configuring the [remoteWrite](https://prometheus.io/docs/specs/remote_write_spec/) options for Prometheus. The objects in the array follow [the same schema as in the prometheus operator](https://prometheus-operator.dev/docs/operator/api/#monitoring.coreos.com/v1.RemoteWriteSpec).

## .spec.distribution.modules.monitoring.prometheus.resources

### Properties
@@ -2758,35 +2772,109 @@ The memory request for the opensearch pods

### Description

The retention size for the prometheus pods
The retention size for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheus.retentionTime

### Description

The retention time for the prometheus pods
The retention time for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheus.storageSize

### Description

The storage size for the prometheus pods
The storage size for the k8s Prometheus instance.

## .spec.distribution.modules.monitoring.prometheusAgent

### Properties

| Property | Type | Required |
|:----------------------------------------------------------------------------|:---------|:---------|
| [remoteWrite](#specdistributionmodulesmonitoringprometheusagentremotewrite) | `array` | Optional |
| [resources](#specdistributionmodulesmonitoringprometheusagentresources) | `object` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.remoteWrite

### Description

Set this option to ship the collected metrics to a remote Prometheus receiver.

`remoteWrite` is an array of objects that allows configuring the [remoteWrite](https://prometheus.io/docs/specs/remote_write_spec/) options for Prometheus. The objects in the array follow [the same schema as in the prometheus operator](https://prometheus-operator.dev/docs/operator/api/#monitoring.coreos.com/v1.RemoteWriteSpec).

## .spec.distribution.modules.monitoring.prometheusAgent.resources

### Properties

| Property | Type | Required |
|:-------------------------------------------------------------------------------|:---------|:---------|
| [limits](#specdistributionmodulesmonitoringprometheusagentresourceslimits) | `object` | Optional |
| [requests](#specdistributionmodulesmonitoringprometheusagentresourcesrequests) | `object` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits

### Properties

| Property | Type | Required |
|:---------------------------------------------------------------------------------|:---------|:---------|
| [cpu](#specdistributionmodulesmonitoringprometheusagentresourceslimitscpu) | `string` | Optional |
| [memory](#specdistributionmodulesmonitoringprometheusagentresourceslimitsmemory) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits.cpu

### Description

The CPU limit for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.limits.memory

### Description

The memory limit for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests

### Properties

| Property | Type | Required |
|:-----------------------------------------------------------------------------------|:---------|:---------|
| [cpu](#specdistributionmodulesmonitoringprometheusagentresourcesrequestscpu) | `string` | Optional |
| [memory](#specdistributionmodulesmonitoringprometheusagentresourcesrequestsmemory) | `string` | Optional |

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests.cpu

### Description

The cpu request for the prometheus pods

## .spec.distribution.modules.monitoring.prometheusAgent.resources.requests.memory

### Description

The memory request for the Prometheus Agent pods

## .spec.distribution.modules.monitoring.type

### Description

The type of the monitoring, must be ***none***, ***prometheus*** or ***mimir***
The type of the monitoring, must be ***none***, ***prometheus***, ***prometheusAgent*** or ***mimir***.

- `none`: will disable the whole monitoring stack.
- `prometheus`: will install Prometheus Operator and a preconfigured Prometheus instance, Alertmanager, a set of alert rules, exporters needed to monitor all the components of the cluster, Grafana and a series of dashboards to view the collected metrics, and more.
- `prometheusAgent`: will install Prometheus Operator, an instance of Prometheus in Agent mode (no alerting, no queries, no storage), and all the exporters needed to get metrics for the status of the cluster and the workloads. Useful when you have a centralized (remote) Prometheus to ship the metrics to and do not want to store them locally in the cluster.
- `mimir`: will install the same as the `prometheus` option, and in addition Grafana Mimir that allows for longer retention of metrics and the usage of Object Storage.

### Constraints

**enum**: the value of this property must be equal to one of the following values:

| Value |
|:-------------|
|`"none"` |
|`"prometheus"`|
|`"mimir"` |
| Value |
|:------------------|
|`"none"` |
|`"prometheus"` |
|`"prometheusAgent"`|
|`"mimir"` |

## .spec.distribution.modules.monitoring.x509Exporter
