-
Notifications
You must be signed in to change notification settings - Fork 21
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
feat: add
monitor
to operator, fix monitoring setup (#256)
## Description This adds a new portion to the `Package` spec, `monitor`. This functionality is briefly documented [here](https://github.com/defenseunicorns/uds-core/blob/operator-monitor-magic/docs/MONITOR.md). Also included is a mutation for all service monitors to handle them in the expected istio mTLS way. This PR also fixes some missing dashboards and enables all default dashboards from the upstream kube-prometheus-stack chart (this may be excessive?). Currently monitored: - Prometheus stack (operator, self, alertmanager) - Loki - kube-system things (kubelet, coredns, apiserver) - Promtail - Metrics-server - Velero - Keycloak - Grafana - All istio envoy proxies (podmonitor) Not added here: - NeuVector: Currently has limited config options with regards to auth, keeping this disabled in anticipation of SSO on NeuVector, with a desire to contribute upstream to enable our use case of SSO-only auth. In addition this PR switches single package testing to use/add Istio. This appears to add around 1 minute to each pipeline run, but: - allows us to test the istio VS endpoints in single package checks (some quirks with this in current state due to SSO redirects, but will allow us to do e2e testing in those pipelines in the future) - allows us to assume istio always and not build bespoke pepr code for the specific no-istio scenario (ex: prometheus can assume certs) This would still allow someone to run locally without istio for some packages / un-inject, mess around with mTLS, etc - which may be useful still to identify where Istio is causing problems. This just switches our CI posture so that we assume istio always. ## Related Issue Fixes #17 ## Type of change - [ ] Bug fix (non-breaking change which fixes an issue) - [x] New feature (non-breaking change which adds functionality) - [ ] Other (security config, docs update, etc) ## Checklist before merging - [x] Test, docs, adr added or updated as needed - [x] [Contributor Guide Steps](https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md)(https://github.com/defenseunicorns/uds-template-capability/blob/main/CONTRIBUTING.md#submitting-a-pull-request) followed --------- Co-authored-by: Tristan Holaday <40547442+TristanHoladay@users.noreply.github.com> Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
- Loading branch information
1 parent
697d5eb
commit 2e9fea6
Showing
38 changed files
with
1,420 additions
and
139 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
# Monitoring / Metrics Scraping in UDS Core | ||
|
||
UDS Core leverages Pepr to handle setup of Prometheus scraping metrics endpoints, with the particular configuration necessary to work in a STRICT mTLS (Istio) environment. We handle this with both mutations of existing service monitors and generation of service monitors via the `Package` CR. | ||
|
||
## Mutations | ||
|
||
All service monitors are mutated to set the scrape scheme to HTTPS and set the TLS Config to what is required for Istio mTLS scraping (see [this doc](https://istio.io/latest/docs/ops/integrations/prometheus/#tls-settings) for details). Beyond this, no other fields are mutated. Supporting existing service monitors is useful since some charts include service monitors by default with more advanced configurations, and it is in our best interest to enable those and use them where possible. | ||
|
||
Assumptions are made about STRICT mTLS here for simplicity, based on the `istio-injection` namespace label. Without making these assumptions we would need to query `PeerAuthentication` resources or another resource to determine the exact workload mTLS posture. | ||
|
||
Note: This mutation is the default behavior for all service monitors but can be skipped using the annotation key `uds/skip-sm-mutate` (with any value). Skipping this mutation should only be done if your service exposes metrics on a PERMISSIVE mTLS port. | ||
|
||
## Package CR `monitor` field | ||
|
||
UDS Core also supports generating service monitors from the `monitor` list in the `Package` spec. Charts do not always support service monitors, so generating them can be useful. This also provides a simplified way for other users to create service monitors, similar to the way we handle `VirtualServices` today. A full example of this can be seen below: | ||
|
||
```yaml | ||
... | ||
spec: | ||
monitor: | ||
- selector: # Selector for the service to monitor | ||
app: foobar | ||
portName: metrics # Name of the port to monitor | ||
targetPort: 1234 # Corresponding target port on the pod/container (for network policy) | ||
# Optional properties depending on your application | ||
description: "Metrics" # Add to customize the service monitor name | ||
podSelector: # Add if pod labels are different than `selector` (for network policy) | ||
app: barfoo | ||
path: "/mymetrics" # Add if metrics are exposed on a different path than "/metrics" | ||
``` | ||
This config is used to generate service monitors and corresponding network policies to setup scraping for your applications. The `ServiceMonitor`s will go through the mutation process to add `tlsConfig` and `scheme` to work in an istio environment. | ||
|
||
This spec intentionally does not support all options available with a `ServiceMonitor`. While we may add additional fields in the future, we do not want to simply rebuild the `ServiceMonitor` spec since mutations are already available to handle Istio specifics. The current subset of spec options is based on the bare minimum necessary to craft resources. | ||
|
||
NOTE: While this is a rather verbose spec, each of the above fields are strictly required to craft the necessary service monitor and network policy resources. | ||
|
||
## Notes on Alternative Approaches | ||
|
||
In coming up with this feature a few alternative approaches were considered but not chosen due to issues with each one. The current spec provides the best balance of a simplified interface compared to the `ServiceMonitor` spec, and a faster/easier reconciliation loop. | ||
|
||
### Generation based on service lookup | ||
|
||
An alternative spec option would use the service name instead of selectors/port name. The service name could then be used to lookup the corresponding service and get the necessary selectors/port name (based on numerical port). There are however 2 issues with this route: | ||
1. There is a timing issue if the `Package` CR is applied to the cluster before the app chart itself (which is the norm with our UDS Packages). The service would not exist at the time the `Package` is reconciled. We could lean into eventual consistency here, if we implemented a retry mechanism for the `Package`, which would mitigate this issue. | ||
2. We would need an "alert" mechanism (watch) to notify us when the service(s) are updated, to roll the corresponding updates to network policies and service monitors. While this is doable it feels like unnecessary complexity compared to other options. | ||
|
||
### Generation of service + monitor | ||
|
||
Another alternative approach would be to use a pod selector and port only. We would then generate both a service and servicemonitor, giving us full control of the port names and selectors. This seems like a viable path, but does add an extra resource for us to generate and manage. There could be unknown side effects of generating services that could clash with other services (particularly with istio endpoints). This would otherwise be a relative straightforward approach and is worth evaluating again if we want to simplify the spec later on. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,20 @@ | ||
{{- if .Capabilities.APIVersions.Has "monitoring.coreos.com/v1" }} | ||
# The serviceMonitor for metrics-server is unique due to permissive mTLS on its port, so it is created outside of the Package spec | ||
apiVersion: monitoring.coreos.com/v1 | ||
kind: ServiceMonitor | ||
metadata: | ||
annotation: | ||
uds/skip-sm-mutate: "true" | ||
name: metrics-server-metrics | ||
namespace: metrics-server | ||
spec: | ||
endpoints: | ||
- path: /metrics | ||
port: https | ||
scheme: https | ||
tlsConfig: | ||
insecureSkipVerify: true | ||
selector: | ||
matchLabels: | ||
app.kubernetes.io/name: metrics-server | ||
{{- end }} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,3 +6,5 @@ readinessProbe: | |
initialDelaySeconds: 10 | ||
periodSeconds: 5 | ||
failureThreshold: 5 | ||
metrics: | ||
enabled: true |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,12 +1,13 @@ | ||
leastPrivilege: false | ||
|
||
exporter: | ||
# Temporarily disabled until we can get the monitor chart working | ||
# via https://github.com/neuvector/neuvector-helm/pull/355 | ||
# This is disabled pending further discussion/upstream changes to handle metrics with SSO setup | ||
enabled: false | ||
# Disable serviceMonitor to handle standalone testing, handled by UDS Package | ||
serviceMonitor: | ||
enabled: false | ||
apiSvc: neuvector-svc-controller-api:10443 | ||
svc: | ||
enabled: true | ||
# Temporary disabling of service to allow adding appProtocol via custom service in the config chart | ||
enabled: false | ||
type: ClusterIP |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
40 changes: 40 additions & 0 deletions
40
src/pepr/operator/controllers/monitoring/service-monitor.spec.ts
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,40 @@ | ||
import { describe, expect, it } from "@jest/globals"; | ||
import { generateServiceMonitor } from "./service-monitor"; | ||
|
||
describe("test generate service monitor", () => { | ||
it("should return a valid Service Monitor object", () => { | ||
const pkg = { | ||
apiVersion: "uds.dev/v1alpha1", | ||
kind: "Package", | ||
metadata: { | ||
name: "test", | ||
uid: "f50120aa-2713-4502-9496-566b102b1174", | ||
}, | ||
}; | ||
const portName = "http-metrics"; | ||
const metricsPath = "/test"; | ||
const selectorApp = "test"; | ||
const monitor = { | ||
portName: portName, | ||
path: metricsPath, | ||
targetPort: 1234, | ||
selector: { | ||
app: selectorApp, | ||
}, | ||
}; | ||
const namespace = "test"; | ||
const pkgName = "test"; | ||
const generation = "1"; | ||
const payload = generateServiceMonitor(pkg, monitor, namespace, pkgName, generation); | ||
|
||
expect(payload).toBeDefined(); | ||
expect(payload.metadata?.name).toEqual(`${pkgName}-${selectorApp}-${portName}`); | ||
expect(payload.metadata?.namespace).toEqual(namespace); | ||
expect(payload.spec?.endpoints).toBeDefined(); | ||
if (payload.spec?.endpoints) { | ||
expect(payload.spec.endpoints[0].port).toEqual(portName); | ||
expect(payload.spec.endpoints[0].path).toEqual(metricsPath); | ||
} | ||
expect(payload.spec?.selector.matchLabels).toHaveProperty("app", "test"); | ||
}); | ||
}); |
Oops, something went wrong.