docs: Improved metrics troubleshooting #1413

Merged
merged 13 commits into from
Sep 5, 2024
31 changes: 29 additions & 2 deletions docs/user/04-metrics.md
@@ -322,6 +322,29 @@ For metrics ingestion to start automatically, simply apply the following annotat
| `prometheus.io/path` | `/metrics`, `/custom_metrics` | `/metrics` | Defines the HTTP path where Prometheus can find metrics data. |
| `prometheus.io/scheme` | `http`, `https` | If Istio is active, `https` is supported; otherwise, only `http` is available. The default scheme is `http` unless an Istio sidecar is present, denoted by the label `security.istio.io/tlsMode=istio`, in which case `https` becomes the default. | Determines the protocol used for scraping metrics — either HTTPS with mTLS or plain HTTP. |

An example configuration for a `Service` might be:
```yaml
apiVersion: v1
kind: Service
metadata:
  annotations:
    prometheus.io/port: "8080"
    prometheus.io/scrape: "true"
  name: sample
spec:
  ports:
  - name: http-metrics
    appProtocol: http
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: sample
  type: ClusterIP
```
> [!NOTE]
> When the Pod targeted by a Service runs with Istio, make sure that Istio can derive the [appProtocol](https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol) from the Service port definition. To do this, either prefix the port name with the protocol, as in `http-metrics`, or explicitly define the `appProtocol` attribute. If Istio cannot derive the protocol, the connection for scraping the metrics endpoint cannot be established.

> [!NOTE]
> The Metric agent can scrape endpoints even if the workload is a part of the Istio service mesh and accepts mTLS communication. However, there's a constraint: For scraping through HTTPS, Istio must configure the workload using 'STRICT' mTLS mode. Without 'STRICT' mTLS mode, you can set up scraping through HTTP by applying the annotation `prometheus.io/scheme=http`. For related troubleshooting, see [Log entry: Failed to scrape Prometheus endpoint](#log-entry-failed-to-scrape-prometheus-endpoint).
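
As a minimal sketch of the HTTP fallback described in the note, a Service could carry the scheme annotation as follows. The name, label, and port are reused from the earlier `sample` Service and are illustrative only.

```yaml
apiVersion: v1
kind: Service
metadata:
  name: sample # illustrative name, reused from the earlier example
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    # Scrape over plain HTTP because the workload does not use STRICT mTLS
    prometheus.io/scheme: "http"
spec:
  ports:
  - name: http-metrics
    port: 8080
    targetPort: 8080
  selector:
    app: sample
```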

@@ -626,13 +649,17 @@ To detect and fix such situations, check the pipeline status and check out [Trou
2023-08-29T09:53:07.123Z warn internal/transaction.go:111 Failed to scrape Prometheus endpoint {"kind": "receiver", "name": "prometheus/app-pods", "data_type": "metrics", "scrape_timestamp": 1693302787120, "target_labels": "{__name__=\"up\", instance=\"10.42.0.18:8080\", job=\"app-pods\"}"}
```

**Cause**: The workload is not configured to use 'STRICT' mTLS mode. For details, see [Activate Prometheus-based metrics](#4-activate-prometheus-based-metrics).
**Cause 1**: The workload is not configured to use 'STRICT' mTLS mode. For details, see [Activate Prometheus-based metrics](#_4-activate-prometheus-based-metrics).

**Remedy**: You can either set up 'STRICT' mTLS mode or HTTP scraping:
**Remedy 1**: You can either set up 'STRICT' mTLS mode or HTTP scraping:

- Configure the workload using “STRICT” mTLS mode (for example, by applying a corresponding PeerAuthentication).
- Set up scraping through HTTP by applying the `prometheus.io/scheme=http` annotation.
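
For the first option, a PeerAuthentication along these lines might be applied. This is a sketch under assumptions: the resource name, namespace, and `app: sample` label are hypothetical, and depending on the installed Istio version the API version may be `security.istio.io/v1` instead.

```yaml
apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: sample-strict-mtls # hypothetical name
  namespace: sample-namespace # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: sample # assumed label of the workload's Pods
  mtls:
    # Accept only mTLS traffic, which allows scraping through HTTPS
    mode: STRICT
```

Omitting the `selector` would apply the policy to every workload in the namespace, which may be broader than intended.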

**Cause 2**: The Service that enables scraping through the Prometheus annotations does not specify the application protocol in its port definition. For details, see [Activate Prometheus-based metrics](#_4-activate-prometheus-based-metrics).

**Remedy 2**: Define the application protocol in the Service port definition, either by prefixing the port name with the protocol, as in `http-metrics`, or by defining the `appProtocol` attribute.
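
The two alternatives could look roughly like the following fragment of a Service spec. The second port, its name `metrics`, and the number `8081` are hypothetical and only serve to show both variants side by side.

```yaml
spec:
  ports:
  # Option 1: Istio derives the protocol from the port name prefix
  - name: http-metrics
    port: 8080
    targetPort: 8080
  # Option 2: the protocol is declared explicitly (hypothetical second port)
  - name: metrics
    appProtocol: http
    port: 8081
    targetPort: 8081
```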

### Gateway Buffer Filling Up

**Symptom**: In the MetricPipeline status, the `TelemetryFlowHealthy` condition has status **BufferFillingUp**.
704 changes: 700 additions & 4 deletions docs/user/assets/logs-arch.drawio.svg