Change the instance name for standard pod scraping to be unique #261
Conversation
Any of the potentially many containers in a pod can expose one or more ports with Prometheus metrics. However, with our current target labels, all of these targets get the same instance label (just the pod name), which leads to the dreaded `PrometheusOutOfOrderTimestamps` alert, see grafana/deployment_tools#3441. (In fact, if we get the alert, we are already lucky, because the problem can go unnoticed until someone actually needs one of the time series that receive samples from different targets, rendering them useless.)

In practice, we rarely have more than one port to scrape per pod, but it does happen, it's entirely within the intended usage pattern of K8s, and it can happen more often at any time. The two examples I'm aware of:

- Kube-state-metrics (KSM) has only one container in its pod, but that container exposes two metrics ports (http-metrics and self-metrics).
- Consul pods run a container with the consul-exporter and a container with the statsd-exporter, each exposing their metrics on a different port. Both ports are named http-metrics, which is possible because they are exposed by different containers. (This is the case that triggered the above linked issue.)

To avoid the metric duplication, we could add a container and a port label, but it is a Prometheus convention that the instance label alone should be unique within a job. Which brings us to what I'm proposing in this commit: create the instance label by joining pod name, container name, and port name with `:` in between. In most cases, the resulting instance value will appear redundant, but I believe the consistency has some value. Applying some magic to shorten the instance label where possible would add complexity and remove that consistency.
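To make that concrete, here is a minimal sketch of a relabeling rule that would build such an instance label. It is written as a plain relabel_config object in jsonnet and is only an illustration; the actual structure of the scrape config in this library may look different, and it assumes Kubernetes service discovery with role `pod`:

```jsonnet
// Sketch only: build `instance` as `<pod>:<container>:<port>` by joining the
// Kubernetes SD meta labels with ':' as the separator.
{
  source_labels: [
    '__meta_kubernetes_pod_name',
    '__meta_kubernetes_pod_container_name',
    '__meta_kubernetes_pod_container_port_name',
  ],
  separator: ':',
  target_label: 'instance',
}
```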
Hrm, having hit this before, I see the value. But damn, it can potentially break things in subtle ways. Joins would be difficult b/w cAdvisor data and metrics, if we ever want to do that. Hrm. Not sure, but we should fix this imo.
Just thinking out loud: we could break this, but realised …
That's a good point. Joins with cAdvisor metrics aren't properly possible at the moment anyway, because we do not attach the container name anywhere, i.e. in the consul case, you couldn't join because cAdvisor would give you metrics for the … About the convention: I guess it is often helpful in grouping and label matching to know that …
I'll add a commit that also adds …
This allows joining with cAdvisor metrics.
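Roughly, that follow-up could look like this (again plain relabel_config objects in jsonnet, shown only as an illustration of the idea, not the exact code of the commit):

```jsonnet
// Sketch only: add explicit pod and container target labels so that
// application metrics share join keys with cAdvisor, KSM, and Kubelet metrics.
[
  { source_labels: ['__meta_kubernetes_pod_name'], target_label: 'pod' },
  { source_labels: ['__meta_kubernetes_pod_container_name'], target_label: 'container' },
]
```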
@beorn7 can you follow up to ensure the Loki scrape config is consistent?
I did a quick check that we don't have any regular application metrics that have a …
Working on it.
As this needs a vendor update to push it to production, I'm merging this one already. The big and scary change will be the vendoring update for this and the Loki changes.
This is triggered by grafana/jsonnet-libs#261 . The above PR changes the `instance` label to be actually unique within a scrape config. It also adds a `pod` and a `container` target label so that metrics can easily be joined with metrics from cAdvisor, KSM, and the Kubelet. This commit adds the same to the Loki scrape config. It also removes the `container_name` label. It is the same as the `container` label and was already added to Loki previously. However, the `container_name` label is deprecated and has disappeared in K8s 1.16, so that it will soon become useless for direct joining.
@tomwilkie Follow-up for Loki: grafana/loki#2091
This is triggered by grafana/jsonnet-libs#261 . The above PR removes the `instance` label. As it has turned out (see PR linked above), a sane `instance` label in Prometheus has to be unique, and that includes the case where a single container exposes metrics on two different endpoints. However, that scenario would still only result in one log stream for Loki to scrape. Therefore, Loki and Prometheus need to sync via target labels uniquely identifying a container (rather than a metrics endpoint). Those labels are namespace, pod, and container, which are also added here. This commit removes the `container_name` label. It is the same as the `container` label and was already added to Loki previously. However, the `container_name` label is deprecated and has disappeared in K8s 1.16, so it will soon become useless for direct joining.
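For illustration, a sketch of what the corresponding relabeling on the Loki side could look like (hypothetical relabel_config objects in jsonnet; the actual scrape config in the Loki repo may differ). It sets namespace, pod, and container, and deliberately does not set the deprecated container_name:

```jsonnet
// Sketch only: label log streams with namespace, pod, and container so they
// line up with the Prometheus target labels. container_name is deprecated
// (gone in K8s 1.16), so it is intentionally not set here.
[
  { source_labels: ['__meta_kubernetes_namespace'], target_label: 'namespace' },
  { source_labels: ['__meta_kubernetes_pod_name'], target_label: 'pod' },
  { source_labels: ['__meta_kubernetes_pod_container_name'], target_label: 'container' },
]
```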
grafana/jsonnet-libs#261 updates labels to make instance labels unique. This commit syncs with that change, but also takes the opportunity for an overdue pass through all the dashboards to fix various issues with them. The new dashboards are compatible with the new labeling scheme and also fix some problems:

1. Make sure the unloved Agent and Agent Prometheus Remote Write dashboards run correct queries and account for the instance_name labels.
2. Use proper graph label values in Agent Operational.
3. Allow filtering the Agent Operational graph by container.

As part of making the Agent dashboard useful, a new metric has been added to track samples added to the WAL over time. Closes #73.
@tomwilkie @woodsaj @malcolmholmes Please have a careful look here. This is a biggie. It changes almost every metric we have. I went through all the code underneath deployment_tools/ksonnet and tried to find any code that depends on the current instance naming. I found only grafana/loki#2080 , but of course, this is subtle enough that there might be many more code paths that break due to this change. However, we have to do something about it, and I think what I propose here is the way to go.