Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Prometheus metrics removal #3287

Merged

Conversation

m3co-code
Copy link
Contributor

What does this PR do?

Improve Prometheus metrics removal.

Motivation

Formerly metric removal was done by tracking configuration generations. This was an easy first solution but is actually not correct and for setups where Traefik does not receive a lot of configuration updates, old metrics can hang around for quite some time.

The new approach is to keep track of Traefik's current configuration and to remove metrics that "belong" to an outdated configuration, once they were scraped. This way we can be sure that we don't keep old metrics too long and we can ensure that metrics for short-lived configuration still appear when the /metrics is scraped, as we only remove them after the scrape happened.

More

  • Added/updated tests

@m3co-code

This comment has been minimized.

@m3co-code m3co-code force-pushed the improve-prometheus-metrics-removal branch 2 times, most recently from 48d5833 to 72aef4e Compare May 8, 2018 16:00
@mmatur mmatur added this to the 1.7 milestone May 14, 2018
@m3co-code m3co-code changed the base branch from master to v1.6 May 25, 2018 07:47
@m3co-code m3co-code force-pushed the improve-prometheus-metrics-removal branch from acce4d9 to bc1c3c8 Compare May 25, 2018 07:53
@m3co-code m3co-code removed this from the 1.7 milestone May 25, 2018
@m3co-code m3co-code added the kind/bug/fix a bug fix label May 25, 2018
@ldez ldez added this to the 1.6 milestone May 25, 2018
@ldez ldez removed the kind/enhancement a new or improved feature. label May 25, 2018
Copy link
Member

@mmatur mmatur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👏 Nice job!!

@mmatur mmatur changed the title improve Prometheus metrics removal Improve Prometheus metrics removal Jun 5, 2018
Copy link
Member

@juliens juliens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@ldez ldez left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

m3co-code and others added 7 commits June 5, 2018 10:14
We need to build Traefik's dynamic configuration in test cases outside
of the server package as well. Therefore, this commit moves them to the
testhelpers package in order to make them reusable.
Formerly metric removal was done by tracking configuration generations.
This was an easy first solution but is actually not really correct and
for setups where Traefik does not receive a lot of configuration
updates, old metrics can hang around for quite some time.

The new approach is to keep track of Traefik's current configuration and
to remove metrics that "belong" to an outdated configuration, once they
were scraped. This way we can be sure that we don't keep old metrics too
long and we can ensure that metrics for short-lived configuration still
appear when the /metrics is scraped, as we only remove them after the
scrape happened.
Given a metric is removed due to it belonging to an outdated part of the
configuration, we also must delete the underlying Prometheus metric.
Otherwise, given the formerly removed metric is tracked once more (with
the same label pairs), it will start off from the value it had before
the removal.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants