-
Notifications
You must be signed in to change notification settings - Fork 5.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Remove health check filter from Marathon tasks. #2817
Remove health check filter from Marathon tasks. #2817
Conversation
I'd appreciate any additional feedback from Marathon provider users. (@aantono maybe?) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working together with @timoreimann on that issue, so LGTM 👍
I think this makes good sense. In general the delegated provider's source (i.e. Marathon, Consul, K8s, etc) should be responsible for managing health state of the entries and Traefik only responds to what they produce as "valid endpoints". If one wants to explicitly have Traefik do health checking, that can be done via backend healthcheck config. So in short - LGTM 👍 |
@aantono thanks for your feedback. 👏 |
Do you think we can add an option to enable or disable this (with default to |
@juliens taking our latest discussion on the subject on Slack into account, do you think it's still necessary to have a switch? My argument is that the existing behavior is inherently broken and bears a risk to availability for Marathon users, so any setting matching this behavior by default won't do any better. Users should either adjust their Marathon health checks or employ Traefik health checks. Let me know your thoughts. |
After discussion, oky not to add the switch. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
When available, Traefik uses the last Marathon health check result to decide whether a task should be filtered or not. This behavior is incorrect, however, as it ignores the failure threshold denoting the number of failed health checks it would take for Marathon to actually consider a task as failing and trigger a restart. With Traefik not taking the threshold into account, a task is removed from the configuration already while Marathon may still deem it healthy. Delaying the filtering until the moment when the failure threshold is exceeded does not buy us much: At that point, Marathon will decide to restart the unhealthy task anyway and cause a task state change that will lead to a Traefik configuration update excluding the task in question. Consequently, we can remove the filter logic.
119ac80
to
2e16430
Compare
What does this PR do?
Remove health check filter from Marathon tasks.
Motivation
When available, Traefik uses the last Marathon health check result to decide whether a task should be filtered or not. This behavior is incorrect, however, as it ignores the failure threshold denoting the
number of failed health checks it would take for Marathon to actually consider a task as failing and trigger a restart. With Traefik not taking the threshold into account, a task is removed from the
configuration already while Marathon may still deem it healthy.
Delaying the filtering until the moment when the failure threshold is exceeded would not buy us much: At that point, Marathon will decide to restart the unhealthy task anyway and cause a task state change that will lead to a Traefik configuration update excluding the task in question.
Consequently, we can remove the filter logic.
More