Liveness endpoint behaviour when unable to check in with fleet server #1157
Comments
cc @blakerouse, I believe the original liveness endpoint behaviour made it into v2
I think the issue with the original implementation was the possibility for the agent to check in and report a degraded state because of the inability to check in, which is a bit of a paradoxical situation given that you have to check in to do it. To me it makes the most sense to align what the agent's local liveness endpoint does with the equivalent of what Fleet considers the agent offline state. We released this functionality in v8.4.0 so we shouldn't break it by not reporting when the agent cannot connect to Fleet server. I think we should address this in v8.5.0 by:
@michel-laterman thoughts on this?
Yes, I think what we need here is two different statuses. The first is a local status: what the Elastic Agent's status is locally on the machine. The second is the status that is reported to Fleet Server. When
@cmacknz, that sounds good; it should also appear in
@michel-laterman would you be able to take this one on, as it seems you understand the whole picture here?
@jlind23 Would this need to be backported to 8.5? The changes we're making for 8.6 mean a fix for 8.5 is a completely separate fix.
@michel-laterman as the previous fix landed in 8.5, I think it would be great to have a fix backported to 8.5 too.
Yes, let's backport to 8.5 so as not to break anything in that release.
Following the fix for #1148 made in #1152, the elastic-agent no longer reports a degraded state if the check-in to fleet-server fails.
This degraded state reporting was used by the liveness endpoint to ensure that the agent reported a 200 status when healthy.
The original changes were added as part of #569.
We need to decide what the liveness endpoint should report and how that interacts with what the agent reports to Fleet.
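To make the question concrete, here is a minimal sketch of one possible split between a local status and a Fleet-reported status. It is not the elastic-agent implementation (which is written in Go); the names `AgentStatus`, `local_state`, `fleet_reported_state`, and `liveness_status_code` are hypothetical, and the choice of 503 for the non-healthy case is an assumption.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"


@dataclass
class AgentStatus:
    # Both fields are hypothetical and exist only for this illustration.
    components: State      # health of the agent and what it runs
    checkin_failing: bool  # True while check-ins to Fleet Server fail


def local_state(status: AgentStatus) -> State:
    """Status as seen on the machine: a failing check-in degrades it."""
    if status.checkin_failing:
        return State.DEGRADED
    return status.components


def fleet_reported_state(status: AgentStatus) -> State:
    """Status sent to Fleet Server: check-in failures are left out,
    since reporting them would itself require a successful check-in."""
    return status.components


def liveness_status_code(status: AgentStatus) -> int:
    # 200 when locally healthy; 503 is an assumed code for any other state.
    return 200 if local_state(status) is State.HEALTHY else 503
```

Under this sketch, an agent whose components are healthy but which cannot reach Fleet Server would fail its liveness check locally, while the status it reports on its next successful check-in would still be healthy.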