Add preflight checks to Health API to ensure health is obtainable #86404
Preflight indicators are run before other health indicators. If any preflight indicators are not GREEN then the remaining indicators are not run. Instead, UNKNOWN results will be returned for each.
Pinging @elastic/es-data-management (Team:Data Management)
Hi @jbaiera, I've created a changelog YAML for you.
andreidan
left a comment
Thanks for working on this Jimmy.
This is looking great.
Can we add one more check that the cluster state is RECOVERED? If it is not, everything should be UNKNOWN and we should recommend that the user retry calling the health API.
var preflight1 = new HealthIndicatorResult("preflight1", "component1", RED, null, null, null, null);
var preflight2 = new HealthIndicatorResult("preflight2", "component2", GREEN, null, null, null, null);
var indicator1 = new HealthIndicatorResult("indicator1", "component1", GREEN, null, null, null, null);
var indicator2 = new HealthIndicatorResult("indicator2", "component1", YELLOW, null, null, null, null);
var indicator3 = new HealthIndicatorResult("indicator3", "component2", GREEN, null, null, null, null);
Could we generally use real values in tests?
Seeing a failure that has component2 and indicator3 in the error message is rather hard to "parse"
How real should these be? I want to make sure that preflight indicators that are in the same component get correctly mixed together in the results, but the master indicator doesn't share a component with anything right now. For now I'll give them some hypothetical names, but if you'd like them to be more real than that we can discuss further.
server/src/main/java/org/elasticsearch/health/HealthService.java
@elasticmachine update branch
andreidan
left a comment
LGTM, thanks for working on this Jimmy
Relates to #84792
Co-authored-by: Andrei Dan <andrei.dan@elastic.co>
@elasticmachine update branch
This PR introduces the concept of preflight health indicator services to the new health service. Preflight indicators are structurally identical to regular indicators, but they are executed first when calculating health and conditionally block downstream indicators from running on an unstable or unknown cluster state.
The health service is configured with an optional list of preflight indicators. If preflight indicators are present, the health service executes them first when calculating a health response. If any of these preflight indicators returns a non-green result status, the remaining health indicators are not executed. Instead, an UNKNOWN result is created for each of them and returned in the response. If the computeDetails flag is set, the UNKNOWN results contain details about which preflight indicators blocked their execution.
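The flow described above can be illustrated with a minimal, self-contained sketch. This is not the code in HealthService.java: the `PreflightSketch` class, the `Result` and `Indicator` records, the `compute` method, and the indicator and component names are all hypothetical simplifications, chosen only to show the ordering and the UNKNOWN fallback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;

public class PreflightSketch {
    enum Status { GREEN, YELLOW, RED, UNKNOWN }

    // Hypothetical, simplified stand-in for a health indicator result.
    record Result(String name, String component, Status status, String details) {}

    // Hypothetical stand-in for an indicator service; check supplies its status.
    record Indicator(String name, String component, Supplier<Status> check) {
        Result run() {
            return new Result(name, component, check.get(), null);
        }
    }

    // Runs preflight indicators first; if any is not GREEN, the regular
    // indicators are skipped and reported as UNKNOWN instead.
    static List<Result> compute(List<Indicator> preflight, List<Indicator> regular, boolean computeDetails) {
        List<Result> results = new ArrayList<>();
        List<String> blockers = new ArrayList<>();
        for (Indicator p : preflight) {
            Result r = p.run();
            results.add(r);
            if (r.status() != Status.GREEN) {
                blockers.add(r.name());
            }
        }
        for (Indicator i : regular) {
            if (blockers.isEmpty()) {
                results.add(i.run());
            } else {
                // Details are only attached when the caller asked for them.
                String details = computeDetails ? "blocked by: " + String.join(", ", blockers) : null;
                results.add(new Result(i.name(), i.component(), Status.UNKNOWN, details));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        // Assumed names, for illustration only.
        var preflight = List.of(new Indicator("has_master", "cluster_coordination", () -> Status.RED));
        var regular = List.of(new Indicator("shards_availability", "data", () -> Status.GREEN));
        compute(preflight, regular, true).forEach(System.out::println);
    }
}
```

In this sketch all preflight indicators run even after one of them fails, so that every blocker can be listed in the details; whether the actual implementation does the same is an assumption.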