[HealthAPI] Add new indicator for peer links #119262
Labels
:Data Management/Health
:Distributed Coordination/Network
Http and internode communication implementations
>enhancement
Team:Data Management
Meta label for data/management team
Team:Distributed Coordination
Meta label for Distributed Coordination team
Description
If a connection between non-master nodes breaks (while the links to the master remain OK), there is currently little to tell the admin/operator that the cluster is experiencing problems. Yet peer recoveries (shard relocations) could be blocked, and searches might return only partial results.
This is an enhancement request to add a new peer link indicator to the `_health_report` API response. This indicator's details section could provide stats such as: disconnected-since timestamp, number of disconnects in time windows, lifetime of previous connections, etc. There are n^2 peer links, which can in theory experience various degradation levels, so maybe we can choose to surface only an aggregate view.
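To make the "aggregate view" idea concrete, here is a rough Python sketch of how per-link stats might be rolled up into a single summary. It is not Elasticsearch code: `PeerLink`, `summarize_peer_links`, the field names, the `flapping_threshold` parameter, and the rule mapping link state to a green/yellow/red status are all hypothetical and only illustrate one possible aggregation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PeerLink:
    """Hypothetical per-link stats; not an Elasticsearch type."""
    source: str
    target: str
    connected: bool
    disconnected_since: Optional[float] = None  # epoch seconds, if down
    disconnects_last_hour: int = 0


def summarize_peer_links(links, flapping_threshold=3):
    """Collapse per-link stats into one aggregate view.

    Assumed rule for illustration: any down link -> "red",
    any frequently reconnecting link -> "yellow", otherwise "green".
    """
    disconnected = [l for l in links if not l.connected]
    flapping = [
        l for l in links
        if l.connected and l.disconnects_last_hour >= flapping_threshold
    ]

    if disconnected:
        status = "red"
    elif flapping:
        status = "yellow"
    else:
        status = "green"

    # Nodes that sit on at least one broken link.
    affected_nodes = sorted(
        {node for l in disconnected for node in (l.source, l.target)}
    )
    # Earliest known start of a current disconnection, if any.
    oldest = min(
        (l.disconnected_since for l in disconnected
         if l.disconnected_since is not None),
        default=None,
    )

    return {
        "status": status,
        "total_links": len(links),
        "disconnected_links": len(disconnected),
        "flapping_links": len(flapping),
        "affected_nodes": affected_nodes,
        "oldest_disconnect_since": oldest,
    }
```

With a summary like this, the indicator would report a handful of counters and the affected nodes instead of one entry per link, and the per-link details could still be exposed on demand.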