Prometheus metric to show whether a server is a leader or follower #13169

dpw · 2022-05-20T15:45:07Z

Feature Description

There should be a Consul server Prometheus metric showing whether a server considers itself to be a leader, a follower, or neither (as in the case of a candidate). It should be possible to determine this both for the current time (i.e. as of the last Prometheus scrape) and for an arbitrary point in the past. Currently there is no straightforward way to do so.

Furthermore, there should be a metric dedicated to this purpose and stable in future versions of consul (rather than being a metric for some other purpose that indicates the leader as a side effect, and so is liable to change).

Use Case(s)

For normal operation of a consul server cluster, there should be exactly one leader server, and all other servers should be followers. It should be possible to monitor that these conditions are satisfied, and alert if not, by means of simple prometheus query expressions.
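As a rough illustration of the condition to monitor, here is a minimal Python sketch. It assumes each server exposes a hypothetical gauge that is 1 when the server considers itself the leader and 0 otherwise; the function name `check_leadership` and the server names are made up for the example and are not part of Consul.

```python
from typing import Dict, List


def check_leadership(is_leader: Dict[str, int]) -> List[str]:
    """Return alert messages for one snapshot of a hypothetical per-server gauge.

    is_leader maps a server name to 1 (considers itself leader) or 0 (does not).
    An empty list means the cluster has exactly one leader.
    """
    leaders = sorted(name for name, value in is_leader.items() if value == 1)
    if len(leaders) == 1:
        return []
    if not leaders:
        return ["no server reports itself as leader"]
    return ["multiple servers report themselves as leader: " + ", ".join(leaders)]


# Hypothetical snapshot taken at one scrape: exactly one leader, so no alerts.
snapshot = {"server-a": 0, "server-b": 1, "server-c": 0}
print(check_leadership(snapshot))  # prints []
```

With a gauge of this shape, the same check reduces to summing the gauge across servers and alerting when the sum is not 1, which is the kind of simple query expression asked for here.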

Non-solutions

At first glance, it looks like the consul_raft_state_* metrics offer this. But those are counters that increment upon entry to the relevant state. So their values at a point in time do not show the leader and followers. For example, if a server reports a non-zero value of consul_raft_state_leader that means it became leader at some point, but it does not tell you that it is the leader now. (These counters do not even reliably tell the outcome of an election, as multiple elections may occur within a single prometheus scrape interval.)
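The limitation of entry counters can be sketched with a small simulation. This is only an illustration under assumed state names (`follower`, `candidate`, `leader`) and a made-up `replay` helper; it is not how Consul records its metrics. Two different transition histories produce identical counter values while leaving the server in different states.

```python
from collections import Counter
from typing import Iterable, Tuple


def replay(transitions: Iterable[str]) -> Tuple[Counter, str]:
    """Replay state entries, counting each entry the way an entry counter would.

    Returns the final counter values and the state the server ends up in.
    """
    counters: Counter = Counter()
    current = ""
    for state in transitions:
        counters[state] += 1  # the counter increments on entry to a state
        current = state
    return counters, current


# Two hypothetical histories for the same server.
ends_as_leader = ["follower", "candidate", "follower", "candidate", "leader"]
ends_as_candidate = ["follower", "candidate", "leader", "follower", "candidate"]

counters_a, state_a = replay(ends_as_leader)
counters_b, state_b = replay(ends_as_candidate)

print(counters_a == counters_b)  # prints True
print(state_a, state_b)          # prints leader candidate
```

Because the counters are equal in both cases, no query over their values at a single point in time can tell the two outcomes apart.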

In the past, there were gauge metrics that suggested the leader by their presence, for instance consul_raft_apply and consul_autopilot_healthy. But because those were only updated on the leader, when a server ceased to be leader they would contain stale values for a time controlled by the telemetry.prometheus_retention_period config setting. Furthermore, subsequent commits mean that those metrics no longer indicate the leader (#9198 exposed consul_raft_apply on every node; #12617 exposed consul_autopilot_healthy on every server).

While there are counter metrics that only increase on the leader, using them to reliably determine the leader requires a very cumbersome Prometheus query expression (especially if it must also handle the case of a standalone Consul server).

huikang · 2022-05-27T13:39:24Z

Hi @dpw, thanks for reporting and investigating this issue. Your analysis totally makes sense; I will work on the improvement.

huikang self-assigned this May 27, 2022

huikang mentioned this issue May 31, 2022

Add isLeader metric to track if a server is a leader #13304

Merged

4 tasks

huikang closed this as completed in #13304 Jun 3, 2022

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge d54199f into backport/gh-13169-show-leader-metrics/legally-winn…

91fb1b4

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge 6e7a8f1 into backport/gh-13169-show-leader-metrics/legally-winn…

0be75ed

hc-github-team-consul-core mentioned this issue Jun 3, 2022

Backport of Add isLeader metric to track if a server is a leader into release/1.12.x #13358

Closed

4 tasks

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge d54199f into backport/gh-13169-show-leader-metrics/fairly-worth…

d5bcdfb

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge 6e7a8f1 into backport/gh-13169-show-leader-metrics/fairly-worth…

d4313ad

hc-github-team-consul-core mentioned this issue Jun 3, 2022

Backport of Add isLeader metric to track if a server is a leader into release/1.11.x #13359

Closed

4 tasks

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge d54199f into backport/gh-13169-show-leader-metrics/formally-ple…

ad744fe

hc-github-team-consul-core added a commit that referenced this issue Jun 3, 2022

Merge 6e7a8f1 into backport/gh-13169-show-leader-metrics/formally-ple…

7b6c462

hc-github-team-consul-core mentioned this issue Jun 3, 2022

Backport of Add isLeader metric to track if a server is a leader into release/1.10.x #13360

Closed

4 tasks

hc-github-team-consul-core added a commit that referenced this issue Jun 7, 2022

Merge d54199f into backport/gh-13169-show-leader-metrics/inherently-w…

bfc0a7a

hc-github-team-consul-core added a commit that referenced this issue Jun 7, 2022

Merge 6e7a8f1 into backport/gh-13169-show-leader-metrics/inherently-w…

38d455b

hc-github-team-consul-core mentioned this issue Jun 7, 2022

Backport of Add isLeader metric to track if a server is a leader into release/1.12.x #13380

Closed

4 tasks
