Improve and extend Prometheus metrics #2557

brb · 2016-10-21T14:22:12Z

#2547 introduced Prometheus metrics for weave-net. Some of the metrics can be improved, though. Suggestions, in no particular order:

weave_{packet,bytes}_total: remove the "flow" label.
weave_{packet,bytes}_total: add the "overlay" label.
weave_{packet,bytes}_total{overlay="fastdp"}: restore, and change the calculation to flushedFlowsTotal + sum(activeFlows), because flows get flushed periodically. flushedFlowsTotal should be updated before the flows are flushed.
Include the residual metrics from "Expose metrics" #2535 (comment), as appropriate.
Write tests for metrics (?)
Distribute the metrics code (prog/weaver/metrics.go) across the parts of the code that the metrics instrument (the same way logging is done). Keep in prog/weaver/metrics.go only the parts that are necessary for actually serving metrics to Prometheus. The change helps with debugging. Keep in mind that it changes the internal metrics collection model from pull to push.
Initialize labeled metrics if all label values are known in advance. This prevents subtle bugs, e.g. dividing two metrics when one value is missing.
When exposing a metric with a bunch of labels, create a static counter that carries all the labels and collect the metric itself with only a reference label. When querying, you can join the two metrics in order to access the labels.
Split up metrics that use "total" as a label value into two separate metrics, as this simplifies querying and prevents matching mistakes. E.g. weave_dns_entries{state="local"} / ignoring(state) weave_dns_entries{state="total"} vs. weave_local_dns_entries / weave_dns_entries.
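
To illustrate the flushedFlowsTotal + sum(activeFlows) suggestion above, here is a minimal sketch in plain Go. It is not the weave code: the type and function names (flowStats, fastdpCounters, observe, flush, total) and the string flow keys are made up for illustration, and locking is left out for brevity. The point is only that the flushed total is updated before the active flows are dropped, so the reported value never goes backwards.

```go
package main

import "fmt"

// flowStats is a hypothetical pair of per-flow counters.
type flowStats struct {
	packets uint64
	bytes   uint64
}

// fastdpCounters keeps the reported totals monotonic across periodic flushes.
// All names here are illustrative assumptions, not identifiers from weave.
type fastdpCounters struct {
	flushedFlowsTotal flowStats            // sum over every flow flushed so far
	activeFlows       map[string]flowStats // latest stats of currently active flows
}

func newFastdpCounters() *fastdpCounters {
	return &fastdpCounters{activeFlows: make(map[string]flowStats)}
}

// observe records the latest cumulative stats of one active flow.
func (c *fastdpCounters) observe(key string, s flowStats) {
	c.activeFlows[key] = s
}

// flush folds the active flows into flushedFlowsTotal first, then drops them.
func (c *fastdpCounters) flush() {
	for key, s := range c.activeFlows {
		c.flushedFlowsTotal.packets += s.packets
		c.flushedFlowsTotal.bytes += s.bytes
		delete(c.activeFlows, key)
	}
}

// total returns flushedFlowsTotal + sum(activeFlows).
func (c *fastdpCounters) total() flowStats {
	t := c.flushedFlowsTotal
	for _, s := range c.activeFlows {
		t.packets += s.packets
		t.bytes += s.bytes
	}
	return t
}

func main() {
	c := newFastdpCounters()
	c.observe("a", flowStats{packets: 10, bytes: 1000})
	c.observe("b", flowStats{packets: 5, bytes: 500})
	fmt.Println(c.total()) // {15 1500}

	c.flush() // the total is preserved although the flows are gone
	c.observe("c", flowStats{packets: 2, bytes: 200})
	fmt.Println(c.total()) // {17 1700}
}
```

A real collector would read total() at scrape time and would need to guard the map against concurrent access.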


juliusv · 2016-11-01T17:48:53Z

Additionally, weave_connection_termination_count should be weave_connection_terminations_total.

brb · 2016-11-01T18:02:18Z

Additionally, weave_connection_termination_count should be weave_connection_terminations_total.

This is addressed by #2568

Improve metric naming

frittentheke · 2018-07-20T06:58:08Z

@brb @bboreham I am unsure whether I am supposed to open a new issue. This one is quite old and seems to be "resolved" mostly?

Anyway: at KubeCon 2018 in Copenhagen we spoke briefly about the automatic cleanup (rmpeer) of peers when running inside Kubernetes on nodes that are part of an AWS ASG. It would be very nice if you could add a metric (a count of the peers removed by this mechanism) so that it can be actively monitored. @bboreham you suggested simply raising an issue to request this feature.

bboreham · 2018-07-23T10:47:13Z

@frittentheke generally it's best to open a new issue; it's easier to deal with accidental duplicates than the other way round.

Since this particular issue is a random set of "stuff" I think that's an even better reason to open a new one.

brb added the feature label Oct 21, 2016

brb added this to the overflow milestone Oct 21, 2016

awh mentioned this issue Oct 24, 2016

Expose metrics #2535

Closed

juliusv mentioned this issue Nov 1, 2016

Cortex onboarding banner weaveworks/service#948

Merged

bboreham mentioned this issue Nov 1, 2016

Improve metric naming #2579

Merged

brb added a commit that referenced this issue Nov 2, 2016

Merge pull request #2579 from /issues/2557-improve-metric-naming

ec5f703

Improve metric naming

bboreham mentioned this issue Nov 29, 2016

Update documentation for Prometheus Endpoint /metrics #2677

Closed

frittentheke mentioned this issue Jul 25, 2018

Provide count / metric for automatically removed peers #3357

Open

hairyhenderson mentioned this issue Mar 18, 2020

weave_flows metric is fastdp-only, no way to see how many sleeve flows there are #3788

Closed
