
prometheus metrics output #291

Closed
anarcat opened this issue Nov 22, 2023 · 3 comments · Fixed by #308


anarcat commented Nov 22, 2023

Hi!

We're (possibly) transitioning our monitoring from Icinga to Prometheus over here, and it would be quite nice to have the equivalent functionality exposed as an OpenMetrics endpoint.

I am not exactly sure what the metrics would be like. It seems to me there could be different metrics for kernel, ucode, and services, possibly with a separation between user and system services. So something like this, maybe:

# HELP needrestart_timestamp information about the running version and when it was last updated
# TYPE needrestart_timestamp gauge
needrestart_timestamp{version="3.6"} 1700675409
# HELP needrestart_kernel_info information about the kernel
# TYPE needrestart_kernel_info info
needrestart_kernel_info{running="6.5.0-1-amd64",expected="6.5.0-1-amd64",status="current"} 1
# HELP needrestart_ucode_info information about the CPU microcode
# TYPE needrestart_ucode_info info
needrestart_ucode_info{running="0x042c",expected="0x042c",status="current"} 1
# HELP needrestart_services_count number of services requiring a restart
# TYPE needrestart_services_count gauge
needrestart_services_count 3

It would probably need gauges for containers and sessions too...
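As a quick sanity check of the exposition format sketched above (purely illustrative, not needrestart code), sample lines like these can be parsed with a few lines of stdlib Python:

```python
# Minimal sketch: parse a single Prometheus/OpenMetrics sample line into
# (name, labels, value). Handles the simple cases shown in this issue;
# it does not cover escaped quotes or commas inside label values.
import re

METRIC_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)'   # metric name
    r'(?:\{(?P<labels>[^}]*)\})?'            # optional label set
    r'\s+(?P<value>[0-9.eE+-]+|NaN)$'        # sample value
)

def parse_sample(line):
    m = METRIC_RE.match(line.strip())
    if not m:
        raise ValueError(f"not a metric sample: {line!r}")
    labels = {}
    if m.group("labels"):
        for pair in m.group("labels").split(","):
            k, v = pair.split("=", 1)
            labels[k] = v.strip('"')
    return m.group("name"), labels, float(m.group("value"))

print(parse_sample('needrestart_timestamp{version="3.6"} 1700675409'))
```

Note that label values must be double-quoted in the exposition format, which is why the samples above are written that way.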

Would people here be open to this idea?


anarcat commented Nov 22, 2023

Note that there's some overlap between this and the node exporter's support for such a thing. This was requested in prometheus/node_exporter#625 but actually implemented in the "collectors" project. It only tracks the reboot-required file, however...
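For context, the reboot-required check mentioned above follows node_exporter's textfile-collector pattern: a script writes a gauge to a `.prom` file that node_exporter picks up. A minimal sketch (flag path and output directory are assumptions, not taken from the linked collector):

```shell
# Write a 0/1 gauge depending on whether the reboot-required flag file exists.
# REBOOT_FLAG and TEXTFILE_DIR are hypothetical overrides for illustration.
FLAG="${REBOOT_FLAG:-/var/run/reboot-required}"
OUT_DIR="${TEXTFILE_DIR:-$(mktemp -d)}"

if [ -f "$FLAG" ]; then v=1; else v=0; fi
printf 'node_reboot_required %d\n' "$v" > "$OUT_DIR/reboot.prom"
cat "$OUT_DIR/reboot.prom"
```

node_exporter would then be started with its textfile collector pointed at the same directory so the metric appears on the regular `/metrics` endpoint.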


liske commented Mar 3, 2024

I'm open to the changes required for a metrics endpoint. 👍

@tewfik-ghariani

I just came across this project that might be a good solution: https://git.fsmpi.rwth-aachen.de/thomas/needrestart2prom/-/tree/main

lelutin pushed a commit to lelutin/needrestart that referenced this issue Jul 31, 2024
This new `-o` option will make needrestart output information in a
format that can be scraped by Prometheus or any other daemon that
ingests OpenMetrics format.

The -l, -w and -k options can be used in combination with -o in order to
choose what information gets exported.

Note that the combination of options -ol needs root access in order to
correctly determine which services use outdated libraries.

The kernel and microcode statuses are output as StateSet-type metrics, since each has more than one possible state. This lets users track the state with more granularity and, for example, decide to ignore the "unknown" microcode state or the "version_upgrade" (e.g. non-ABI-compatible upgrade) kernel state.
For kernel and microcode, there is also one Info-type metric each that reports the currently running version vs. the expected newer version.

(Closes: liske#291)
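Per the OpenMetrics spec, a StateSet encodes each state as a boolean sample whose label name equals the metric name, with exactly one state set to 1. A hedged illustration of what the kernel StateSet described in the commit message could look like (the metric and state names here are assumptions, not copied from the merged change):

```
# TYPE needrestart_kernel_status stateset
needrestart_kernel_status{needrestart_kernel_status="current"} 1
needrestart_kernel_status{needrestart_kernel_status="version_upgrade"} 0
needrestart_kernel_status{needrestart_kernel_status="unknown"} 0
```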
@liske liske closed this as completed in #308 Aug 9, 2024
@liske liske closed this as completed in b372b17 Aug 9, 2024