Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add systemd uptime metric collection #952

Merged
merged 2 commits into from
Jul 18, 2018

Conversation

langesven
Copy link
Contributor

The no longer available systemd_exporter provided uptime metrics for systemd services, as such I figured I'd propose this as an extension to node_epxorter.
This is somewhat relevant to #562 in that it exposes how long a service has been running for, not quite the timestamp of its start.
Exporting the duration makes it easier to define alerts on this metric in my opinion and would retain the functionality as provided by the previous project.

@langesven langesven force-pushed the systemd-uptime-metrics branch from 41d45c1 to 8ae6035 Compare May 24, 2018 11:33
@discordianfish
Copy link
Member

@langesven Thanks for the PR!
Though I think we should expose the start timestamp since it's much more efficient for prometheus to store. To alert on the duration, you can alert on time() - timestamp_metric.
See also this older, outdated PR with similar functionality: #709

@langesven langesven force-pushed the systemd-uptime-metrics branch from 8ae6035 to 1c2a5a7 Compare June 4, 2018 10:38
@langesven
Copy link
Contributor Author

Hi @discordianfish - fair enough, I changed it to a CounterValue that is either 0 or the unix timestamp of when the job was started. Or would a Gauge be better suited but still keep track of just the timestamp?

@arianvp
Copy link

arianvp commented Jun 25, 2018

Note that there is also this PR #709

@discordianfish
Copy link
Member

@langesven After some back and forth we decided to use gauges for timestamps in general, so yeah please change this to a gauge.

Copy link
Member

@discordianfish discordianfish left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good in general, but we should also have some tests if possible.

@@ -55,6 +56,10 @@ func NewSystemdCollector() (Collector, error) {
prometheus.BuildFQName(namespace, subsystem, "unit_state"),
"Systemd unit", []string{"name", "state"}, nil,
)
unitStartedAtDesc := prometheus.NewDesc(
prometheus.BuildFQName(namespace, subsystem, "unit_started_at_seconds"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's call this unit_start_time_seconds to be consistent with already existing process_start_time_seconds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@discordianfish That won't work, because process_start_time_seconds metric family has no labels, so the collector will fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think for consistency, we could name it unit_start_time_seconds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@SuperQ That's what I suggested! Should have put it in ```` though :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry! I didn't read what you said closely enough. 🦆

@discordianfish
Copy link
Member

@langesven Can you change the metric name? Beside this LGTM
/cc @SuperQ

@langesven langesven force-pushed the systemd-uptime-metrics branch from 1c2a5a7 to 867690e Compare July 5, 2018 13:03
@langesven
Copy link
Contributor Author

Thanks for your feedback.
I changed the metric name to unit_start_time_seconds and made it a GaugeValue instead of a counter!

Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SuperQ
Copy link
Member

SuperQ commented Jul 5, 2018

Oh, would you mind adding a [FEATURE] entry to the changelog?

@langesven langesven force-pushed the systemd-uptime-metrics branch from 77fdded to 388b97b Compare July 16, 2018 12:42
@langesven
Copy link
Contributor Author

Added a feature changelog entry and rebased against master

@discordianfish
Copy link
Member

@langesven Looks like this still (again?) needs rebasing

Signed-off-by: Sven Lange <tdl@hadiko.de>
Signed-off-by: Sven Lange <tdl@hadiko.de>
@langesven langesven force-pushed the systemd-uptime-metrics branch from 388b97b to 7f505fb Compare July 17, 2018 11:27
@langesven
Copy link
Contributor Author

@discordianfish it was indeed again 😄 rebased

@SuperQ SuperQ merged commit 2ae8c1c into prometheus:master Jul 18, 2018
@pgier pgier mentioned this pull request Feb 6, 2019
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
* Add systemd uptime metric collection

Signed-off-by: Sven Lange <tdl@hadiko.de>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants