
Export systemd's own metrics #709

Closed
wants to merge 1 commit

Conversation


@arianvp arianvp commented Oct 21, 2017

Systemd does all kinds of accounting when IOAccounting, CPUAccounting
and IPAccounting are enabled. This commit exposes some of these.
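For anyone trying this locally: these properties only carry data once accounting is switched on for the unit (or globally via DefaultCPUAccounting= and friends in system.conf). A minimal drop-in might look like this (unit name hypothetical):

```ini
# /etc/systemd/system/myservice.service.d/10-accounting.conf
# Hypothetical drop-in enabling the accounting that backs these metrics.
[Service]
CPUAccounting=yes
IOAccounting=yes
IPAccounting=yes
MemoryAccounting=yes
```

After `systemctl daemon-reload` and a restart of the unit, properties such as CPUUsageNSec and IPIngressPackets start being populated.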

Also, systemd keeps track of how many times services restart, when
their state changes happen, and a lot more. These metrics are very
useful for detecting problems with running applications.

Still need to write tests, and add some missing metrics so not finished yet.

But I welcome any feedback!


arianvp commented Oct 21, 2017

I'll also fix #562 while I'm at it.


arianvp commented Oct 21, 2017

While fixing #562, I discovered systemd exports the required metrics in two flavors: wall-clock time and monotonic time. I guess monotonic is preferred, because then we can export it as a counter, right?

```go
},
"NRestarts": metric{
	desc: prometheus.NewDesc(
		prometheus.BuildFQName(namespace, subsystem, "nrestarts"),
```

Probably n_restarts?


Counters should always end in _total, so this would be restarts_total
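For illustration: prometheus.BuildFQName simply joins the non-empty parts with underscores, so the suggested rename would come out as node_systemd_restarts_total. A stdlib-only sketch of that behavior (the helper below is hypothetical, not the real client_golang code):

```go
package main

import (
	"fmt"
	"strings"
)

// buildFQName mimics prometheus.BuildFQName: join the non-empty
// name parts with underscores. Sketch for illustration only.
func buildFQName(namespace, subsystem, name string) string {
	parts := []string{}
	for _, p := range []string{namespace, subsystem, name} {
		if p != "" {
			parts = append(parts, p)
		}
	}
	return strings.Join(parts, "_")
}

func main() {
	// Counters should end in _total, per the review comment above.
	fmt.Println(buildFQName("node", "systemd", "restarts_total"))
	// node_systemd_restarts_total
}
```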

```go
func (c *systemdCollector) collectUnitProperiesMetrics(ch chan<- prometheus.Metric) error {
	conn, err := c.newDbus()
	if err != nil {
		return fmt.Errorf("couldn't get dbus connection: %s", err)
```

@flokli flokli Oct 21, 2017


This should probably start with an uppercase letter.

```go
desc: prometheus.NewDesc(
	prometheus.BuildFQName(namespace, subsystem, "ip_ingress_packets_total"),
	"Ingress packets total",
	[]string{"name"},
```


name is too generic; if this is the name of the unit, I would suggest unit or unit_name.


SuperQ commented Oct 21, 2017

This seems like an interesting feature, but possibly overlaps too much with cAdvisor? We've been avoiding having the node_exporter duplicate cAdvisor's features.


arianvp commented Oct 21, 2017

I'm not very familiar with cAdvisor. Does it just export the cgroup metrics?

Systemd's metrics are oriented around a specific unit file, and they are not limited to cgroups per se. For example, for a .socket unit the number of packets is accounted for, as is the number of times a unit has restarted.

Also, it's orthogonal to containers: systemd unit files don't have much to do with containers at all, I think, so I find it hard to see why this would conflict with cAdvisor.

Also, metrics that have nothing to do with cgroups whatsoever are exported as well, like the timestamp of the latest state change (as requested in #562) and how many times a service restarted.

The systemd collector is not enabled by default, so I wouldn't think it's a problem if it overlaps with the cAdvisor metrics too much. I think there are many people who don't run containers in production but do use the systemd init system, and would like to have metrics about their unit files exported.

```go
),
valueType: prometheus.CounterValue,
},
"AssertTimestampMonotonic": &metric{
```


This and the following metrics still need to be renamed.


Oh yeah, thanks.


SuperQ commented Oct 22, 2017

cAdvisor does export metrics from cgroups, as this is how most of the container systems control and account for process groups. Technically, this means that systemd in this case is a type of container system.

I don't think we want to reproduce the cgroup data here, but having systemd-specific metrics like start/stop/restart counts would be good.


flokli commented Oct 22, 2017

IMHO, the memory and ingress/egress metrics are also interesting in cases where systemd is not used as a container system, since you just want to monitor resource usage of different processes.

Given that those resource metrics are exposed by systemd together with other service-specific metrics and don't introduce any new dependencies, I don't really see an issue in exporting them, without having to install a second container-specific collector ;-)


SuperQ commented Oct 22, 2017

The difficulty is that cgroups ~= containers, and cAdvisor provides these metrics. I don't really like it either, but our policy is not to duplicate separate functionality. Really, from a Prometheus perspective we would want a stand-alone systemd exporter, as it's not exactly a "node"-specific thing, but it is kind of tied 1:1 per node due to the design. This is a fine line, but only due to policy, not for technical reasons.


arianvp commented Oct 22, 2017

The point SuperQ is making is this: because cAdvisor exports just the cgroup hierarchy, and every systemd service has its own cgroup, systemd services will actually just show up in it, as will the systemd slices. So you could use the cAdvisor collector perfectly well to collect systemd service metrics. Actually, the systemd metrics are just a shim over the values that the cgroup exports.

So perhaps it's fair to say cAdvisor is the better tool for such metrics.

Though systemd does support some extra features, like keeping track of network traffic for a service, which cAdvisor does not currently do.

Also, it is really nice that we can already get these metrics with just node_exporter, as we have access to them through dbus anyway - we just discard them currently. It's one runtime dependency less for people, and they get some accounting of resources out of the box. So perhaps that's good enough reason to include systemd metrics.

But I'll leave that up to you to decide. I can always adjust the PR to just have the changes needed for #562 and the number-of-restarts stuff, and leave out the resource metrics.


arianvp commented Oct 22, 2017

Edit: I typed the above message in parallel with SuperQ's response. We seem to have come to similar conclusions.

Shall I adjust this PR to just keep track of the restarts and state changes?

And then send the rest of the patch to the systemd_exporter project instead?
However, that would mean some overlap between systemd_exporter and node_exporter, which again raises the same philosophical issue.


flokli commented Oct 23, 2017

@arianvp I think @SuperQ suggested adding restarts and state changes to node_exporter (this project), and using cAdvisor's Prometheus endpoint to export cgroup-related resource metrics to Prometheus.

I have not used cAdvisor yet; at a quick glance it seems to be much more than just a Prometheus exporter, so I'd be tempted to pull this PR in just to get resource utilization metrics for systemd units.

So moving the systemd restart and state-change stuff into node_exporter, and keeping some lightweight cgroup/systemd resource monitoring in a separate project (although systemd_exporter might be misleading as a name), would be what I'd use, in case it's not possible to merge everything into node_exporter.


SuperQ commented Oct 23, 2017

There has been some talk about building a new cgroups/container metrics exporter that is more lightweight and specific than cAdvisor. Going through systemd/dbus just to get cgroup metrics seems like the wrong idea, and a lot of people seem to have problems with cAdvisor.


flokli commented Oct 23, 2017

So adding just the restart and state changes metrics here seems to be the right way to go for this PR.

@SuperQ Would a collectors/cgroups_linux.go be something that could be integrated into the node_exporter project, or does this also have to be a separate cgroup_exporter due to the above policy?


SuperQ commented Oct 23, 2017

I think a separate exporter for cgroups would probably be better at this time. I was thinking we may want a cgroup library, similar to how we have procfs. We have a lot of scope creep issues with the node_exporter as is.

@discordianfish
Member

@arianvp Want to remove the cgroup stats from this PR and provide only the systemd-specific metrics like start/stop/restart counts, as @SuperQ suggested?


arianvp commented Jan 11, 2018

Yes I will see if I can make some time this weekend to clean up the PR :)


SuperQ commented Jan 31, 2018

Ping, if we can get this cleaned up soon, we can include it in the next release.


arianvp commented Feb 1, 2018

I'll be at FOSDEM this weekend, so I have two days to spend on FOSS :) I'll get this PR ready for release then!


SuperQ commented Feb 1, 2018

Yes, a bunch of us are also at FOSDEM. Looking forward to nice graphs of the conference network.


arianvp commented Feb 2, 2018

Do you have a preferred communication channel for Prometheus? Perhaps we can meet up.


SuperQ commented Feb 2, 2018

We have an IRC/Matrix channel: https://prometheus.io/community/


arianvp commented Feb 3, 2018

Last time I touched this code, the dbus interface was giving uint64 for all these metrics, and now it's suddenly uint32, breaking the code. This took quite some time to debug today :) I'm not sure why dbus is suddenly emitting other types... I don't know very much about it. I can see if I can get it working on 236 again, but I can't guarantee it works for older versions. If the D-Bus interface is unstable like this and differs between systemd versions, I don't feel comfortable maintaining these patches.
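One defensive way to cope with integer properties arriving at different widths, whatever the bus ends up delivering, is a type switch. A sketch with a hypothetical helper, not what this PR actually does:

```go
package main

import "fmt"

// toUint64 normalizes a D-Bus property value that might arrive at
// different integer widths. Defensive sketch, not code from the PR.
func toUint64(v interface{}) (uint64, bool) {
	switch x := v.(type) {
	case uint64:
		return x, true
	case uint32:
		return uint64(x), true
	case int64:
		if x >= 0 {
			return uint64(x), true
		}
	}
	return 0, false
}

func main() {
	v, ok := toUint64(uint32(7))
	fmt.Println(v, ok) // 7 true
}
```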


flokli commented Feb 4, 2018

At least according to src/core/dbus-unit.c (I didn't try it on my own bus), numeric values should have a signature of t, which translates to SD_BUS_TYPE_UINT64, and it looks very much like this hasn't changed recently.


arianvp commented Feb 4, 2018 via email


arianvp commented Feb 5, 2018

It must have been an overdose of Club Mate; I cannot reproduce it anymore, haha. I'll clean up the PR today.

First of all, state changes of individual systemd units are now exposed.

Things like: when was the unit started, when was it stopped, how many
times did it restart before crashing, and how long until a timer activates.

Second, behind an optional flag --collector.systemd.accounting,
additional metrics are defined which expose systemd's view of the cgroup
statistics of a unit. These metrics are also exposed by cAdvisor, so to
avoid duplication they are explicitly opt-in.

arianvp commented Feb 5, 2018

I have updated the PR! I have for now hidden the metrics that conflict with cAdvisor behind a flag collector.systemd.accounting. We can also decide to fully remove them from this PR. I'm fine with that too.

However, I'm running into a merge conflict. Apparently a feature landed in master which partially implements, in a slightly different way, what I implemented here.

@tomwilkie seems to have implemented exporting some timer metrics, but this PR will expose those as well (under systemd's naming conventions): a7fd6b8

We should decide whether we want to revert @tomwilkie's commits in favor of mine.

Also, @tomwilkie did some refactoring to keep only one dbus connection open. That is useful for me too and shouldn't be too hard to adapt to, but first we'll have to decide whether to revert that commit before I can apply the patchset and start refactoring for the single-connection change.


arianvp commented Feb 5, 2018

What's left is:

  • Decide whether to keep resource accounting behind a flag, or throw away that part of the code
  • Add textual descriptions for each metric
  • Normalise units so that we don't use "USec", "MSec", etc.
  • Adapt to Prometheus naming conventions
  • Decide what are Gauges and what are Counters. I simply made everything a Counter for now, but that's probably wrong
  • Decide whether we want the wall-clock or the monotonic-clock versions of metrics like timer_last_triggered. I currently export both, though @tomwilkie's changes seem to export only the wall-clock variant of the last_triggered metric

Hope to get some feedback

@tomwilkie
Member

Sorry @arianvp! I didn't see this PR. Let me know if you need a hand.

I'll be at FOSDEM this weekend

So were a bunch of us - sorry I didn't get to meet you.

under systemd's naming conventions

This means this PR exports usecs instead of seconds - is this desirable? What does the rest of node_exporter do?


arianvp commented Feb 19, 2018

@tomwilkie could you have a look at the above questions I posted? I think we had a race condition because we seem to both have commented at the same time. You might have missed it.


arianvp commented Apr 4, 2018

bump @tomwilkie

@discordianfish
Member

Regarding the time units: it should use seconds; using only base units is generally best practice in Prometheus.
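Concretely, systemd reports these values in microseconds (the *USec properties), so emitting base seconds is a one-line conversion (helper name hypothetical):

```go
package main

import "fmt"

// usecToSeconds converts systemd's microsecond values to the base
// seconds Prometheus conventions expect. Hypothetical helper.
func usecToSeconds(usec uint64) float64 {
	return float64(usec) / 1e6
}

func main() {
	fmt.Println(usecToSeconds(1500000)) // 1.5
}
```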

@discordianfish
Member

Going to close this for now. Feel welcome to re-open once you've updated it.


arianvp commented Sep 3, 2024

I think a separate exporter for cgroups would probably be better at this time. I was thinking we may want a cgroup library, similar to how we have procfs. We have a lot of scope creep issues with the node_exporter as is.

I bit the bullet and started work on this: https://github.com/arianvp/cgroup-exporter
