Skip to content
This repository has been archived by the owner on Nov 15, 2023. It is now read-only.

service: don't use sysinfo on Linux for metric collection #6805

Closed
wants to merge 3 commits into from

Conversation

andresilva
Copy link
Contributor

@andresilva andresilva commented Aug 3, 2020

This PR updates our metrics code to not use the sysinfo crate on Linux and instead get all of the data through procfs.

Additionally it fixes an issue on platform that still use sysinfo: We were tracking metrics of all processes running in the node through sysinfo which in turn will cache file descriptors of all accesses to /proc/<pid>/stat. After this PR we will only collect metrics for the process of the node itself. Unfortunately sysinfo will still keep track of all threads spawned by the node and will keep a file descriptor for each /proc/<pid>/task/<tid>/stat and there is no way to avoid this currently. These will be bound by the number of threads we spawn though and they should not leak (i.e. they'll be collected after a given threads exits). Below you can see a plot on the number of file handles for a Kusama node before/after this change.

Screenshot_2020-08-03_18-33-25

@andresilva andresilva added A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders. labels Aug 3, 2020
@andresilva andresilva requested a review from bkchr August 3, 2020 17:39
@andresilva andresilva mentioned this pull request Aug 3, 2020
@andresilva
Copy link
Contributor Author

andresilva commented Aug 3, 2020

I might be able to drop the dependency on sysinfo in this PR for Linux and just do everything through procfs instead.

edit: I'll do that.

@andresilva andresilva added the A3-in_progress Pull request is in progress. No review needed at this stage. label Aug 3, 2020
@andresilva andresilva changed the title service: don't track metrics of all processes in system service: don't use sysinfo on Linux for metric collection Aug 4, 2020
@andresilva andresilva removed the A3-in_progress Pull request is in progress. No review needed at this stage. label Aug 4, 2020
@andresilva
Copy link
Contributor Author

Screenshot_2020-08-04_16-36-28

Results on Linux without sysinfo.

[nix-shell:~/Projects/parity/polkadot]$ lsof -p $(pidof polkadot) | wc -l
236
[nix-shell:~/Projects/parity/polkadot]$ lsof -p $(pidof polkadot) | grep "/proc" | wc -l
0

@andresilva
Copy link
Contributor Author

Closing this PR and will instead remove all system metric reporting. These metrics can easily be collected by an external process https://github.com/prometheus/node_exporter.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
A0-please_review Pull request needs code review. B0-silent Changes should not be mentioned in any release notes C1-low PR touches the given topic and has a low impact on builders.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant