procstat_lookup always shows 1 when monitoring systemd_unit #5300
Comments
There are cases where |
A different way to check if the service is running could be by checking the |
That's a great suggestion. It seems like |
Since 0 is not a valid PID, we should just check for it and skip it when reading the |
Agreed, though I feel we should drop the edit: |
@danielnelson I thought that was what it was meant to do as per #4237 |
This plugin doesn't report whether systemd thinks a service is running; it searches for PIDs and then reads the process information for them. This field should contain the number of PIDs that have been found. If MainPID=0, it hasn't found any valid PIDs, so the count should be zero. This is why if you are using the pidfile method and the file exists containing a PID, the
It may be that we are not finding the PIDs correctly for systemd, which would be another issue, but I don't think we need to special-case systemd in any way.
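To illustrate the point that the count should reflect only valid PIDs, here is a minimal sketch in Go. The helper name `validPIDs` and the `[]int32` PID type are assumptions made for illustration; this is not the plugin's actual code.

```go
package main

import "fmt"

// validPIDs is a hypothetical helper: it keeps only entries that can refer
// to a real process. PID 0 (what systemd reports for a stopped unit) and
// negative values are dropped.
func validPIDs(pids []int32) []int32 {
	valid := make([]int32, 0, len(pids))
	for _, pid := range pids {
		if pid > 0 {
			valid = append(valid, pid)
		}
	}
	return valid
}

func main() {
	// A stopped unit yields a single entry with PID 0.
	found := []int32{0}
	fmt.Println("pid_count:", len(validPIDs(found))) // prints "pid_count: 0"
}
```

Filtering at the PID level, rather than asking systemd for the unit state, keeps the lookup method generic: the same rule would apply to any finder that could return a zero PID.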
Attempt at fixing influxdata#5300. I wasn't sure whether using an OR condition or writing a separate if statement was better practice; since both conditions check for valid PIDs, I used an OR condition. Also, I tested this with Go but know almost nothing about the language, so there might be better ways to write this.
Closed in #5972 |
System info:
Telegraf 1.9.1 (git: HEAD 2063609)
CentOS 7
InfluxDB shell version: 1.7.2
Steps to reproduce:
sudo systemctl stop puppet.service
SELECT last("pid_count") FROM "autogen"."procstat_lookup" WHERE ("systemd_unit" = 'puppet.service') AND time > now()-2m GROUP BY "myTag"
Expected behavior:
Actual behavior:
Additional info:
I only learned a bit of Go to read the source code, but from what I understood, the problem is that here len(pids) is not enough: running systemctl show puppet.service when the service is not running gives MainPID=0 (see also #3612), and systemdUnitPIDs() does not check for zero values.
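The behavior described above can be sketched as follows. This is a standalone illustration, not the body of systemdUnitPIDs(): the function name `parseMainPID` is hypothetical, and it assumes input in the `MainPID=<n>` form that `systemctl show --property MainPID <unit>` prints.

```go
package main

import (
	"bytes"
	"fmt"
	"strconv"
)

// parseMainPID is a hypothetical helper that extracts PIDs from
// `systemctl show` output, treating an empty value or "0" as
// "no main process" instead of as a PID.
func parseMainPID(out []byte) ([]int32, error) {
	var pids []int32
	for _, line := range bytes.Split(out, []byte{'\n'}) {
		kv := bytes.SplitN(bytes.TrimSpace(line), []byte{'='}, 2)
		if len(kv) != 2 || !bytes.Equal(kv[0], []byte("MainPID")) {
			continue
		}
		// Both conditions mean there is no valid PID, so they share one check.
		if len(kv[1]) == 0 || bytes.Equal(kv[1], []byte("0")) {
			continue
		}
		pid, err := strconv.ParseInt(string(kv[1]), 10, 32)
		if err != nil {
			return nil, fmt.Errorf("invalid MainPID %q: %w", kv[1], err)
		}
		pids = append(pids, int32(pid))
	}
	return pids, nil
}

func main() {
	stopped, _ := parseMainPID([]byte("MainPID=0\n"))
	running, _ := parseMainPID([]byte("MainPID=1234\n"))
	fmt.Println(len(stopped), len(running)) // prints "0 1"
}
```

With the zero check in place, the length of the returned slice is 0 for a stopped unit, so a count derived from it would no longer read 1.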
Relevant telegraf.d/puppet.conf:
[[inputs.procstat]]
systemd_unit = "puppet.service"