Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Win32_PerfRawData_PerfProc_Process collector #79

Merged
merged 3 commits into from
Jul 16, 2017

Conversation

andrewmostello
Copy link
Contributor

Adds a collector to scrape per process data metrics from WMI. Testing found that this collector increases the CPU usage of the wmi_exporter (as could be expected by the amount of data it generates), so it is not included in the default collector list. It's useful to have this available for chasing down problematic processes, and it probably makes sense to enable this only when doing so.

Collector to scrape per process data metrics from WMI. Collector not
enabled by default.
@carlpett
Copy link
Collaborator

Thanks for the PR!
It might be a good idea to add a flag for whitelisting process names, which would reduce the pressure on both the WMI host and Prometheus. It would make it possible to have it all the time, rather than just for debugging.
That would more or less require the process id:s being left out, though, otherwise there would be quite a lot of labels.

So, some open questions, but very nice contribution, thank you again!

(Note: This PR fixes #52)

@andrewmostello
Copy link
Contributor Author

Thanks!
I'll take a look into adding in a whitelisting option. I saw that this is set up in logical_disk, so I'll try to follow that as a pattern.

@carlpett
Copy link
Collaborator

You might actually want to do it slightly differently here, and make it part of the WMI query instead (second argument of wmi.CreateQuery). That way the amount of data the WMI provider has to return (and process) is significantly reduced as well.

@andrewmostello
Copy link
Contributor Author

Thanks - that sounds good. I'll give that a shot.

This flag can be used to limit the Win32_PerfRawData_PerfProc_Process
response and pull metrics for only the desired processes.
@andrewmostello
Copy link
Contributor Author

In thinking this over and trying a few things, it seemed like the best thing to do would be to let a user directly set the where clause of the WMI query through an optional flag. That should let anyone get exactly what they're looking for without any limits on what WQL allows.

@bbigras
Copy link
Contributor

bbigras commented Jun 26, 2017

Any updates on this? Seems like a good feature.

@carlpett
Copy link
Collaborator

@bbigras Yes, I just saw this while creating my pull request. I thought we had merged this!
@andrewmostello - Really sorry about forgetting this! I'll do some testing right away!

@carlpett
Copy link
Collaborator

Alright, done a bit of testing, and for the most part it looks very good! There are two things that I think needs to be discussed:

  1. Process ids as labels - this is something that would over time cause a lot of labels to be created in Prometheus. I would prefer if those labels are dropped. Would that limit the usability for your case?
  2. Multiple processes with the same name. If there are many processes with the same name, WMI will helpfully append a # and an index number to all except the first one (somewhat confusingly, this includes in the filters, so filtering for svchost for instance, the correct filter is name LIKE 'svchost%', rather than name = 'svchost'). This should probably be removed, especially given that they are not constant, and one process could have many different indices over its lifetime.

@bbigras
Copy link
Contributor

bbigras commented Jun 26, 2017

I'm not the one who asked first for this feature but in my case I would like to use this to monitor multiples processes with the same name but with different paths. It's for game servers. The processes may not be launched in the same order every time and some will auto restart if the server crash.

@carlpett
Copy link
Collaborator

@bbigras That is another interesting, and challenging, case. I guess the pids themselves do not give you much insight into which server is which (at least without some cross referencing with logs)? I'm not sure how to best solve that without making the collector quite complex (controlling labels with flags, or similar), I'm afraid. Ideas?

@bbigras
Copy link
Contributor

bbigras commented Jul 3, 2017

Ideas?

Not sure if it's a good one, but in my case a config file with the full path of the executable and a name for a label's value would be great.

That may not be ideal for someone who is running the same executable with the same path multiple times.

Another idea would be to set an environment variable per process to get the name for the label's value. This could be great even if in some case it can be a pita to run a process with an env variable.

@carlpett
Copy link
Collaborator

I have fixed the process name part now. The process id:s remain as-is, though. Cadvisor and friends do similar things (docker id as label), and I cannot figure out any good way to disambiguate between processes that does not have the same type of problem.

@carlpett carlpett merged commit 00f57c1 into prometheus-community:master Jul 16, 2017
@andrewmostello
Copy link
Contributor Author

Apologies for missing the questions about process id and name earlier. Thanks for fixing the process name issue, and I think you're right about there not being a great way around a process id label. I originally wrote this to identify a runaway process on some of our servers, and I used that label to identify processes and their parents. Thanks for merging!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants