
get full process path via procstat as TAG #1873

Closed
mhiller opened this issue Oct 10, 2016 · 9 comments · Fixed by #5681
Labels: area/procstat, feature request (Requests for new plugin and for new features to existing plugins)

Comments

@mhiller

mhiller commented Oct 10, 2016

System info:

[Include Telegraf version, operating system name, and other relevant details]
telegraf-1.0.0-1.x86_64
RHEL6

Feature Request

Include command_line as a field collected from procstat.

Proposal:

Get the full command line from /proc/*/cmdline and return the contents of this file as a tag.
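
As a rough illustration of the proposal, here is a minimal Go sketch that reads a cmdline file and splits it into arguments. It assumes Linux, where the arguments in /proc/<pid>/cmdline are separated by NUL bytes. The helper name parseCmdline is hypothetical and is not part of Telegraf or the procstat plugin.

```go
package main

import (
	"bytes"
	"fmt"
	"os"
	"strings"
)

// parseCmdline (hypothetical helper) splits the raw contents of
// /proc/<pid>/cmdline into arguments. Arguments are NUL-separated and
// the file usually ends with a trailing NUL.
func parseCmdline(raw []byte) []string {
	raw = bytes.TrimRight(raw, "\x00")
	if len(raw) == 0 {
		// Kernel threads and some zombies expose an empty cmdline.
		return nil
	}
	parts := bytes.Split(raw, []byte{0})
	args := make([]string, len(parts))
	for i, p := range parts {
		args[i] = string(p)
	}
	return args
}

func main() {
	// Linux only: read this process's own command line as a demonstration.
	raw, err := os.ReadFile("/proc/self/cmdline")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot read cmdline:", err)
		return
	}
	fmt.Println(strings.Join(parseCmdline(raw), " "))
}
```

Joining the arguments with spaces gives a readable tag value, at the cost of losing the distinction between a space inside an argument and a separator between arguments.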

Current behavior:

Returns only the short process name and the pid as the identifier of the process.

Desired behavior:

List the full process path as a tag so that you can identify which process is which.

Use case: [Why is this important (helps with prioritizing requests)]

For example, if you run a Java application, procstat currently captures "java" as the process name. This is fine if you run one of something, but it is very common in enterprise architecture to run multiples of something with different config files/arguments. Capturing only "java" makes it impossible (or very hard) to differentiate between them for resource usage.

@zbindenren
Contributor

👍

I think the current procstat plugin has several more issues:

  1. The above
  2. The pid is added as a field, but a tag would be a better solution (for grouping)
  3. pgrep allows the following: pgrep -u user1,user2,user3, but the tag is then user1,user2,user3 and not the process's username
  4. If a process exits, telegraf logs errors about missing processes

@akrus

akrus commented Aug 11, 2017

@zbindenren, actually point 2 can be changed by using pid_tag=true. Regarding point 1, this should be quite easy? I'm not really familiar with Go or the library used, but Cmdline should be the thing we need:
https://github.com/shirou/gopsutil/blob/master/process/process_linux.go

@mattwilmott

+1 for this as well. Our use case is monitoring the qemu processes of OpenStack's KVM hypervisors. In our case the instance names are the filenames in /var/run/libvirt/qemu/*.pid, but because you cannot glob the pid_name directive in the telegraf procstat conf, it is useless. In lieu of that, we currently use pattern='qemu-system', but as mentioned the only identifier is then the pid, which is useless!
Would love to see the entire process name captured so that a more meaningful means of differentiation is possible!

@akrus akrus mentioned this issue Aug 17, 2017
@danielnelson
Contributor

I'm not convinced that adding the cmdline is the right decision.

Often the full command line is very long; other times there are dynamic values in the cmdline. Here is an example of both that I happen to have running on my laptop right now:

/usr/bin/qemu-system-x86_64-nameguest=debian-stretch-tomcat,debug-threads=on-S-objectsecret,id=masterKey0,format=raw,file=/var/lib/libvirt/qemu/domain-1-debian-stretch-tomca/master-key.aes-machinepc-i440fx-2.9,accel=kvm,usb=off,dump-guest-core=off-cpuSkylake-Client,ss=on,hypervisor=on,tsc_adjust=on,clflushopt=on,xsaves=on,pdpe1gb=on-m2048-realtimemlock=off-smp2,sockets=2,cores=1,threads=1-uuid10b23bf9-c255-4f21-b24c-b83f0bf5bf71-no-user-config-nodefaults-chardevsocket,id=charmonitor,path=/var/lib/libvirt/qemu/domain-1-debian-stretch-tomca/monitor.sock,server,nowait-monchardev=charmonitor,id=monitor,mode=control-rtcbase=utc,driftfix=slew-globalkvm-pit.lost_tick_policy=delay-no-hpet-no-shutdown-globalPIIX4_PM.disable_s3=1-globalPIIX4_PM.disable_s4=1-bootstrict=on-deviceich9-usb-ehci1,id=usb,bus=pci.0,addr=0x4.0x7-deviceich9-usb-uhci1,masterbus=usb.0,firstport=0,bus=pci.0,multifunction=on,addr=0x4-deviceich9-usb-uhci2,masterbus=usb.0,firstport=2,bus=pci.0,addr=0x4.0x1-deviceich9-usb-uhci3,masterbus=usb.0,firstport=4,bus=pci.0,addr=0x4.0x2-drivefile=/srv/kvm/debian-stretch-tomcat.qcow2,format=qcow2,if=none,id=drive-virtio-disk0-devicevirtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0,bootindex=1-driveif=none,id=drive-ide0-0-0,readonly=on-deviceide-cd,bus=ide.0,unit=0,drive=drive-ide0-0-0,id=ide0-0-0-netdevtap,fd=20,id=hostnet0-devicevirtio-net-pci,host_mtu=1500,netdev=hostnet0,id=net0,mac=52:54:00:42:62:12,bus=pci.0,addr=0x3-chardevpty,id=charserial0-deviceisa-serial,chardev=charserial0,id=serial0-deviceusb-tablet,id=input0,bus=usb.0,port=1-vnc127.0.0.1:0-devicecirrus-vga,id=video0,bus=pci.0,addr=0x2-devicevirtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x6-msgtimestamp=on

@mattwilmott

Agreed, it could be an issue, but I haven't got a completely suitable solution at the moment.

One idea would be to capture a regex of the full cmdline, i.e. for the above, full_cmdline_regex = ".*-nameguest=(.*),.*", which would essentially capture 'debian-stretch-tomcat'. If multiple capture groups are defined, concatenate them using a comma. This and the original command /usr/bin/qemu-system-x86_64 would be sufficient context, at least in my use case. I assume individuals running java commands, for instance, could similarly grep out the jar file, instance id, ports, etc.

Will have a think about it further though...

@danielnelson
Contributor

Since 1.7, it would be possible to use the regex processor to tidy up cmdline values. At some point I want to allow processors to be directly attached to inputs as well, but that's another idea.

What if we added an option like this to allow limiting the number of args added to the tag:

## Maximum number of cmdline arguments to add as a tag value.
# cmdline_args = 0

If the default is 0, it will be opt-in and no one will be slammed with long tag values.
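
A minimal Go sketch of how such a limit might behave follows. The function name cmdlineTag is hypothetical, and it assumes that the executable itself counts as the first argument, which the proposal does not specify.

```go
package main

import (
	"fmt"
	"strings"
)

// cmdlineTag (hypothetical helper) builds a tag value from at most
// maxArgs leading arguments. A maxArgs of 0 or less disables the tag,
// mirroring the proposed opt-in default of 0.
// Assumption: args[0] (the executable) counts toward the limit.
func cmdlineTag(args []string, maxArgs int) (tag string, ok bool) {
	if maxArgs <= 0 || len(args) == 0 {
		return "", false
	}
	if maxArgs > len(args) {
		maxArgs = len(args)
	}
	return strings.Join(args[:maxArgs], " "), true
}

func main() {
	args := []string{"java", "-jar", "app.jar", "--port", "8080"}
	if tag, ok := cmdlineTag(args, 3); ok {
		fmt.Println(tag)
	}
}
```

A fixed prefix keeps tag values short, but it only helps when the distinguishing argument appears near the start of the command line.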

@danielnelson danielnelson added the feature request Requests for new plugin and for new features to existing plugins label Aug 3, 2018
@spali

spali commented Nov 19, 2018

What if we added an option like this to allow limiting the number of args added to the tag:

## Maximum number of cmdline arguments to add as a tag value.
# cmdline_args = 0

I would prefer the regexp suggestion above, because I bet there are plenty of cases where taking a fixed number of arguments from the beginning would not fit the desired use case.
I would suggest adding a cmdline_tag = true flag (which could default to false), and letting the user decide whether to keep the whole string or use the regexp processor to reduce it somehow.

@danielnelson danielnelson added this to the 1.10.0 milestone Nov 19, 2018
@danielnelson danielnelson self-assigned this Nov 19, 2018
@jomi5040

jomi5040 commented Dec 4, 2018

Correct me if I am wrong but output of the cmdline/path to the executable (in my case especially Windows) has not yet been added? I am in dire need to differentiate my processes somehow since I have about 10 instances of the same .exe which can only be differentiated by their path. If there is another input plugin that is capable of this I would also be willing to switch (but would rather stick with procstat).

@danielnelson danielnelson modified the milestones: 1.10.0, 1.11.0 Feb 4, 2019
@scottprichard
Contributor

Would love to see this feature. Thanks to anyone who is continuing to look into it.

@goller goller removed the bug unexpected problem or unintended behavior label Apr 1, 2019