Use sudo for privileged process checks on filedescriptors (take 2) #1235
Hi @pdecat! Thanks for your contribution. Apologies about the late response, I didn't see your previous comment on the first PR. But during testing, I noticed that when running the process check with `try_sudo`, the `open_file_descriptor` metric isn't being sent. The collector.log has an error message:
2018-02-09 20:14:38 UTC | ERROR | dd.collector | checks.process(process.py:218) | running psutil method num_fds with sudo failed with return code 1
Traceback (most recent call last):
  File "/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/process/process.py", line 216, in psutil_wrapper
    result = int(subprocess.check_output(['sudo', sys.executable, __file__, method, str(process.pid)]))
  File "/opt/datadog-agent/embedded/lib/python2.7/subprocess.py", line 219, in check_output
    raise CalledProcessError(retcode, cmd, output=output)
CalledProcessError: Command '['sudo', '/opt/datadog-agent/embedded/bin/python', '/opt/datadog-agent/embedded/lib/python2.7/site-packages/datadog_checks/process/process.pyc', 'num_fds', '558']' returned non-zero exit status 1
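For context, the call that fails in the traceback can be sketched roughly as below. This is only an illustration of how the argv list is assembled and how its output is read back, not the check's actual code; `build_sudo_command` and `parse_result` are hypothetical helper names. Under Python 2, `__file__` can resolve to the compiled `.pyc`, which is the path that shows up in the error, and a sudoers rule that names a different script path would presumably not match this command line.

```python
def build_sudo_command(python, script, method, pid):
    """Assemble the argv list handed to subprocess.check_output.

    `python` plays the role of sys.executable and `script` the role of
    __file__ in the traceback; both are passed in explicitly here so the
    sketch does not depend on where it is run from.
    """
    return ['sudo', python, script, method, str(pid)]


def parse_result(output):
    """The wrapper expects the child process to print one integer."""
    return int(output)


cmd = build_sudo_command(
    '/opt/datadog-agent/embedded/bin/python',
    '/opt/datadog-agent/embedded/lib/python2.7/site-packages/'
    'datadog_checks/process/process.pyc',
    'num_fds',
    558,
)
```

Comparing `cmd` with the sudoers rule word by word is a quick way to see whether the rule could match.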
Hi @ChristineTChen, according to the path in the error message, it looks like the sudoers rules need to be adapted as follows:
Just 2 small things to change.
I also tried testing this locally with the updated sudoers rules and wasn't able to successfully collect the `open_file_descriptors` metric. Could you give a quick rundown of how you've tested this?
# 3p
import psutil

# Main entry point is meant for checks needing privilege escalation with sudo
The main entry point needs to be moved after the import statements (see Travis CI check)
Indeed, fixed.
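As a rough illustration of the requested layout (imports first, definitions next, entry point last), here is a minimal sketch. It is not the code from this PR: `num_fds` below is a Linux-only stand-in that reads `/proc` instead of calling psutil, and `main` is a hypothetical name.

```python
# stdlib
import os
import sys


def num_fds(pid):
    # Stand-in for psutil.Process(pid).num_fds(); Linux-only, reads /proc.
    return len(os.listdir('/proc/{}/fd'.format(pid)))


def main(argv):
    # argv is expected to be [method, pid], mirroring the sudo command line.
    method, pid = argv
    if method != 'num_fds':
        raise ValueError('unsupported method: {}'.format(method))
    return num_fds(int(pid))


# Main entry point is meant for checks needing privilege escalation with sudo.
# It sits at the bottom, after every import and definition.
if __name__ == '__main__' and len(sys.argv) == 3:
    print(main(sys.argv[1:]))
```

Keeping the guard last means no import statement follows executable module-level code, which is the kind of ordering a lint step typically checks.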
process/README.md
Outdated
As of now, only the `open_fd` metric on Unix platforms is taking advantage of this setting.
Note: the appropriate sudoers rules have to be configured for this to work, e.g. if packaged with the datadog agent:
```
dd-agent ALL=NOPASSWD: /opt/datadog-agent/embedded/bin/python /opt/datadog-agent/agent/checks.d/process.py num_fds *
```
These two sudoers rules work for agents before v5.22 and v6.0. We should also include the sudoers rules for the wheel-packaged checks.
Documentation updated to reflect that.
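To make the difference between the two packagings concrete, here is a small sketch that renders a sudoers line in the same shape as the README example. `sudoers_rule`, `LEGACY_SCRIPT` and `WHEEL_SCRIPT` are hypothetical names; the wheel path is taken from the traceback earlier in this thread, and whether the rule should name the `.py` or the `.pyc` file is an assumption that would need checking against the actual command sudo receives.

```python
# Script location used in the README example (agents before wheel packaging).
LEGACY_SCRIPT = '/opt/datadog-agent/agent/checks.d/process.py'

# Location seen in the traceback for wheel-packaged checks (assumption: .py).
WHEEL_SCRIPT = ('/opt/datadog-agent/embedded/lib/python2.7/site-packages/'
                'datadog_checks/process/process.py')


def sudoers_rule(script, method='num_fds', user='dd-agent',
                 python='/opt/datadog-agent/embedded/bin/python'):
    """Render one sudoers line allowing `user` to run `script method <pid>`."""
    return '{user} ALL=NOPASSWD: {python} {script} {method} *'.format(
        user=user, python=python, script=script, method=method)


rules = [sudoers_rule(LEGACY_SCRIPT), sudoers_rule(WHEEL_SCRIPT)]
```

The first entry of `rules` reproduces the README line; the second only differs in the script path.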
These changes are currently applied to version 1:5.17.2-1 of the Datadog Agent installed on Debian 9 from the http://apt.datadoghq.com/ repository. The sudoers rules are put into a
Here's the Ansible task I use to apply these changes after the agent installation:
The
The
@ChristineTChen do you see any related errors in
I set up a new vagrant VM, using Agent v6.0.3, and overwrote the
I will test this with Agent 5 and update you.
@ChristineTChen I forgot to say that I also add some configuration to
For example:
Note: this example works with the patch from #1235 (comment) that did not include the
Hi @pdecat! Apologies for the delayed response, we appreciate your patience on this issue. I have consulted with my teammates about why this patch hasn't been working for the latest Agent. We have determined that this patch is not compatible with the Agent 6 because the
However, we've come up with two possible solutions to avoid running
Please let me know what your thoughts are on these alternatives.
Christine
Hi Christine, I'm willing to implement all necessary changes but as indicated in my original PR #715 (comment), allowing sudoers with wildcards is a security risk if not restrained properly. Would it be possible to package a script that does the same thing alongside the omnibus-packaged Agent 6?
While I did not check the actual behavior in the context of Agent 6, I did some initial tests of the github.com/sbinet/go-python lib used by Agent 6 and it seemed ok: #715 (comment) I probably need to test that part further as I may have misunderstood the behavior of go-python.
Regards,
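On the wildcard concern: one way a packaged helper script could restrain what a trailing `*` lets through is to validate its own arguments strictly before doing anything. The sketch below is only an illustration under that assumption; `parse_privileged_args`, `ALLOWED_METHODS` and `PID_PATTERN` are hypothetical names and nothing here is taken from the Agent code.

```python
import re

# Hypothetical whitelist: only methods the check is known to need.
ALLOWED_METHODS = frozenset(['num_fds'])

# A PID is a positive decimal integer; \Z rejects a trailing newline too.
PID_PATTERN = re.compile(r'[1-9][0-9]{0,9}\Z')


def parse_privileged_args(args):
    """Return (method, pid) or raise ValueError for anything unexpected."""
    if len(args) != 2:
        raise ValueError('expected exactly two arguments: METHOD PID')
    method, pid = args
    if method not in ALLOWED_METHODS:
        raise ValueError('method not allowed: %r' % (method,))
    if not PID_PATTERN.match(pid):
        raise ValueError('pid must be a positive decimal integer: %r' % (pid,))
    return method, int(pid)
```

With this in place, extra words or shell-like input after the method name are rejected by the script itself, even if the sudoers wildcard would accept them.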
Thanks very much for your patience with this contribution!
Note: re-submission of #715 that was merged then reverted.
What does this PR do?
Repackaged the patch I suggested at DataDog/dd-agent#2033 (comment) in a dd-agent-omnibus compatible way (no additional Python file, only `process/check.py` is modified).
Motivation
Datadog does not provide out-of-the-box support for monitoring file descriptors used by processes not running under the same user (`dd-agent` by default).
The suggested workaround of running the Datadog Agent as root is explicitly not recommended, which is probably reasonable: https://help.datadoghq.com/hc/en-us/articles/115000066506-Why-don-t-I-see-the-system-processes-open-file-descriptors-metric-
Additional Notes
This requires the addition of the following sudo policies in the deb and rpm packages:
Usage of sudo generates frequent logging in `/var/log/auth.log`. That may be silenced with additional PAM rules in `/etc/pam.d/sudo` if desired.
Resolves DataDog/dd-agent#2033
The same may be done for the `io_counters` method.
edit: link
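If `io_counters` were handled the same way, one thing to keep in mind is that psutil returns a namedtuple for it rather than a single number, so a plain `int(...)` on the child's output would no longer be enough. A possible sketch, assuming the child prints JSON instead: `IOCounters` is a stand-in for psutil's return type, and `serialize_result` / `deserialize_result` are hypothetical helpers.

```python
import json
from collections import namedtuple

# Stand-in for the namedtuple returned by psutil's Process.io_counters();
# only the four commonly available fields are modelled here.
IOCounters = namedtuple(
    'IOCounters', ['read_count', 'write_count', 'read_bytes', 'write_bytes'])


def serialize_result(result):
    """Child side: encode an int or a namedtuple as one line of JSON."""
    if hasattr(result, '_asdict'):
        result = dict(result._asdict())
    return json.dumps(result, sort_keys=True)


def deserialize_result(output):
    """Parent side: decode what the sudo'ed child printed."""
    return json.loads(output)


fds = deserialize_result(serialize_result(42))
io = deserialize_result(serialize_result(IOCounters(1, 2, 3, 4)))
```

Plain integers such as the `num_fds` result round-trip unchanged, so both methods could share the same output format.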