Auditbeat: kauditd deadlocks when netlink channel is congested during startup #26031
Comments
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
This fixes a deadlock when the netlink channel is congested (initialization fails with "no buffer space available" / errno=ENOBUFS). Closes elastic#26031
@adriansr Thanks for the fix.
@adriansr Thanks for this issue as well as the fix. Apart from fixing the deadlock,
since this could also impact or break the file integrity module, it seems to make sense to have Auditbeat panic and start over again. Any thoughts?
So, the problem here is that the netlink socket buffer used to do the userspace/kernel communication for the audit subsystem is full and nothing is draining it from userspace, so kernel tasks lock up when the kernel waits for space to publish a message over the socket. The file integrity module uses |
@andrewstucki I understand The fix seems to quit |
@newly12 we can discuss more in a https://discuss.elastic.co/ if you're interested, but the fix actually maintains the previous behavior of the WRT the conflict with the I understand that one module quitting a beat affects other modules that the same beat is configured to run, but I don't see how this is particular to the file integrity modules. That said, if you have a clear use-case for re-working the startup and error handling of |
Thanks. I didn't realize the difference between unicast and multicast modes.
Is this a case that would exit the Auditbeat process even with another module, e.g. file integrity, enabled? The reason I am asking is that I thought auditd as a module was not a critical component or path for Auditbeat.
…UFS (elastic#26173) This fixes a deadlock when the netlink channel is congested (initialization fails with "no buffer space available" / errno=ENOBUFS). Closes elastic#26031 (cherry picked from commit 551baaa)
@newly12 We've had to monitor Auditbeat using
It appears to have 2 connections in a normal state, when the auditd module is operational. We restart the systemd service if this is <2. The cause is mostly a very similar buffer error at boot. We would love it if Auditbeat could monitor itself, or just exit when any of the modules die or hang (auditd or file_integrity).
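The health check described in this comment could be sketched roughly as follows. The tool actually used is not shown in the comment, so this is only an assumption: it counts rows in `/proc/net/netlink`-style text whose protocol column is 9 (`NETLINK_AUDIT`). That count is system-wide rather than per-process, and the names `countAuditSockets` and `needsRestart` and the sample table are made up for illustration. Only the threshold of 2 comes from the comment.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// netlinkAudit is the NETLINK_AUDIT protocol number, as it appears in the
// "Eth" column of /proc/net/netlink.
const netlinkAudit = "9"

// minHealthy is the connection count reported as normal in the comment above.
const minHealthy = 2

// countAuditSockets counts the rows of a /proc/net/netlink-style table whose
// protocol column equals NETLINK_AUDIT. Hypothetical helper: it does not
// filter by process, so other audit consumers are counted too.
func countAuditSockets(table string) int {
	n := 0
	sc := bufio.NewScanner(strings.NewReader(table))
	for sc.Scan() {
		f := strings.Fields(sc.Text())
		if len(f) < 2 || f[0] == "sk" { // skip blank lines and the header row
			continue
		}
		if f[1] == netlinkAudit {
			n++
		}
	}
	return n
}

// needsRestart reports whether fewer than minHealthy audit sockets are open.
func needsRestart(table string) bool {
	return countAuditSockets(table) < minHealthy
}

func main() {
	// Made-up sample data in the column layout of /proc/net/netlink.
	sample := `sk               Eth Pid        Groups   Rmem Wmem Dump Locks Drops Inode
0000000000000000 0   0          00000000 0    0    0    2     0     11
0000000000000000 9   1234       00000000 0    0    0    2     0     4021
0000000000000000 9   3456789012 00000001 0    0    0    2     0     4022
0000000000000000 15  0          00000000 0    0    0    2     0     12`
	fmt.Println(countAuditSockets(sample), needsRestart(sample)) // prints "2 false"
}
```

A real watchdog would read the file periodically and restart the service when `needsRestart` returns true. Restricting the count to the Auditbeat process would need extra work, since the port ID column does not always equal the process ID.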
If the netlink channel used to talk to kauditd is congested, Auditbeat's auditd module initialization can fail when setting the Audit PID:
cleaned up error:
This error triggers closing of the netlink channel, which deadlocks the kernel when the closing routine first tries to set the Audit PID to zero.
This causes the Auditbeat process to block indefinitely:
(kernel stack)
(usermode stack)
and any other process attempting to send a message to kauditd also blocks:
The only solution is to kill -9 the auditbeat process.