Cherry-pick #22827 to 7.10: [Auditbeat] system/socket: Monitor all online CPUs #22873
Cherry-pick of PR #22827 to 7.10 branch. Original message:
What does this PR do?
This patch updates the tracing library in Auditbeat to fetch the list of online CPUs from /sys/devices/system/cpu/online so that it can install kprobes on all of them regardless of its own affinity mask, correctly skipping offline CPUs.
Why is it important?
Auditbeat's system/socket dataset needs to install kprobes on all online CPUs.
Previously, it was using Go's runtime.NumCPU() to determine the CPUs in the system, and monitoring CPUs 0 to NumCPU-1. This was a mistake that led to startup failures or loss of events in any of the following scenarios:
Checklist
[ ] I have made corresponding changes to the documentation
[ ] I have made corresponding change to the default configuration files
CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test this PR locally
The easiest way to reproduce is to start Auditbeat with a CPU affinity mask that excludes the first CPU and only allows it to run on the second CPU:
This will pin Auditbeat to CPU1 while kprobes will be installed on CPU0, preventing the guesses from working.
Alternatively, one can disable a few CPUs before launching Auditbeat:
Related issues
Related #18755
This PR fixes most of the problems reported in the above issue, but the main issue is fixed by #22787