Improve falco benchmarking, performance, and regression tooling to better track system resources impact #2296
Comments
Adding a milestone to not lose track of the conversation. Thanks for opening this! /milestone 0.34.0 |
@happy-dude please see some initial progress on adding native support for resource utilization metrics #2333. Would you have additional thoughts on the metrics collected / planned / still missing that would ultimately set the stage for perf benchmarking and regression tests. Thanks a bunch in advance! |
/milestone 0.35.0 |
@happy-dude published a public HackMD proposing a Test Matrix https://hackmd.io/-nwsFyySTEKsjmjGHCyPRg?view using the newly introduced Additional note: Creating realistic enough synthetic workloads is notoriously challenging. Benchmarking on actual real-life servers with a lot of activity tends to give more meaningful numbers. |
Hey @incertum, thanks for the test matrix! I've reviewed some of the items in the test matrix and will be running the following:
For the following tests:
Is there an expected result or output format you would like the evaluation delivered in? edit: added close syscalls to |
I had to revert my changes to the rules file because it started logging a lot and ballooned the size of the events logfile relatively quickly 😅 EDIT: adjusted my alert rule into something that should never alert:
- macro: spawned_process
  condition: (evt.type in (execve, execveat) and evt.dir=< and proc.name=iShouldNeverAlert) |
Updated HackMD suggesting to still add a simple filter to the test rule, also forgot to add |
Thoughts:
Lastly we are working on exposing syscall counters as part of Falco's new native resource utilization metrics support (planned for Falco 0.35) -> once we have these counters, we can derive even better conclusions |
No need to report any numbers back. Hoping these tests help you understand the unique workload footprint on your servers better. In general, longer term we need to try to perhaps push some more filters kernel side for the super high volume system calls ... |
Note Lines 544 to 546 in dad382e
|
/milestone 0.36.0 |
Issues go stale after 90d of inactivity. Mark the issue as fresh with Stale issues rot after an additional 30d of inactivity and eventually close. If this issue is safe to close now please do so with Provide feedback via https://github.com/falcosecurity/community. /lifecycle stale |
/remove-lifecycle stale |
Closing this in favor of #2435. |
Motivation
Hey team, while evaluating and understanding the relationship between Falco, system resources, and detection rules, I was wondering if there was a way to better monitor and correlate the impact of Falco config and rule changes. With this information, I can better optimize and tune Falco for our unique environment.
This generally falls along the lines of a Falco benchmarking or instrumentation toolchain. For comparison, osquery ships a tool that reports some info on its queries and configuration.
Additionally, it was discussed in the Slack community that something during CI/CD would be useful as well for regression testing.
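As a rough illustration of what a CI/CD regression check could look like (a sketch only, not an existing Falco tool): given resource samples such as CPU percentages collected from a baseline run and from a run with the changed config or rules, flag the change when the mean rises beyond a tolerance. The function name `regressed`, the `tolerance` parameter, and the sample values are all hypothetical.

```python
from statistics import mean


def regressed(baseline, candidate, tolerance=0.10):
    """Return True if the candidate mean exceeds the baseline mean
    by more than `tolerance` (a fraction, e.g. 0.10 for 10%).

    `baseline` and `candidate` are lists of numeric samples
    (for example CPU % readings); how they are collected is
    outside the scope of this sketch.
    """
    if not baseline or not candidate:
        raise ValueError("both sample lists must be non-empty")
    base = mean(baseline)
    cand = mean(candidate)
    return cand > base * (1.0 + tolerance)


if __name__ == "__main__":
    # Hypothetical samples, for illustration only.
    before = [10.0, 12.0]
    after = [15.0, 16.0]
    print("regression" if regressed(before, after) else "ok")
```

A mean comparison is the simplest possible choice; on noisy servers a median or percentile comparison over longer runs would likely be more robust.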
Feature
Additional context
See #2222, libs#531, Slack thread for more info