Falco "syscall event drop" #2657

Closed
shalevpenker97 opened this issue Jun 26, 2023 · 8 comments

@shalevpenker97

Describe the bug

When deploying Falco on Kubernetes we see syscall event drops, but it takes time for a Falco pod to start dropping events; once it starts dropping, it does not stop until the pod is restarted. The workload pods on the node show no change in syscall behavior that would explain this.

How to reproduce it

Deploy Falco at scale with this configuration:

```yaml
syscall_event_drops:
  # -- The messages are emitted when the percentage of dropped system calls
  # with respect to the number of events in the last second
  # is greater than the given threshold (a double in the range [0, 1]).
  threshold: .1
  # -- Actions to be taken when system calls were dropped from the circular buffer.
  actions:
    - log
    - alert
  # -- Rate at which log/alert messages are emitted.
  rate: .03333
  # -- Max burst of messages emitted.
  max_burst: 1
  # -- Flag to simulate drops for debug purposes.
  simulate_drops: false

# -- Buffer size preset.
syscall_buf_size_preset: 10

# -- Custom syscalls.
base_syscalls:
  custom_set: [clone, clone3, fork, vfork, execve, execveat, close]
  repair: false

# -- Number of CPUs for each syscall buffer.
modern_bpf:
  cpus_for_each_syscall_buffer: 2
```
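For context on what these numbers mean (the per-preset buffer sizes are taken from the comments in the stock falco.yaml and are worth double-checking against your version): `threshold: .1` fires only when more than 10% of the last second's events were dropped; `rate: .03333` with `max_burst: 1` rate-limits the log/alert actions via a token bucket to roughly one message every 30 seconds (1 / 0.03333 ≈ 30 s); and if preset 10 corresponds to 512 MB per ring buffer, a 40-core node with `cpus_for_each_syscall_buffer: 2` allocates 20 buffers, i.e. about 10 GB of kernel buffer space.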

We expected the syscall event drop alerts to trigger sooner (not after ~2 hours), or not to happen at all.
As shown in the image below, the Falco logs reported a high drop rate; after the pods were restarted at 17:00, it took another ~1.5 hours before the drops started again at around 18:35.

[Screenshot 2023-06-26 at 13:11:47: Falco drop rate over time]

Environment

  • Falco version:

0.35.0

  • System info:

```
Mon Jun 26 09:59:35 2023: Falco version: 0.35.0 (x86_64)
Mon Jun 26 09:59:35 2023: Falco initialized with configuration file: /etc/falco/falco.yaml
Mon Jun 26 09:59:35 2023: Loading plugin 'k8saudit' from file /usr/share/falco/plugins/libk8saudit.so
Mon Jun 26 09:59:35 2023: Loading plugin 'json' from file /usr/share/falco/plugins/libjson.so
Mon Jun 26 09:59:35 2023: Loading rules from file /etc/falco/falco_rules.yaml
Mon Jun 26 09:59:35 2023: Loading rules from file /etc/falco/k8s_audit_rules.yaml
{
  "machine": "x86_64",
  "nodename": "falco-v9bh2",
  "release": "5.10.167-200.el7.x86_64",
  "sysname": "Linux",
  "version": "#1 SMP Sun Feb 12 13:08:57 UTC 2023"
}
```

  • Cloud provider or hardware configuration:

On-prem deployment: a 40-core server with 190 GB of memory.

  • OS:

NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

  • Kernel:

```
5.10.149-200.el7.x86_64 #1 SMP Sun Oct 23 08:59:29 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
```

  • Installation method:

Kubernetes

@Andreagit97
Member

That's interesting, thank you for reporting!

Side question:
Looking at your config I saw this:

```yaml
# -- Buffer size preset.
syscall_buf_size_preset: 10

# -- Custom syscalls.
base_syscalls:
  custom_set: [clone, clone3, fork, vfork, execve, execveat, close]
  repair: false
```

Are you using the -k option? It seems quite strange to see this huge number of drops with only seven syscalls enabled and huge buffers like in your case 🤔

@shalevpenker97
Author

Hi,
Yes, I'm using the -k option:

```yaml
- /usr/bin/falco
- --modern-bpf
- --cri
- /run/containerd/containerd.sock
- -K
- /var/run/secrets/kubernetes.io/serviceaccount/token
- -k
- https://$(KUBERNETES_SERVICE_HOST)
- --k8s-node
- $(FALCO_K8S_NODE_NAME)
- -pk
```

@Andreagit97
Member

Oh ok, that's outside the initial scope of the issue, but if you want to drastically reduce drops I suggest you disable it. We are working on fixing the k8s client; the current one doesn't work so well, sorry.
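For reference, disabling the K8s metadata client here means dropping the `-k`, `-K`, and `-pk` flags from the container args quoted above. A sketch of the trimmed arg list, derived from that snippet rather than an official recommendation (`--k8s-node` is kept, though it presumably only matters while the metadata client is enabled):

```yaml
- /usr/bin/falco
- --modern-bpf
- --cri
- /run/containerd/containerd.sock
- --k8s-node
- $(FALCO_K8S_NODE_NAME)
```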

@shalevpenker97
Author

I have disabled it and the drops did not decrease.

@Andreagit97 Andreagit97 added this to the 0.36.0 milestone Aug 31, 2023
@Andreagit97
Member

Hey @shalevpenker97, do you mind trying to collect some metrics with the `metrics:` config?

That way we could try to understand which syscalls the drops come from and why. Thank you!
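A minimal sketch of what that `metrics:` section could look like; the key names follow the stock falco.yaml of that release line and should be verified against the installed version:

```yaml
metrics:
  enabled: true
  interval: 15m        # how often to emit a stats snapshot
  output_rule: true    # emit the snapshot through a Falco rule output
  resource_utilization_enabled: true
  state_counters_enabled: true
  kernel_event_counters_enabled: true   # event/drop counters from the kernel side
  libbpf_stats_enabled: true
  convert_memory_to_mb: true
  include_empty_values: false
```

The kernel event counters are the relevant part here: they report `n_evts`/`n_drops` style counters that would show where the drops come from.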

@Andreagit97 Andreagit97 modified the milestones: 0.36.0, 0.37.0 Sep 2, 2023
@leogr
Member

leogr commented Sep 11, 2023

cross-linking #1403

@Andreagit97
Member

Any update on #2657 (comment)?

@Andreagit97
Member

I will close this since, without further information, it is a duplicate of #1403. Please feel free to reopen if you have further details.

@Andreagit97 Andreagit97 self-assigned this Jan 3, 2024