Default ruleset does not ignore some Kubernetes containers #156

Closed
lazzarello opened this issue Dec 5, 2016 · 7 comments

Comments

@lazzarello

sysdig and falco were installed through the project's Debian repository on existing systems running Debian Jessie and Kubernetes 1.4.

The falco process consumes 2.8 GB of resident memory after running on a cluster node for more than 3 hours. Syslog contains many lines that look like the following:

Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211744777: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/libip4tc.so.0)
Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211772926: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/libip6tc.so.0)
Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211795245: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/libxtables.so.10)
Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211820300: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/x86_64-linux-gnu/libm.so.6)
Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211844532: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/x86_64-linux-gnu/libc.so.6)
Dec  5 20:34:33 ip-10-53-83-111 falco: 20:34:33.211877232: Warning File opened for read/write by non-privileged container (user=root command=iptables -w -N KUBE-SERVICES -t filter k8s_kube-proxy.19117fe_kube-proxy-ip-10-53-83-111.us-west-2.compute.internal_kube-system_fe7124acf1c281948c5fa97e777ed495_3048a81d (id=a4190e1cec0a) file=/lib/x86_64-linux-gnu/libdl.so.2)

There are hundreds of thousands of warnings like this in /var/log/syslog. I'm unsure why these warnings consume resident memory within the falco process. When I add the following change to /etc/falco_rules.yaml, all of the warnings disappear and the resident memory growth stops.

- macro: trusted_containers
  condition: (container.image startswith sysdig/agent or container.image startswith sysdig/falco or container.image startswith sysdig/sysdig or container.image startswith gcr.io/google_containers/hyperkube or container.image startswith gcr.io/google_containers/kube-proxy)

- rule: File Open by Privileged Container
  desc: Any open by a privileged container. Exceptions are made for known trusted images.
  condition: (open_read or open_write) and container and container.privileged=true and not trusted_containers
  output: File opened for read/write by non-privileged container (user=%user.name command=%proc.cmdline %container.info file=%fd.name)
  priority: WARNING

I'm not making a PR for this since it's a change to a configuration default, and I believe that decision lies with the project's authors. It is worth noting this detail in the documentation, though, since the default configuration makes falco non-functional on a Kubernetes cluster of arbitrary size.

@mstemm
Contributor

mstemm commented Dec 5, 2016

Thanks for the detailed report. The rule change seems sensible and I'll add it to the ruleset. I'll also look at the memory growth; I was seeing similar growth in a different environment with lots of alerts, although no memory leak showed up under valgrind. I'll investigate both adding some rate limiting on alerts and tracking down the source of the memory growth.
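
For reference, a minimal sketch of what token-bucket style alert throttling in falco.yaml could look like; the outputs/rate/max_burst keys shown here are an assumption for illustration, not a setting falco shipped at the time of this comment:

# falco.yaml (illustrative sketch): throttle notifications with a token bucket,
# emitting at most ~1 alert per second on average with bursts of up to 1000.
outputs:
  rate: 1
  max_burst: 1000

With throttling along these lines, a flood of false positives such as the kube-proxy warnings above would be dropped after the burst allowance instead of being formatted and written to syslog hundreds of thousands of times.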

@lazzarello
Author

lazzarello commented Dec 5, 2016

It's not a memory leak. It's a bizarre allocation pattern: from what I could observe, the falco process was only making a moderate number of syscalls, just shuffling some strings into file handles and looking at /etc/timezone. That shouldn't affect resident memory much, but I'm not good at memory management in C++. For example, here is a screenshot during peak allocation. Notice the I/O bottleneck in the load.

[screenshot: system monitor during peak allocation, 2016-12-02 10:56 PM]

and later, trivial load on the same system.

[screenshot: the same system under trivial load, 2016-12-04 8:19 PM]

@mstemm
Contributor

mstemm commented Dec 5, 2016

I did find one memory leak in sysdig: draios/sysdig#693. The libraries in question are used by falco to format events into the strings that go into notifications. This is very likely related for falco, since almost every rule's output string ends in a "raw" string (that is, literal text rather than a %XXX field).

Did you only observe memory growth when falco was sending a bunch of (probably false positive) events, for example the k8s-related events you pointed out? If so, then the sysdig issue is probably the primary cause.
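
To make the "raw string" point above concrete, here is a hypothetical minimal rule (not part of the shipped ruleset): everything in the output line outside the %-prefixed fields, including the trailing ")", is literal text that goes through the string-formatting path on every alert.

# Illustrative rule only; open_read and container are macros from the default ruleset.
- rule: Example Output Formatting
  desc: Shows %-fields versus raw literal text in an output string
  condition: open_read and container
  output: File opened in a container (user=%user.name file=%fd.name)
  priority: NOTICE

Here %user.name and %fd.name are resolved fields, while "File opened in a container (user=", " file=", and the closing ")" are the raw strings that would exercise the formatting path implicated in draios/sysdig#693 at a high alert rate.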

@lazzarello
Author

lazzarello commented Dec 5, 2016

Correct, I observed memory growth when falco was sending a bunch of events. Once I successfully filtered the kube-proxy/iptables warnings, memory usage was level and quite low (around 5 MB).

@mstemm
Contributor

mstemm commented Dec 5, 2016

Ok great, I'll get that fixed in sysdig then. I'll keep this open until the sysdig issue is fixed and I also make the other changes (rule update, rate limiting for events).

mstemm added a commit that referenced this issue Dec 7, 2016
Add google_containers/kube-proxy as a trusted image (can be run
privileged, can mount sensitive filesystems). While our k8s deployments
run kube-proxy via the hyperkube image, evidently it's sometimes run via
its own image.

This is one of the fixes for #156.

Also update the output message for this rule.
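
For clarity, a sketch of what the rule likely looks like after this commit; the trusted_containers macro change matches the one quoted in the report above, and the corrected output wording is an assumption based on "update the output message for this rule", not a quote of the merged change:

- rule: File Open by Privileged Container
  desc: Any open by a privileged container. Exceptions are made for known trusted images.
  condition: (open_read or open_write) and container and container.privileged=true and not trusted_containers
  output: File opened for read/write by privileged container (user=%user.name command=%proc.cmdline %container.info file=%fd.name)
  priority: WARNING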
@mstemm
Contributor

mstemm commented Dec 8, 2016

The three fixes needed are all merged, so I'll close this issue. We should have a new falco release in the next week or so with these and other changes.

mstemm closed this as completed Dec 8, 2016
@lazzarello
Author

You rule, thanks.
