
getting Null as container name #925

Closed
nebi-frame opened this issue Nov 14, 2019 · 32 comments · Fixed by #1099

@nebi-frame commented Nov 14, 2019

What happened:
Sometimes we get null as container.name, and as the parent process name, from Falco.
What you expected to happen:
A real value
How to reproduce it (as minimally and precisely as possible):
Listen for any event and look at what you get for container.name
Anything else we need to know?:
We raised this issue on the Slack channel and some members suggested making sure Falco starts after Docker, so we added sleep statements. Unfortunately, the problem still occurs.
Environment:

  • Falco version (use falco --version): 0.17.1 - 0.18.0
  • System info
  • Cloud provider or hardware configuration: AWS ECS AMI - ami-084f07d75acedcefa
  • OS (e.g: cat /etc/os-release): Amazon Linux 2, ECS
  • Install tools (e.g. in kubernetes, rpm, deb, from source):
curl -o /tmp/install-falco.sh -s https://s3.amazonaws.com/download.draios.com/stable/install-falco
sudo bash /tmp/install-falco.sh

sudo mv $TEMPLATE_DIR/falco/falco.yaml /etc/falco/falco.yaml
sudo mv $TEMPLATE_DIR/falco/falco_rules.yaml /etc/falco/falco_rules.yaml

# Setup logrotate to run every 5mins
sudo mv $TEMPLATE_DIR/falco/logrotate /etc/logrotate.d/falco.conf
sudo chown root /etc/logrotate.d/falco.conf
echo "*/5 * * * * /usr/sbin/logrotate -f /etc/logrotate.d/falco.conf" | sudo crontab -
  • Others:
@fntlnz (Contributor) commented Nov 27, 2019

/milestone 0.19.0

poiana added this to the 0.19.0 milestone, Nov 27, 2019
@fntlnz (Contributor) commented Dec 3, 2019

/assign @fntlnz

@fntlnz (Contributor) commented Dec 3, 2019

/assign @leodido

@fntlnz (Contributor) commented Dec 4, 2019

From repo planning: @mfdii suggests that this has something to do with how we cache metadata.

@fntlnz (Contributor) commented Dec 5, 2019

I confirm that I'm able to reproduce this

docker run --rm  -it debian:jessie apt update
Dec 05 13:56:36 ip-172-31-20-33.ec2.internal falco[30945]: 13:56:36.280134044: Error Package management process launched in container (user=root command=apt update container_id=eb04a288be40 container_name=<NA> image=<NA>:<NA>)

@fntlnz (Contributor) commented Dec 5, 2019

After a very long bisect session I've been able to identify that this was introduced after this PR in sysdig: draios/sysdig#1326

@fntlnz (Contributor) commented Dec 5, 2019

The problem is that container metadata is now fetched asynchronously, so when Falco detects the event there is a chance that the container metadata has not been fetched yet. That is why we sometimes see the metadata and sometimes not.
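The race is easy to model outside of Falco. Below is a minimal sketch (toy code, not the actual sinsp implementation): the metadata fetch runs on a background thread while the event path does a non-blocking cache read, so an event that arrives before the fetch completes sees the <NA> placeholder.

```python
import threading
import time

# Toy model of an asynchronous container metadata cache (illustrative
# only, not the real sinsp code). The fetch runs on a background thread;
# a lookup that happens before it completes returns the "<NA>" placeholder.
class AsyncContainerCache:
    def __init__(self):
        self._names = {}
        self._lock = threading.Lock()

    def start_fetch(self, container_id, real_name, delay=0.05):
        def fetch():
            time.sleep(delay)  # simulate a slow container-runtime API call
            with self._lock:
                self._names[container_id] = real_name
        t = threading.Thread(target=fetch)
        t.start()
        return t

    def name(self, container_id):
        # Non-blocking read, like formatting %container.name for an event.
        with self._lock:
            return self._names.get(container_id, "<NA>")

cache = AsyncContainerCache()
fetcher = cache.start_fetch("eb04a288be40", "debian-jessie")

early = cache.name("eb04a288be40")  # event fires before the fetch finishes
fetcher.join()
late = cache.name("eb04a288be40")   # event fires after the fetch finishes

print("early event: container_name=%s" % early)
print("late event:  container_name=%s" % late)
```

The first print shows the placeholder, the second the real name — exactly the intermittent behavior reported above.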

@leodido (Member) commented Dec 20, 2019

We are still working on this.

I don't feel like we have room for improvement in this part until we introduce a processing queue (which is planned for when we do the gRPC input API - #908).

We could solve this by adding some kind of synchronization mechanism, like a mutex, but on most systems that would increase the drop rate a lot. So the plan is to postpone this until the gRPC input interface is done and we have the syscall input with metadata fetching in place.

/milestone 1.0.0

@nebi-frame (Author)

Hey guys,
Do we have any timeline for the 1.0.0 release?
Thanks.

@krisnova (Contributor) commented Jan 6, 2020

Hey @nebi-frame,

We haven't planned a date, but we can bring it up on the Wed call. Do you think you will be able to attend so we can talk more about this?

@nebi-frame (Author) commented Jan 7, 2020 via email

@holdenk commented Jan 23, 2020

Is this a thing where folks are looking for help?

@markjacksonfishing (Contributor)

Yes!!! and welcome @holdenk

@krisnova (Contributor)

Yes - we have an end user who is asking for a fix here - I am sure they would appreciate anyone looking at this. We can pay with stickers, t-shirts, hugs, and coffee.

@holdenk commented Jan 23, 2020

Awesome, if you've got some time I'll bug you with some detail questions as I start to get lost in a new code base :)

@holdenk commented Jan 27, 2020

From a quick chat: in falco.cpp, look at where we call next and add mutexes around there. And then look for where the container metadata is coming from, idk? I think I missed something here @kris-nova.

@krisnova (Contributor)

So check out the main daemon loop in falco.cpp and look at how we call next(&ev) with the event pointer.

The mutex will need to lock right after this method call, and also wait for the "container metadata" context that comes from Lua here

We need to make it such that Falco can't go into deadlock if for some reason Lua returns successful != true

We also need this mutex to be an opt-in feature that is exposed via a feature flag. Probably something like --wait-for-container-context. All of the feature flags are defined in falco.cpp here.
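One way to sketch that idea (every name below is hypothetical, including the flag, since none of this exists in Falco yet): block after pulling the event until the container context is published, but bound the wait with a timeout so a metadata fetch that never completes cannot deadlock the main loop.

```python
import threading

# Hypothetical sketch of the opt-in wait described above; the class and
# flag names are illustrative, not Falco's actual API. After pulling the
# next event we optionally wait for the container context, but never
# longer than a timeout, so a fetch that fails (e.g. Lua returning
# successful != true) cannot deadlock the main loop.
class ContainerContext:
    def __init__(self):
        self._ready = threading.Event()
        self.name = "<NA>"

    def publish(self, name):
        self.name = name
        self._ready.set()

    def wait_ready(self, timeout):
        # True if metadata arrived within the timeout, False otherwise.
        return self._ready.wait(timeout)

def format_event(ctx, wait_for_container_context=False, timeout=0.1):
    if wait_for_container_context:
        # On timeout we fall through with the placeholder instead of
        # blocking forever -- this is the anti-deadlock guarantee.
        ctx.wait_ready(timeout)
    return "container_name=%s" % ctx.name

# Metadata never arrives: the loop still makes progress.
stuck = ContainerContext()
print(format_event(stuck, wait_for_container_context=True))

# Metadata is published before the event: the wait returns immediately.
ok = ContainerContext()
ok.publish("debian-jessie")
print(format_event(ok, wait_for_container_context=True))
```

The timeout is the design choice that keeps this opt-in safe: the worst case degrades to today's behavior (a placeholder name) rather than a hung main loop.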

@fntlnz (Contributor) commented Jan 27, 2020

@kris-nova @holdenk maybe we don't need a mutex here. The asynchronous nature of the container metadata fetch is what is causing the problem, so one thing we can do is add a flag that tells the container metadata fetch not to be asynchronous.
It doesn't solve the underlying problem for those who want to keep the fetch asynchronous, but it's something we can do now for those who need container metadata 100% of the time while we wait for the Input API.

Asynchronous metadata fetches can be deactivated by calling set_cri_async(false); on the sinsp class.

In our case, it can be done in falco.cpp
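A toy model of what that switch changes (only the method name set_cri_async comes from the real sinsp class; everything else below is an illustrative stand-in): with async fetching disabled, the lookup blocks until the container runtime answers, so the event always carries the real name, at the cost of stalling event processing during the fetch.

```python
import time

# Illustrative stand-in for the behaviour toggled by set_cri_async()
# (only that method name comes from the real sinsp class; the rest is
# a toy model). With async fetching disabled, the lookup blocks until
# the container runtime answers, so the event always carries the real
# name, at the cost of stalling event processing during the fetch.
class ToyInspector:
    def __init__(self):
        self._cri_async = True

    def set_cri_async(self, enabled):
        self._cri_async = enabled

    def container_name(self, fetch_delay=0.01):
        if self._cri_async:
            return "<NA>"        # fetch still in flight: placeholder
        time.sleep(fetch_delay)  # block until the runtime responds
        return "debian-jessie"

inspector = ToyInspector()
print(inspector.container_name())  # async: fetch not finished yet

inspector.set_cri_async(False)     # opt into synchronous fetching
print(inspector.container_name())  # sync: fetched before returning
```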

Wdyt?

@srivastavaabhinav

@fntlnz what's the decision here?

@holdenk commented Feb 13, 2020

@fntlnz would making this call be sync slow things down unacceptably?

@leodido (Member) commented Feb 13, 2020

@holdenk IMHO yes.

Also, that would increase the drop rate noticeably.

A temporary solution could be to introduce a flag for Falco that forces the inspector - i.e., sinsp - (technically, its container manager, class sinsp_container_manager) to fetch the container metadata synchronously.

To do it, there is this method: inspector->set_cri_async(), source (here).
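Hypothetical wiring for such a flag (the flag name --sync-container-metadata and the FakeInspector class below are made up for illustration; only set_cri_async corresponds to a real sinsp method):

```python
import argparse

# Hypothetical command-line wiring for the flag described above. The
# flag name --sync-container-metadata and FakeInspector are made up
# for illustration; only set_cri_async mirrors a real sinsp method.
def parse_args(argv):
    parser = argparse.ArgumentParser(prog="falco-sketch")
    parser.add_argument("--sync-container-metadata", action="store_true",
                        help="fetch container metadata synchronously")
    return parser.parse_args(argv)

class FakeInspector:
    def __init__(self):
        self.cri_async = True  # the default: asynchronous fetches

    def set_cri_async(self, enabled):
        self.cri_async = enabled

args = parse_args(["--sync-container-metadata"])
inspector = FakeInspector()
if args.sync_container_metadata:
    inspector.set_cri_async(False)  # opt-in: force synchronous fetches

print("cri_async =", inspector.cri_async)
```

Making it opt-in keeps the default behavior (and drop rate) unchanged for everyone who doesn't set the flag.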

Thanks for looking into this 🤗

@fntlnz (Contributor) commented Feb 19, 2020

Discussing this in today's call to make a decision

@leodido (Member) commented Feb 19, 2020

/assign @fntlnz

@nebi-frame (Author)

Can I join the call as well?

@leodido (Member) commented Feb 19, 2020

Sure, https://sysdig.zoom.us/my/falco

@fntlnz (Contributor) commented Feb 19, 2020

We made a decision: we are going to fix this by making the synchronous mode opt-in for those who want it.

This way we can fix the problem for everyone right now, and we can focus on the inputs API to solve the root cause, as Leo mentioned here.

@nebi-frame (Author)

Thanks everyone, I'm excited for the synchronous mode 👍

@leodido (Member) commented Feb 20, 2020

/milestone 0.21.0

poiana modified the milestones: 1.0.0 → 0.21.0, Feb 20, 2020
@nebi-frame (Author)

Hi @leodido @fntlnz, just wanted to check in. Will the next release include a fix for this bug? And when is the release due?

@ahmed1smael commented Mar 10, 2020

@fntlnz could you update this issue please? Thanks :)

@fntlnz (Contributor) commented Mar 18, 2020

Anyone who wants to give the fix a try can help us by testing the artifacts here: #1099 (comment)

Remember to configure Falco to use the new flag or nothing will change!

@nebi-frame (Author)

Amazing! Thank you so much, we really appreciate your efforts!
