wip: Handle sinsp::next errors in the inspect loop #746
Signed-off-by: Lorenzo Fontana <lo@linux.com>
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by:
If they are not already assigned, you can assign the PR to them by writing
The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
/hold
While this is a solution to the problem, the sinsp exceptions are so generic (there's only one of them) that we cannot rethrow the ones that should ultimately make falco die and let the main program loop handle them.
This can't be merged in the way it is because The parser in sinsp is located here: https://github.com/draios/sysdig/blob/dev/userspace/libsinsp/parsers.cpp
I agree, we need an exception hierarchy in
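To illustrate the point raised in this thread, here is a minimal sketch of how an exception hierarchy could separate recoverable per-event errors from fatal ones. Every name in it (inspector_error, event_parse_error, capture_fatal_error, run_step) is hypothetical and made up for illustration; none of them is taken from sinsp or falco.

```cpp
#include <functional>
#include <iostream>
#include <stdexcept>
#include <string>

// Hypothetical base class for all inspector errors (not part of sinsp).
struct inspector_error : std::runtime_error {
    explicit inspector_error(const std::string& msg) : std::runtime_error(msg) {}
};

// Hypothetical recoverable error: one event could not be parsed.
struct event_parse_error : inspector_error {
    explicit event_parse_error(const std::string& msg) : inspector_error(msg) {}
};

// Hypothetical fatal error: the capture itself is no longer usable.
struct capture_fatal_error : inspector_error {
    explicit capture_fatal_error(const std::string& msg) : inspector_error(msg) {}
};

// Runs one processing step. Returns true on success and false when a
// recoverable error was logged. Any other exception, including
// capture_fatal_error, is not caught here and propagates to the caller.
bool run_step(const std::function<void()>& step) {
    try {
        step();
        return true;
    } catch (const event_parse_error& e) {
        std::cerr << "error: " << e.what() << '\n';
        return false;
    }
}
```

With distinct types, the loop can catch only the recoverable case and leave fatal errors to the main program loop, which is not possible when a single generic exception type covers both.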
Closing this in favor of #758
What type of PR is this?
/kind bug
What this PR does / why we need it:
The problem was that, when entering the event processing loop for sinsp events, sinsp calls the sinsp_parser, which throws an exception during the process_event call, like the one observed by OP in #683.
While an error in parsing an event from sinsp certainly represents a problem for the event itself, it should not stop falco from operating its service.
This PR adds an error detection mechanism that lets falco log errors that occur while handling the current event during the call to sinsp::next, so that the user is informed through an error-level log message.
This kind of behavior should be more common when dealing with non-syscall events, such as container information coming from a webserver or a socket (like the CRI implementation), because the service might not respond correctly to a request or the parser might not be able to parse the response.
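The idea described above can be sketched as a loop that catches a per-event failure, logs it, and moves on to the next event. This is only an illustration under stated assumptions: fake_inspector, loop_stats and inspect_loop are hypothetical stand-ins, not the real sinsp or falco code, and the real sinsp::next has a different signature.

```cpp
#include <cstddef>
#include <exception>
#include <iostream>
#include <stdexcept>
#include <utility>
#include <vector>

// Hypothetical stand-in for the inspector; NOT the real sinsp API.
// Negative entries in the script simulate an event that fails to parse.
class fake_inspector {
public:
    explicit fake_inspector(std::vector<int> script) : m_script(std::move(script)) {}

    bool has_more() const { return m_pos < m_script.size(); }

    // Returns the next event id, or throws when the entry is negative.
    int next() {
        int ev = m_script.at(m_pos++);
        if (ev < 0) {
            throw std::runtime_error("could not parse event payload");
        }
        return ev;
    }

private:
    std::vector<int> m_script;
    std::size_t m_pos = 0;
};

struct loop_stats {
    std::size_t processed = 0;  // events handled without error
    std::size_t errors = 0;     // events skipped after a logged error
};

// Processes every event; a failure in next() is logged and counted,
// and the loop continues with the following event instead of exiting.
loop_stats inspect_loop(fake_inspector& inspector, std::ostream& log) {
    loop_stats stats;
    while (inspector.has_more()) {
        try {
            int ev = inspector.next();
            (void)ev;  // rule evaluation on the event would go here
            ++stats.processed;
        } catch (const std::exception& e) {
            log << "error: " << e.what() << '\n';
            ++stats.errors;
        }
    }
    return stats;
}
```

Because this sketch catches every std::exception, it cannot tell a bad event apart from a failure that should stop the program, which is the limitation raised in the review comments.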
Which issue(s) this PR fixes:
This is related to #683. It doesn't actually fix the underlying problem of not being able to parse huge JSON files, but it at least prevents falco from crashing when that happens.
Special notes for your reviewer:
Does this PR introduce a user-facing change?: