fix(ebpf): treat sched_process_exit corner case #4557

Merged 1 commit on Jan 29, 2025

@geyslan geyslan commented Jan 27, 2025

Close: #4558

1. Explain what the PR does

cccaf7f fix(ebpf): treat sched_process_exit corner cases

The sched_process_exit event may be triggered by a standard exit, such
as a syscall, or by alternative kernel paths, making it unsafe to assume
that it is always associated with a syscall exit.

do_exit and do_group_exit, while typically invoked by the exit and
exit_group syscalls, can also be reached through internal kernel
mechanisms such as signal handling. A concrete example of this occurs
when a syscall returns, enters signal handling, and subsequently calls
do_exit after get_signal. Both get_signal and do_exit involve
tracepoints.

A real execution flow illustrating this scenario in the kernel is as
follows:

entry_SYSCALL_64
  ├── do_syscall_64
  ├── syscall_exit_to_user_mode
  ├── __syscall_exit_to_user_mode_work
  ├── exit_to_user_mode_prepare
  ├── exit_to_user_mode_loop
  ├── arch_do_signal_or_restart
  ├── get_signal  (has signal_deliver tracepoint)
  ├── do_group_exit
  └── do_exit  (has sched_process_exit tracepoint)

2. Explain how to test it

Run this on main first to reproduce the sporadic errors, which show up as negative syscall numbers (triggered by signals):

INSTTESTS="WRITABLE_DATA_SOURCE" ./tests/e2e-inst-test.sh

After that, run the same command against this PR and confirm that no error occurs, since sched_process_exit now submits NO_SYSCALL in such cases.
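
The NO_SYSCALL fallback can be modeled in a few lines. This is a minimal Python sketch for illustration only, not the eBPF C code from this PR: the helper name `normalize_syscall_id` is hypothetical, the value -1 for `NO_SYSCALL` is an assumption, and the actual check in tracee.bpf.c may use a different condition (for example, whether the task exited because of a signal).

```python
# Assumed sentinel value: the PR names NO_SYSCALL but does not show its definition.
NO_SYSCALL = -1


def normalize_syscall_id(raw_id: int) -> int:
    """Return raw_id if it looks like a real syscall number, else NO_SYSCALL.

    Hypothetical helper: real syscall numbers are non-negative, so any
    negative value read at sched_process_exit is treated as "not in a syscall".
    """
    return raw_id if raw_id >= 0 else NO_SYSCALL
```

Under this model, a stray value such as -237 is reported as NO_SYSCALL, while a non-negative syscall number passes through unchanged.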

3. Other comments

@geyslan geyslan requested a review from rscampos January 28, 2025 13:55
@oshaked1
Contributor

@geyslan did you understand why -237 was showing up as the syscall number? From what I understand it should always be -1 when not in syscall context

geyslan commented Jan 28, 2025

> @geyslan did you understand why -237 was showing up as the syscall number? From what I understand it should always be -1 when not in syscall context

I believe this is due to kernel signal handling. These syscalls (futex, nanosleep) are triggered by the Go runtime, and on their return, the process group has already received a signal.

Relevant references:

arch/x86/kernel/signal.c#L333-L341
kernel/signal.c#L3036-L3037

--- EDIT

entry_SYSCALL_64 ->
do_syscall_64 ->
syscall_exit_to_user_mode ->
__syscall_exit_to_user_mode_work ->
exit_to_user_mode_prepare ->
exit_to_user_mode_loop ->
arch_do_signal_or_restart ->
get_signal (has signal_deliver tracepoint) ->
do_group_exit ->
do_exit (has sched_process_exit tracepoint)

Since do_exit() fires the tracepoint partway through its execution, the register we read for the syscall number may not hold a reliable value. Depending on how do_exit() was reached, that register can contain an unrelated value, potentially clobbered by the interrupted syscall or by the signal handler itself.

--- EDIT

Some debugging output

   kworker/dying-49935   [002] .... 17118.817870: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49936   [002] .... 17118.817907: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
           <...>-49929   [002] .... 17118.817919: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49934   [002] .... 17118.817980: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
           <...>-49925   [002] .... 17118.817998: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49926   [002] .... 17118.818015: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49927   [002] .... 17118.818039: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49928   [002] .... 17118.818069: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-47732   [002] .... 17118.818087: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49931   [002] .... 17118.818112: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
   kworker/dying-49932   [002] .... 17118.818132: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
           <...>-47731   [002] .... 17118.818147: 0: sched_process_exit: NOT signaled - syscall: -1, exit_code: 0, termination_type: 0
         
  systemd-tty-ask-54916   [014] .... 17491.417777: 0: sched_process_exit: SIGNALED - syscall: -254, exit_code: 15, termination_type: 0
  systemd-tty-ask-57583   [002] .... 17500.038283: 0: sched_process_exit: SIGNALED - syscall: -254, exit_code: 15, termination_type: 0
  systemd-tty-ask-60344   [006] .... 17508.451852: 0: sched_process_exit: SIGNALED - syscall: -254, exit_code: 15, termination_type: 0
  systemd-tty-ask-54332   [005] .... 17489.782470: 0: sched_process_exit: SIGNALED - syscall: -254, exit_code: 15, termination_type: 0
  
  ds_writer-92434   [001] .... 27440.230472: 0: sched_process_exit: SIGNALED - syscall: -254, exit_code: 0, termination_type: 0
  ds_writer-92420   [005] .... 27440.245846: 0: sched_process_exit: SIGNALED - syscall: -237, exit_code: 0, termination_type: 0
  ds_writer-92438   [011] .... 27440.247073: 0: sched_process_exit: SIGNALED - syscall: -34, exit_code: 0, termination_type: 0
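
To compare runs, lines like the ones above can be tallied by signaled state and syscall value. The following is a rough Python sketch, assuming the exact line layout shown in this debugging output; the names `LINE_RE` and `summarize` are hypothetical and not part of Tracee.

```python
import re
from collections import Counter

# Matches the "sched_process_exit: ..." payload seen in the debugging output.
LINE_RE = re.compile(
    r"sched_process_exit: (?P<state>SIGNALED|NOT signaled) - "
    r"syscall: (?P<syscall>-?\d+), exit_code: (?P<exit_code>-?\d+)"
)


def summarize(lines):
    """Count (signaled, syscall) pairs; lines that do not match are skipped."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match:
            signaled = match["state"] == "SIGNALED"
            counts[(signaled, int(match["syscall"]))] += 1
    return counts


# Three lines copied from the output above.
sample = [
    "kworker/dying-49935   [002] .... 17118.817870: 0: sched_process_exit: "
    "NOT signaled - syscall: -1, exit_code: 0, termination_type: 0",
    "systemd-tty-ask-54916   [014] .... 17491.417777: 0: sched_process_exit: "
    "SIGNALED - syscall: -254, exit_code: 15, termination_type: 0",
    "ds_writer-92420   [005] .... 27440.245846: 0: sched_process_exit: "
    "SIGNALED - syscall: -237, exit_code: 0, termination_type: 0",
]

for (signaled, syscall), count in sorted(summarize(sample).items()):
    print(f"signaled={signaled} syscall={syscall} count={count}")
```

Grouping this way makes the pattern in the output easier to see: the non-signaled exits report -1, while the signaled exits carry other negative values.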

@oshaked1 oshaked1 left a comment

LGTM

@yanivagman yanivagman left a comment

LGTM

geyslan commented Jan 29, 2025

/fast-forward

@github-actions github-actions bot merged commit cccaf7f into aquasecurity:main Jan 29, 2025
41 checks passed
Successfully merging this pull request may close these issues.

WRITABLE_DATA_SOURCE e2e test: invalid syscall id -237