Add the observation function for slow, timeout and error system calls #366
Labels
area/collector
Issues or PRs related to agent metric collector
area/probe
Issues or PRs related to agent probe
enhancement
New feature or request
Background
As the number of applications and services in a system grows, we often need to locate system calls that run for a long time or return errors. These calls are often the bottlenecks behind service or application problems.
Solution
System call information is collected in the kernel using eBPF, then analyzed, processed and correlated in user space. Long-running system calls are divided into slow system calls and timeout system calls. A slow system call is detected immediately, when its exit event is matched with its enter event. A timeout system call is detected by a timed polling mechanism. Both the timeout of the timeout system call and the slow threshold of the slow system call are configured by the user.
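The user-space matching described above could look roughly like the following. This is only a minimal sketch to illustrate the idea, not the proposed implementation: the class name `SyscallTracker`, its fields `slow_threshold_ns` and `timeout_ns`, the `(tid, name)` key, and the choice to drop an entry once it has been reported as timed out are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

Key = Tuple[int, str]            # (thread id, syscall name), hypothetical key
Report = Tuple[int, str, int]    # (thread id, syscall name, elapsed ns)


@dataclass
class SyscallTracker:
    slow_threshold_ns: int       # user-configured slow threshold (assumed unit: ns)
    timeout_ns: int              # user-configured timeout (assumed unit: ns)
    # Enter events that have not yet seen a matching exit, with enter timestamp.
    pending: Dict[Key, int] = field(default_factory=dict)

    def on_enter(self, tid: int, name: str, ts_ns: int) -> None:
        """Record the enter event so it can be matched with its exit later."""
        self.pending[(tid, name)] = ts_ns

    def on_exit(self, tid: int, name: str, ts_ns: int) -> Optional[Report]:
        """Match exit with enter; report a slow syscall if over the threshold."""
        start = self.pending.pop((tid, name), None)
        if start is None:
            # No matching enter (never seen, or already reported as a timeout).
            return None
        elapsed = ts_ns - start
        if elapsed >= self.slow_threshold_ns:
            return (tid, name, elapsed)
        return None

    def poll_timeouts(self, now_ns: int) -> List[Report]:
        """Called periodically; report syscalls that entered but never exited in time."""
        timed_out: List[Report] = []
        for (tid, name), start in list(self.pending.items()):
            if now_ns - start >= self.timeout_ns:
                timed_out.append((tid, name, now_ns - start))
                # Assumption: report each timeout once, then forget the entry.
                del self.pending[(tid, name)]
        return timed_out
```

In this sketch, `on_enter` and `on_exit` would be driven by the events delivered from the eBPF probe, and `poll_timeouts` would be called from a timer. Whether a syscall that has already been reported as a timeout should also be reported as slow when it eventually exits is left open here.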
Features
[1] Modify the configuration file structure and add parameter fields for subscription events.
[2] Implement the Slow Syscall function.
[3] Implement the Timeout Syscall function.