IO Tracing #194
Block level I/O

Block level I/O can be traced using these two tracepoints:
Writing read begin and read end events from those tracepoints in process mode looks relatively easy. However, we cannot use begin-end records in system mode because there is no total order of block I/O issues and completions, so writing each event as a sample is the best that we can do. biosnoop from the bcc toolkit uses kprobes instead of tracepoints, but as far as I can see the kprobes are not that different from the tracepoints above. For now we should stick with tracepoints, I think, because they don't require set-up with …
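For reference, here is a minimal sketch of attaching to a block-layer tracepoint through `perf_event_open`. The issue text doesn't spell out which two tracepoints are meant; `block:block_rq_issue` and `block:block_rq_complete` are the usual pair and are an assumption here, as is the tracefs mount point (older setups use `/sys/kernel/debug/tracing`). This is not how lo2s wires up perf internally, just an illustration of the mechanism.

```cpp
// Sketch: attach to block I/O tracepoints via perf_event_open.
// Assumption: the two tracepoints meant above are block:block_rq_issue
// and block:block_rq_complete; the issue does not name them explicitly.
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <unistd.h>

// Read the numeric tracepoint id from tracefs (older kernels/mounts:
// /sys/kernel/debug/tracing/events/...).
static int64_t tracepoint_id(const std::string& category, const std::string& name)
{
    std::ifstream id_file("/sys/kernel/tracing/events/" + category + "/" + name + "/id");
    int64_t id = -1;
    id_file >> id;
    return id;
}

// Open a system-wide perf event for one tracepoint on one CPU
// (requires root or CAP_PERFMON).
static int open_tracepoint(int64_t id, int cpu)
{
    struct perf_event_attr attr;
    std::memset(&attr, 0, sizeof(attr));
    attr.type = PERF_TYPE_TRACEPOINT;
    attr.size = sizeof(attr);
    attr.config = static_cast<uint64_t>(id); // tracepoint id from tracefs
    attr.sample_period = 1;                  // record every event
    attr.sample_type = PERF_SAMPLE_TIME | PERF_SAMPLE_RAW;
    attr.disabled = 1;

    return static_cast<int>(syscall(SYS_perf_event_open, &attr, -1, cpu, -1, 0));
}

int main()
{
    int64_t issue = tracepoint_id("block", "block_rq_issue");
    int64_t complete = tracepoint_id("block", "block_rq_complete");
    if (issue < 0 || complete < 0)
    {
        std::cerr << "could not read tracepoint ids (is tracefs mounted?)\n";
        return 1;
    }

    int fd_issue = open_tracepoint(issue, 0);
    int fd_complete = open_tracepoint(complete, 0);
    std::cout << "perf fds: " << fd_issue << ", " << fd_complete << "\n";

    // A real tool would mmap each fd, enable the events and read
    // PERF_RECORD_SAMPLE entries (with the raw tracepoint payload) from the
    // ring buffer; those samples carry timestamps, which fits the
    // "write it as a sample" approach for system mode.
    return 0;
}
```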
File level I/O

File level I/O should be traceable by using:
At least it would be nice if it worked like this, because these seem to be the only file-level tracepoints I've seen. However, open, read, write, and close definitely don't cover the whole zoo of file-level operations, and we will probably miss a bunch (mmap? the dozen different variants of those syscalls like openat/writev …? things that bypass the classical POSIX interface altogether?). Alternatively, there is kprobe-based tracing directly at the virtual file system layer, using kprobes on vfs_open/vfs_read/vfs_write/vfs_close. But as I already said in the comment above, non-sucky kprobe support in lo2s might be an absolute b*****, and it might struggle with the issue that some information is only reachable through pointers to kernel memory, like a …
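To make the "kprobes need extra set-up" point concrete, here is a minimal sketch of registering a kprobe on `vfs_read` through the tracefs `kprobe_events` interface, which is one common way to create a kprobe that can then be consumed like any other tracepoint (e.g. with `perf_event_open`, as in the sketch above). The probe name `lo2s_vfs_read` is made up for illustration; pulling argument data out of kernel pointers would need additional fetch-args in the probe definition, which is exactly the awkward part mentioned above.

```cpp
// Sketch: register a kprobe on vfs_read via the tracefs kprobe_events file.
// The probe name "lo2s_vfs_read" is hypothetical; requires root.
#include <fstream>
#include <iostream>
#include <string>

static bool register_kprobe(const std::string& symbol, const std::string& probe_name)
{
    // Appending "p:<name> <symbol>" creates events/kprobes/<name> in tracefs.
    std::ofstream kprobe_events("/sys/kernel/tracing/kprobe_events", std::ios::app);
    if (!kprobe_events)
        return false;
    kprobe_events << "p:" << probe_name << " " << symbol << "\n";
    return static_cast<bool>(kprobe_events.flush());
}

int main()
{
    if (!register_kprobe("vfs_read", "lo2s_vfs_read"))
    {
        std::cerr << "registering kprobe failed (needs root and tracefs)\n";
        return 1;
    }

    // The new kprobe now behaves like a tracepoint
    // (events/kprobes/lo2s_vfs_read) and can be attached with perf_event_open.
    // Cleanup: write "-:lo2s_vfs_read" to kprobe_events.
    std::cout << "kprobe registered\n";
    return 0;
}
```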
A detailed overview of the storage stack in Linux: https://www.thomas-krenn.com/en/wiki/Linux_Storage_Stack_Diagram
(Information based on "Understanding the Linux Kernel", which covers 2.6, and the kernel source code for 5.something)

Is there an advantage to tracing vfs_open/vfs_read/... over just tracing the syscalls?
No. The only thing vfs_open/vfs_read etc. apparently do is wrap the open/read/... syscalls.

Is there a generic layer below vfs_open/vfs_read without cache effects?
No. The only thing …
I hope this is a half-way legible representation of what I've learned about the fs stack this week. The arrow labeled "Probe Here?", which is the point at which the fs-dependent readpage() operation is called in generic_file_buffered_read(), would be the place where we could learn whether a read on a disk-based filesystem* actually triggered a read from disk or was served entirely from the page cache. This has the problem that while the … Instrumenting the …

*if the disk-based fs actually uses generic_file_read_iter(), which is almost all of them, but not all.
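Before settling on such a probe point, it would also be worth checking whether the candidate symbols even exist and are traceable on the running kernel, since the buffered-read path changes between versions (e.g. whether generic_file_buffered_read() is a distinct symbol, or what a given filesystem's readpage implementation is called). Here is a small sketch, using `available_filter_functions` as a rough proxy for "this symbol exists and can be hooked"; the candidate names below are assumptions taken from the discussion, not a definitive list.

```cpp
// Sketch: check whether candidate probe symbols are listed in tracefs'
// available_filter_functions (a rough proxy for "exists and is traceable"
// on the running kernel; the actual kprobe blacklist is separate).
#include <fstream>
#include <iostream>
#include <string>

static bool symbol_listed(const std::string& symbol)
{
    std::ifstream functions("/sys/kernel/tracing/available_filter_functions");
    std::string line;
    while (std::getline(functions, line))
    {
        // Entries may carry a " [module]" suffix, so compare the first token only.
        if (line.substr(0, line.find_first_of(" \t")) == symbol)
            return true;
    }
    return false;
}

int main()
{
    // Candidate probe points from the discussion above; which of these exist
    // depends heavily on kernel version and filesystem.
    for (const std::string symbol :
         { "generic_file_buffered_read", "generic_file_read_iter", "ext4_readpage" })
    {
        std::cout << symbol << ": " << (symbol_listed(symbol) ? "listed" : "not found") << "\n";
    }
    return 0;
}
```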
Actually, it seems like nobody is using the …
Focus on block level I/O and file level I/O first.