[Draft] Kallsyms Symbol Finder #351
Closed
brenns10 force-pushed the kallsyms_finder branch from 2691665 to a0bad28 on September 12, 2023 01:15
brenns10 force-pushed the kallsyms_finder branch 2 times, most recently from cd84d55 to 73f6a5c on October 21, 2023 06:34
brenns10 force-pushed the kallsyms_finder branch from 73f6a5c to ac8b05a on December 8, 2023 23:52
brenns10 force-pushed the kallsyms_finder branch 3 times, most recently from bc52003 to 78cec7a on March 1, 2024 23:55
By using __attribute__((__packed__)), we shrink each enum from the default integer size of four bytes, down to the minimum size of one. This reduces the size of drgn_symbol from 32 bytes down to 26, with 6 bytes of padding. It doesn't have a practical benefit yet, but adding fields to struct drgn_symbol in the future may not increase the size. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
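The size arithmetic in this commit message can be checked with a small `ctypes` sketch. The field list below is an assumption made for illustration (three 8-byte fields followed by two enums on a 64-bit target); it is not copied from libdrgn, and `c_uint8` merely stands in for a packed one-byte enum.

```python
import ctypes


class SymbolIntEnums(ctypes.Structure):
    # Hypothetical layout with default, int-sized enums.
    _fields_ = [
        ("name", ctypes.c_uint64),     # stand-in for a 64-bit pointer
        ("address", ctypes.c_uint64),
        ("size", ctypes.c_uint64),
        ("binding", ctypes.c_int),     # unpacked enum: 4 bytes
        ("kind", ctypes.c_int),        # unpacked enum: 4 bytes
    ]


class SymbolPackedEnums(ctypes.Structure):
    # Same hypothetical layout with one-byte (packed) enums.
    _fields_ = [
        ("name", ctypes.c_uint64),
        ("address", ctypes.c_uint64),
        ("size", ctypes.c_uint64),
        ("binding", ctypes.c_uint8),   # packed enum: 1 byte
        ("kind", ctypes.c_uint8),      # packed enum: 1 byte
    ]


def payload_bytes(struct_type):
    """Sum of the field sizes, ignoring any padding."""
    return sum(ctypes.sizeof(ftype) for _, ftype in struct_type._fields_)


def tail_padding(struct_type):
    """Bytes the compiler-style layout adds beyond the fields themselves."""
    return ctypes.sizeof(struct_type) - payload_bytes(struct_type)
```

On a typical 64-bit platform the packed variant still occupies the same total size because of alignment, which is why the message says the change has no practical benefit yet: the freed bytes only become useful once new small fields are added.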
Symbol lookup is not yet modular, like type or object lookup. However, making it modular would enable easier development and prototyping of alternative Symbol providers, such as Linux kernel module symbol tables, vmlinux kallsyms tables, and BPF function symbols. To begin with, create a modular Symbol API within libdrgn, and refactor the ELF symbol search to use it. For now, we leave drgn_program_find_symbol_by_address_internal() alone. Its conversion will require some surgery, since the new API can return errors, whereas this function cannot. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
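As a rough illustration of what "modular" means here, the sketch below keeps a list of finder callbacks and asks each one in turn. It is a plain-Python model, not the libdrgn API: `SymbolRegistry`, `add_finder`, `search`, and `make_table_finder` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Symbol:
    name: str
    address: int
    size: int


# A finder takes an optional name and an optional address and returns matches.
Finder = Callable[[Optional[str], Optional[int]], List[Symbol]]


class SymbolRegistry:
    """Hypothetical registry that consults every registered finder in order."""

    def __init__(self) -> None:
        self._finders: List[Finder] = []

    def add_finder(self, finder: Finder) -> None:
        self._finders.append(finder)

    def search(self, name=None, address=None, one=False) -> List[Symbol]:
        results: List[Symbol] = []
        for finder in self._finders:
            results.extend(finder(name, address))
            if one and results:
                # Caller only wants a single match: stop at the first hit.
                return results[:1]
        return results


def make_table_finder(table: List[Symbol]) -> Finder:
    """Build a finder backed by a fixed list of symbols."""

    def finder(name, address):
        return [
            s for s in table
            if (name is None or s.name == name)
            and (address is None or s.address <= address < s.address + s.size)
        ]

    return finder
```

With this shape, an ELF-backed finder and, say, a kallsyms-backed finder are just two entries in the same list, and the caller does not need to know which one produced a result.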
The following commit will modify it to use drgn_program_symbols_search(), a static function declared below. Move it underneath in preparation. No changes to the function. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
The drgn_program_find_symbol_by_address_internal() function is used when libdrgn itself may want to look up a symbol: in particular, when formatting stack traces or objects. It does less work by possibly already having a Dwfl_Module looked up, and by avoiding memory allocation of a symbol, and it's more convenient because it doesn't return any errors, including on lookup failure. Unfortunately, the new symbol finder API breaks all of these properties: the returned symbol is now allocated via malloc() which needs cleanup on error, and errors can be returned by any finder via the lookup API. What's more, the finder API doesn't allow specifying an already-known module. Thankfully, error handling can be improved using the cleanup API, and looking up a module for an address is usually a reasonably cheap binary tree operation. Switch the internal method over to the new finder API. The major difference now is that lookup failures don't result in an error: they result in a NULL symbol. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
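The "NULL symbol instead of an error" convention is convenient for formatting code, which can fall back to a raw address. A minimal Python sketch of that pattern follows; `format_address` and the `Lookup` callback type are hypothetical and only model the idea, with `None` playing the role of the NULL symbol.

```python
from typing import Callable, Optional, Tuple

# A lookup returns (symbol name, symbol start address), or None when nothing
# covers the address. None models the NULL symbol described above.
Lookup = Callable[[int], Optional[Tuple[str, int]]]


def format_address(address: int, lookup: Lookup) -> str:
    """Render an address as name+offset, or as plain hex if no symbol is found."""
    found = lookup(address)
    if found is None:
        # A failed lookup is not an error: just print the raw address.
        return f"0x{address:x}"
    name, start = found
    return f"{name}+0x{address - start:x}"
```

Real errors raised by a finder would still propagate to the caller in this model; only the "not found" case is folded into the return value.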
Now that the symbol finder API is created, we can move the ELF symbol implementation into the debug_info.c file, where it more logically belongs. The only change to these functions in the move is to declare elf_symbols_search as static. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Previously, Symbol objects could not be constructed in Python. However, in order to allow Python Symbol finders, this needs to be changed. Unfortunately, Symbol name lifetimes are tricky to manage. We introduce a lifetime enumeration to handle this. The lifetime may be "static", i.e. longer than the life of the program; "external", i.e. longer than the life of the symbol, but no guarantees beyond that; or "owned", i.e. owned by the Symbol itself. Symbol objects constructed in Python are "external". The Symbol struct owns the pointer to the drgn_symbol, and it holds a reference to the Python object keeping the name valid (either the program, or a PyUnicode object). The added complexity is justified by the fact that most symbols are from the ELF file, and thus share a lifetime with the Program. It would be a waste to constantly strdup() these strings, just to support a small number of Symbols created by Python code. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
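The three lifetimes can be summarized by what has to happen when a symbol is destroyed. The sketch below models that decision in Python; the names `NameLifetime`, `SymbolHandle`, and `release` are invented for illustration, and the returned strings are only labels for the action a C implementation would presumably take.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class NameLifetime(Enum):
    STATIC = auto()    # name outlives the program
    EXTERNAL = auto()  # name outlives the symbol; something else keeps it valid
    OWNED = auto()     # name belongs to the symbol itself


@dataclass
class SymbolHandle:
    name: str
    lifetime: NameLifetime
    # For EXTERNAL names: the object (e.g. a program or a string object)
    # whose reference keeps the name valid.
    keepalive: Optional[object] = None


def release(sym: SymbolHandle) -> str:
    """Describe, and in the EXTERNAL case perform, the cleanup for a symbol."""
    if sym.lifetime is NameLifetime.OWNED:
        return "free name"
    if sym.lifetime is NameLifetime.EXTERNAL:
        sym.keepalive = None  # drop the reference that pinned the name
        return "drop keepalive reference"
    return "nothing"
```

The point of the extra state is the common case: names that come from the ELF file need neither a copy nor a free.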
Expose the Symbol finder API so that Python code can be used to look up additional symbols by name or address. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Specify a "fake" symbol finder and then test that its results are plumbed through the API successfully. While this is a contrived test, it helps build confidence in the plumbing of the API. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
The Linux kernel can be configured to include kallsyms, a built-in compressed symbol table which is also exposed at /proc/kallsyms. The symbol table contains most (but not all) of the ELF symbol table information. It can be used as a Symbol finder. The kallsyms information can be extracted in two ways: for live systems where we have root access, the simplest approach is to read /proc/kallsyms. For vmcores, or live systems where we are not root, we must parse the data from the vmcore, which is significantly more involved. To avoid tying the kallsyms system too deeply into the drgn internals, the finder is exposed as a Python class, which must be created using symbol information from the vmcoreinfo. Attaching the KallsymsFinder to the program will attach the underlying C function, so we can avoid some of the inefficiencies of the Python API. Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
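For the text-based path, each line of /proc/kallsyms holds a hexadecimal address, a one-character type, a name, and optionally a bracketed module name. The following is a minimal sketch of a parser for that format, not the code from this branch; `KallsymsEntry`, `parse_kallsyms`, and the sample lines are made up for illustration.

```python
from typing import Iterable, List, NamedTuple, Optional


class KallsymsEntry(NamedTuple):
    address: int
    kind: str               # one-character type code, e.g. "T" or "t"
    name: str
    module: Optional[str]   # None for symbols built into the kernel


def parse_kallsyms(lines: Iterable[str], include_modules: bool = False) -> List[KallsymsEntry]:
    """Parse /proc/kallsyms-style text lines into entries."""
    entries: List[KallsymsEntry] = []
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        module = fields[3].strip("[]") if len(fields) > 3 else None
        if module is not None and not include_modules:
            continue  # kernel-only by default
        entries.append(KallsymsEntry(int(fields[0], 16), fields[1], fields[2], module))
    return entries


# Invented sample data in the /proc/kallsyms layout.
SAMPLE = (
    "ffffffff81000000 T _text\n"
    "ffffffff81001000 t do_thing\n"
    "ffffffffc0000000 t mod_func\t[demo_mod]\n"
)
```

On a live system the input would come from reading the file, for example `parse_kallsyms(open("/proc/kallsyms"))`, subject to the permissions caveats discussed in this pull request.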
brenns10 force-pushed the kallsyms_finder branch from 78cec7a to 06f5e86 on March 2, 2024 00:48
Closing this because it's got a noisy history and outdated description. I will create a new pull request with the kallsyms code.
Now that #241 is no longer a draft, I'm putting the next branch that builds upon it here for easy review. Unfortunately I can't set the base branch to be my own `symbol_finder` branch, so the PR currently includes the changes from #241 as well.

This branch allows the built-in kallsyms information to be used as a symbol table. For best results, it should be used with `CONFIG_KALLSYMS_ALL`. This only provides symbols for the kernel: no modules. There are two ways to support this:

- [...] `/proc/kallsyms`. This works on practically any kernel version!
- [...] `/proc/kcore` is unavailable, maybe due to permissions, see "Add the ability to run drgn against the live kernel as non-root user" #347), we can parse the data structures that contain the kallsyms info. This requires some upstream changes which were merged back in v6.0, which add symbol information into the vmcoreinfo note. In particular, if `f09bddbd8661 ("vmcoreinfo: add kallsyms_num_syms symbol")` is present, then this should work.

The API I used here is to make the kallsyms finder represented as a Python object, which can be registered via `add_symbol_finder()`. I didn't want to hook into any of the `add_debug_info()` logic because I wanted maximum flexibility - most people won't want kallsyms, at least not initially. It also has the benefit of avoiding breaking any existing logic.

This can be used on Oracle Linux 7-9 with UEK 5-7, but it can also be used on the vmtest kernels, which serves as a good way to explore:
No automatic testing just yet (waiting on Symbol Finder API to be stabilized and merged). However, it will be interesting to test, since ideally we would want to test the text-based and vmcore-based parsing methods. I may want to add a toggle to allow bypassing `/proc/kcore` so that we can test the other method.

Some notes on fixes / To-dos for this branch:
- [...] `/proc/kallsyms` where all the addresses are zero, and bail out of that code path, since non-root users can still read it without memory addresses.
- [...] `KallsymsFinder()` constructor is bad. I wanted to move additional parsing of the vmcoreinfo note out into the Python code. Now, I think `libdrgn/kallsyms.c` should be able to find the necessary information from the vmcoreinfo without the Python helper code.
- [...] `Fixes:` tag to specify a `NUMBER(kallsyms_version)=2` in the vmcoreinfo, that way we could detect it without version number hacks.
- [...] `bsearch()`. However the name lookup is currently linear. I need to add a hash table.
- [...] `ctf` branch which implements a module kallsyms finder. That might be worth porting to C later on.
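Several of the to-do items above concern lookup structure: a binary search over addresses, a hash table for names, and detecting a table whose addresses are all zero. A compact Python sketch of those three ideas follows. It is an illustration under stated assumptions, not the planned C implementation: `SymbolIndex` and its methods are hypothetical names, `bisect` stands in for `bsearch()`, and a `dict` stands in for the hash table.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class SymbolIndex:
    """Index (address, name) pairs for lookup by address or by name."""

    def __init__(self, symbols: List[Tuple[int, str]]) -> None:
        self._sorted = sorted(symbols)
        self._addresses = [address for address, _ in self._sorted]
        # Name lookup through a dict avoids a linear scan of the table.
        self._by_name: Dict[str, List[int]] = {}
        for address, name in self._sorted:
            self._by_name.setdefault(name, []).append(address)

    def all_addresses_zero(self) -> bool:
        """True when the table is non-empty and carries no usable addresses."""
        # The list is sorted, so the largest address is the last one.
        return bool(self._addresses) and self._addresses[-1] == 0

    def by_address(self, address: int) -> Optional[Tuple[int, str]]:
        """Return the last symbol starting at or before the given address."""
        i = bisect.bisect_right(self._addresses, address) - 1
        return self._sorted[i] if i >= 0 else None

    def by_name(self, name: str) -> List[int]:
        """Return every address recorded for the given name."""
        return self._by_name.get(name, [])
```

Because this sketch stores no symbol sizes, `by_address` returns the nearest preceding symbol rather than checking that the address falls inside it; a caller that checks `all_addresses_zero()` first can bail out to another data source before trusting any result.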