kallsyms: add symbol finder for live & coredump
The Linux kernel can be configured to include kallsyms, a built-in
compressed symbol table which is also exposed at /proc/kallsyms. The
symbol table contains most (but not all) of the ELF symbol table
information. It can be used as a symbol finder.
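
The compression mentioned above is a token-substitution scheme. The following is a minimal sketch, assuming the classic layout handled by the kernel's `kallsyms_expand_symbol()`: each entry in `kallsyms_names` is one length byte followed by that many token indexes, and the first expanded character is the nm-style type code. Newer kernels may encode long lengths differently, and the helper names here are illustrative, not drgn's.

```python
from typing import List, Tuple


def build_tables(tokens: List[bytes]) -> Tuple[bytes, List[int]]:
    """Pack tokens as NUL-terminated strings and record where each starts."""
    table = bytearray()
    index = []
    for tok in tokens:
        index.append(len(table))
        table += tok + b"\0"
    return bytes(table), index


def expand_symbol(
    names: bytes, offset: int, token_table: bytes, token_index: List[int]
) -> Tuple[str, str, int]:
    """Decode one entry; return (type_code, name, offset_of_next_entry)."""
    length = names[offset]  # assumption: a single length byte per entry
    offset += 1
    out = bytearray()
    for tok in names[offset : offset + length]:
        start = token_index[tok]
        end = token_table.index(b"\0", start)
        out += token_table[start:end]
    text = out.decode("ascii")
    return text[0], text[1:], offset + length


# Toy data: six tokens and two compressed entries.
TOKEN_TABLE, TOKEN_INDEX = build_tables(
    [b"T", b"start_", b"kernel", b"t", b"do_", b"init"]
)
NAMES = bytes([3, 0, 1, 2, 3, 3, 4, 5])
```

Calling `expand_symbol(NAMES, 0, TOKEN_TABLE, TOKEN_INDEX)` decodes the first toy entry; passing the returned offset back in decodes the next one.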

The kallsyms information can be extracted in two ways: for live systems
where we have root access, the simplest approach is to read
/proc/kallsyms. For vmcores, or live systems where we are not root, we
must parse the data from the vmcore, which is significantly more
involved.
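
For the live-system case, the text format of /proc/kallsyms is simple to parse. The following is a rough sketch, not the code from this commit: it assumes the usual `address type name [module]` line layout, and the sample text and helper names are made up for illustration. It also shows why root matters: unprivileged readers typically see every address as zero.

```python
from typing import Iterator, List, NamedTuple, Optional


class KallsymsLine(NamedTuple):
    address: int
    type_code: str  # nm(1)-style type letter, e.g. "T" or "t"
    name: str
    module: Optional[str]  # None for symbols built into vmlinux


def parse_kallsyms(text: str) -> Iterator[KallsymsLine]:
    """Parse the text of /proc/kallsyms, one symbol per line."""
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        module = fields[3].strip("[]") if len(fields) > 3 else None
        yield KallsymsLine(int(fields[0], 16), fields[1], fields[2], module)


def addresses_hidden(entries: List[KallsymsLine]) -> bool:
    """True if every address is zero, as seen without sufficient privileges."""
    return bool(entries) and all(e.address == 0 for e in entries)


# Made-up sample in the /proc/kallsyms layout.
SAMPLE = (
    "ffffffff81000000 T _stext\n"
    "ffffffff81001000 t do_one_initcall\n"
    "ffffffffc0201000 t nvme_probe\t[nvme]\n"
)
```

On a real system the input would come from `open("/proc/kallsyms").read()`; if `addresses_hidden()` returns true, the vmcore-style parsing described above is the remaining option.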

To avoid tying the kallsyms system too deeply into the drgn internals,
the finder is exposed as a Python class, which must be created using
symbol information from the vmcoreinfo. Attaching the KallsymsFinder to
the program will attach the underlying C function, so we can avoid some
of the inefficiencies of the Python API.

Signed-off-by: Stephen Brennan <stephen.s.brennan@oracle.com>
brenns10 committed Mar 2, 2024
1 parent 85f31e6 commit 06f5e86
Showing 11 changed files with 1,320 additions and 2 deletions.
67 changes: 67 additions & 0 deletions _drgn.pyi
@@ -1612,6 +1612,73 @@ class Symbol:
    kind: Final[SymbolKind]
    """Kind of entity represented by this symbol."""

class KallsymsFinder:
    """
    A symbol finder which uses vmlinux kallsyms data
    """

    def __init__(
        self,
        prog: Program,
        kallsyms_names: int,
        kallsyms_token_table: int,
        kallsyms_token_index: int,
        kallsyms_num_syms: int,
        kallsyms_offsets: int,
        kallsyms_relative_base: int,
        kallsyms_addresses: int,
        _stext: int,
    ) -> None:
        """
        Manually construct a ``KallsymsFinder`` given all symbol addresses.

        .. note::

            This class should not normally be instantiated manually. See
            :func:`drgn.helpers.linux.kallsyms.make_kallsyms_vmlinux_finder`
            instead for a way of automatically creating the finder via
            information found in the ``VMCOREINFO``.

        The finder is capable of searching the compressed table of symbol names
        and addresses stored within kernel memory. It requires
        ``CONFIG_KALLSYMS=y`` and ``CONFIG_KALLSYMS_ALL=y`` in your kernel
        configuration -- this is common on desktop and server Linux
        distributions. However, the quality of symbol information is limited:
        the :attr:`Symbol.binding` and :attr:`Symbol.kind` values are inferred
        from type code information provided by kallsyms, which was originally
        generated by ``nm(1)``. Further, the :attr:`Symbol.size` is computed
        using the offset of the next symbol after it in memory. This can create
        some unusual results.

        In order to create a ``KallsymsFinder``, drgn must know the location of
        several symbols, which creates a bit of a chicken-and-egg problem.
        Thankfully, starting with Linux 6.0, these symbol addresses are included
        in the VMCOREINFO note. The required symbols are addresses of variables
        in the vmcore:

        - ``kallsyms_names``: an array of compressed symbol name data.
        - ``kallsyms_token_table``, ``kallsyms_token_index``: tables used in
          decompressing symbol names.
        - ``kallsyms_num_syms``: the number of kallsyms symbols.
        - ``_stext``: the start of the kernel text segment. This symbol address
          is necessary for verifying decoded kallsyms data.

        Depending on the way that kallsyms is configured (see
        ``CONFIG_KALLSYMS_ABSOLUTE_PERCPU`` and
        ``CONFIG_KALLSYMS_BASE_RELATIVE``), some of the following symbols are
        also needed. Any symbol that is not present should be given as zero.

        - ``kallsyms_offsets``
        - ``kallsyms_relative_base``
        - ``kallsyms_addresses``

        :param prog: Program to create a finder for
        :returns: A callable object suitable to provide to
          :meth:`Program.add_symbol_finder()`.
        """
    __call__: Callable[[Optional[str], Optional[int], bool], List[Symbol]]
    """Lookup symbol by name, address, or both."""

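The docstring above mentions two mechanisms that a small example may make more concrete: how the address table depends on the kallsyms configuration, and how sizes are inferred from the next symbol. The sketch below is not drgn's C implementation. The address decoding assumes the scheme used by the kernel's `kallsyms_sym_address()` for base-relative tables, and the treatment of the last symbol and of aliases is an assumption made here for illustration.

```python
from typing import List


def decode_addresses(
    offsets: List[int], relative_base: int, absolute_percpu: bool
) -> List[int]:
    """Turn signed 32-bit kallsyms_offsets entries into addresses."""
    addresses = []
    for off in offsets:
        if not absolute_percpu:
            # Offsets are unsigned distances from the relative base.
            addresses.append(relative_base + (off & 0xFFFFFFFF))
        elif off >= 0:
            # Non-negative values are absolute (per-CPU) addresses.
            addresses.append(off)
        else:
            # Negative values encode a distance from the relative base.
            addresses.append(relative_base - 1 - off)
    return addresses


def infer_sizes(addresses: List[int]) -> List[int]:
    """Size = distance to the next entry in an address-sorted table.

    Assumption for this sketch: the last symbol gets size 0, and aliases
    (two names at one address) also end up with size 0.
    """
    sizes = []
    for i, addr in enumerate(addresses):
        if i + 1 < len(addresses):
            sizes.append(addresses[i + 1] - addr)
        else:
            sizes.append(0)
    return sizes


BASE = 0xFFFFFFFF81000000  # made-up relative base
ADDRS = decode_addresses([0x0, 0x1000, 0x1000, 0x1800], BASE, False)
SIZES = infer_sizes(ADDRS)
```

The zero-size alias in `SIZES` is one example of the "unusual results" the docstring warns about.
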
class SymbolBinding(enum.Enum):
    """
    A ``SymbolBinding`` describes the linkage behavior and visibility of a
1 change: 1 addition & 0 deletions docs/api_reference.rst
@@ -108,6 +108,7 @@ Symbols
.. drgndoc:: Symbol
.. drgndoc:: SymbolBinding
.. drgndoc:: SymbolKind
.. drgndoc:: KallsymsFinder

Stack Traces
------------
2 changes: 2 additions & 0 deletions drgn/__init__.py
@@ -52,6 +52,7 @@
FaultError,
FindObjectFlags,
IntegerLike,
KallsymsFinder,
Language,
MissingDebugInfoError,
NoDefaultProgramError,
@@ -105,6 +106,7 @@
"FaultError",
"FindObjectFlags",
"IntegerLike",
"KallsymsFinder",
"Language",
"MissingDebugInfoError",
"NULL",
58 changes: 58 additions & 0 deletions drgn/helpers/linux/kallsyms.py
@@ -0,0 +1,58 @@
#!/usr/bin/env python3
# Copyright (c) 2023 Oracle and/or its affiliates
# SPDX-License-Identifier: LGPL-2.1-or-later
"""
Kallsyms
--------

The kallsyms module contains helpers which allow you to use the built-in
kallsyms symbol table for drgn object lookup. Combined with an alternative type
information source, this can enable debugging Linux kernel core dumps without
the corresponding DWARF debuginfo files.
"""
import re
from typing import Dict

from drgn import KallsymsFinder, Program

__all__ = ("make_kallsyms_vmlinux_finder",)


def _vmcoreinfo_symbols(prog: Program) -> Dict[str, int]:
    vmcoreinfo_data = prog["VMCOREINFO"].string_().decode("ascii")
    vmcoreinfo_symbols = {}
    sym_re = re.compile(r"SYMBOL\(([^)]+)\)=([A-Fa-f0-9]+)")
    for line in vmcoreinfo_data.strip().split("\n"):
        match = sym_re.fullmatch(line)
        if match:
            vmcoreinfo_symbols[match.group(1)] = int(match.group(2), 16)
    return vmcoreinfo_symbols


def make_kallsyms_vmlinux_finder(prog: Program) -> KallsymsFinder:
    """
    Create a vmlinux kallsyms finder, which may be passed to
    :meth:`drgn.Program.add_symbol_finder`.

    This function automatically finds the necessary information to create a
    ``KallsymsFinder`` from the program's VMCOREINFO data. It may fail if the
    information is not present. Please note that the debugged Linux kernel must
    be 6.0 or later to find this information.

    :returns: a callable symbol finder object
    """
    symbol_reqd = [
        "kallsyms_names",
        "kallsyms_token_table",
        "kallsyms_token_index",
        "kallsyms_num_syms",
        "kallsyms_offsets",
        "kallsyms_relative_base",
        "kallsyms_addresses",
        "_stext",
    ]
    symbols = _vmcoreinfo_symbols(prog)
    args = []
    for sym in symbol_reqd:
        args.append(symbols.get(sym, 0))
    return KallsymsFinder(prog, *args)
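
To see what the helper above extracts without loading a vmcore, the same regular expression can be run over a plain string. This is a standalone sketch: the VMCOREINFO text and its addresses are made up, and the function names are illustrative rather than part of drgn. It models a base-relative kernel, so `kallsyms_addresses` is absent and falls back to zero, as the ``KallsymsFinder`` docstring asks.

```python
import re
from typing import Dict, List

SYM_RE = re.compile(r"SYMBOL\(([^)]+)\)=([A-Fa-f0-9]+)")

# Same order as the KallsymsFinder constructor arguments after `prog`.
REQUIRED = [
    "kallsyms_names",
    "kallsyms_token_table",
    "kallsyms_token_index",
    "kallsyms_num_syms",
    "kallsyms_offsets",
    "kallsyms_relative_base",
    "kallsyms_addresses",
    "_stext",
]


def vmcoreinfo_symbols(text: str) -> Dict[str, int]:
    """Collect SYMBOL(name)=hex lines; other VMCOREINFO lines are ignored."""
    symbols = {}
    for line in text.strip().split("\n"):
        match = SYM_RE.fullmatch(line)
        if match:
            symbols[match.group(1)] = int(match.group(2), 16)
    return symbols


def finder_args(text: str) -> List[int]:
    """Build the positional address arguments, using 0 for missing symbols."""
    symbols = vmcoreinfo_symbols(text)
    return [symbols.get(name, 0) for name in REQUIRED]


# Made-up VMCOREINFO excerpt for illustration only.
SAMPLE = """\
OSRELEASE=6.1.0
PAGESIZE=4096
SYMBOL(_stext)=ffffffff81000000
SYMBOL(kallsyms_names)=ffffffff82100000
SYMBOL(kallsyms_token_table)=ffffffff82200000
SYMBOL(kallsyms_token_index)=ffffffff82201000
SYMBOL(kallsyms_num_syms)=ffffffff820ff000
SYMBOL(kallsyms_offsets)=ffffffff82000000
SYMBOL(kallsyms_relative_base)=ffffffff820fe000
"""
```

Note that a kernel older than 6.0 would yield zeros for all of the kallsyms entries here, which is the failure case the docstring of `make_kallsyms_vmlinux_finder` refers to.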
3 changes: 3 additions & 0 deletions libdrgn/Makefile.am
@@ -66,6 +68,8 @@ libdrgnimpl_la_SOURCES = $(ARCH_DEFS_PYS:_defs.py=.c) \
helpers.h \
io.c \
io.h \
kallsyms.c \
kallsyms.h \
language.c \
language.h \
language_c.c \
@@ -157,6 +159,7 @@ _drgn_la_SOURCES = python/constants.c \
python/drgnpy.h \
python/error.c \
python/helpers.c \
python/kallsyms_finder.c \
python/language.c \
python/main.c \
python/object.c \