Contributor

@leroycep leroycep commented Nov 25, 2024

When reading std.debug.Dwarf fails, std.debug.SelfInfo will now try to load function names from symtab.

See related issue #18520

Example

//! main.zig
const std = @import("std");

noinline fn foo(x: u32) u32 {
    return x * x;
}

noinline fn bar() u32 {
    return foo(std.math.maxInt(u32));
}

pub fn main() !void {
    std.debug.print("{}", .{bar()});
}

> zig build-exe main.zig
> objcopy --strip-debug main main_strip-debug
> ls main main_strip-debug
╭───┬──────────────────┬──────┬─────────┬──────────────╮
│ # │       name       │ type │  size   │   modified   │
├───┼──────────────────┼──────┼─────────┼──────────────┤
│ 0 │ main             │ file │ 2.4 MiB │ 13 hours ago │
│ 1 │ main_strip-debug │ file │ 1.0 MiB │ 13 hours ago │
╰───┴──────────────────┴──────┴─────────┴──────────────╯
> try { ./main }
thread 2762936 panic: integer overflow
/home/geemili/code/zig-elf-symbol-debuginfo/debug/example-elfsymtab-backtrace/main.zig:4:14: 0x103ad8a in foo (main)
    return x * x;
             ^
/home/geemili/code/zig-elf-symbol-debuginfo/debug/example-elfsymtab-backtrace/main.zig:8:15: 0x103923d in bar (main)
    return foo(std.math.maxInt(u32));
              ^
/home/geemili/code/zig-elf-symbol-debuginfo/debug/example-elfsymtab-backtrace/main.zig:12:32: 0x103920c in main (main)
    std.debug.print("{}", .{bar()});
                               ^
/home/geemili/code/zig-elf-symbol-debuginfo/lib/std/start.zig:617:37: 0x10391e1 in posixCallMainAndExit (main)
            const result = root.main() catch |err| {
                                    ^
/home/geemili/code/zig-elf-symbol-debuginfo/lib/std/start.zig:248:5: 0x1038ddf in _start (main)
    asm volatile (switch (native_arch) {
    ^
???:?:?: 0x0 in ??? (???)
> try { ./main_strip-debug }
thread 2763662 panic: integer overflow
???:?:?: 0x103ad8a in main.foo (???)
???:?:?: 0x103923d in main.bar (???)
???:?:?: 0x103920c in main.main (???)
???:?:?: 0x10391e1 in start.posixCallMainAndExit (???)
???:?:?: 0x1038ddf in _start (???)
???:?:?: 0x0 in ??? (???)
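
To illustrate the fallback described at the top of this PR, here is a minimal sketch of mapping a program counter to a function name using symbol-table entries alone. It is not the code from this branch: `SymtabEntry`, `findSymbol`, and the addresses and sizes in `example_symtab` are hypothetical and made up for illustration, and a real implementation would read `Elf64_Sym` records and the string table from the `.symtab` section.

```zig
const std = @import("std");

/// Hypothetical, simplified stand-in for an ELF `.symtab` function entry.
const SymtabEntry = struct {
    name: []const u8,
    address: u64,
    size: u64,
};

/// Returns the entry whose [address, address + size) range contains `pc`,
/// or null if no entry matches. A linear scan keeps the sketch short.
fn findSymbol(entries: []const SymtabEntry, pc: u64) ?SymtabEntry {
    for (entries) |entry| {
        if (pc >= entry.address and pc - entry.address < entry.size) return entry;
    }
    return null;
}

// Made-up addresses and sizes, for illustration only.
const example_symtab = [_]SymtabEntry{
    .{ .name = "main.foo", .address = 0x1000, .size = 0x40 },
    .{ .name = "main.bar", .address = 0x1040, .size = 0x20 },
    .{ .name = "main.main", .address = 0x1060, .size = 0x30 },
};

pub fn main() void {
    const pc: u64 = 0x1052;
    // With no DWARF info, only the name is known; file, line, and compile
    // unit stay as placeholders, as in the stripped trace above.
    const name = if (findSymbol(&example_symtab, pc)) |entry| entry.name else "???";
    std.debug.print("???:?:?: 0x{x} in {s} (???)\n", .{ pc, name });
}
```

With the made-up table above, `0x1052` falls inside the range assigned to `main.bar`, so the printed line has the same shape as the stripped-binary trace.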

Further work

  • add debug format stack trace checks that check that symbol based stack traces are correct.
  • make stack traces work when .eh_frame is included but other Dwarf debuginfo is missing.
  • test symbol based stack traces in release optimize modes
  • [ ] Make std.debug.ElfSymTab read from the .dynsym section as well? .dynsym is meant to be a subset of .symtab; for now, debug.ElfSymTab assumes .symtab is intentionally left in.

These changes have been moved out of this branch and into their own fdebuginfo branch:

  • [x] change -fstrip to allow -fstrip=debuginfo, which would remove the DWARF debug info but retain the ELF symbol table
    • [x] make similar changes for the std.Build API.symtab, so I think this would fit better into an issue about graceful degradation of stack traces.
  • [x] remove -fstrip, -gdwarf32, -gdwarf64; replace with -fdebuginfo=none, -fdebuginfo=symbols, -fdebuginfo=dwarf32, etc.?
  • [ ] less noisy stack trace format? Makes more sense as a separate PR based on "std: add std.options.debug_stacktrace_mode" (#19650)
  • [ ] check that -fstrip=debuginfo works on macOS and Windows? Move into the fdebuginfo PR once that exists
This work is licensed on the same terms as the Zig project.

Copyright © 2024 TigerBeetle, Inc.

This code was written under contract for TigerBeetle. As a work made for hire, authorship and copyright goes to TigerBeetle.

Author certificate

LeRoyce Pearson <opensource@geemili.xyz> [TigerBeetle, Inc.]

This work is licensed on the same terms as this project (Zig).

@leroycep leroycep force-pushed the elf-symbol-debuginfo branch 2 times, most recently from df2397a to 3a34582 Compare November 25, 2024 23:24
@xdBronch
Contributor

../release/bin/zig build-exe main.zig -Doptimize=ReleaseSafe -fno-strip

-Doptimize is only used by the build system; in that build-exe command it just defines a C macro. You want to use -OReleaseSafe.
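
For context, here is a rough sketch of where `-Doptimize` is actually consumed: a `build.zig` script run via `zig build`. This assumes the Zig 0.13-era `std.Build` API (field names have changed between releases), and the executable name and source path are placeholders.

```zig
// build.zig (sketch; assumes the Zig 0.13-era std.Build API)
const std = @import("std");

pub fn build(b: *std.Build) void {
    const target = b.standardTargetOptions(.{});
    // `zig build -Doptimize=ReleaseSafe` is read here, not by `zig build-exe`.
    const optimize = b.standardOptimizeOption(.{});

    const exe = b.addExecutable(.{
        .name = "main", // placeholder name
        .root_source_file = b.path("main.zig"), // placeholder path
        .target = target,
        .optimize = optimize,
        .strip = false, // roughly what -fno-strip does for build-exe
    });
    b.installArtifact(exe);
}
```

When invoking `zig build-exe` directly there is no build script to interpret `-D` options, which is why `-OReleaseSafe` is the flag to use there.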

@leroycep
Contributor Author

Oh, good catch. I had to add -fno-omit-frame-pointer as well. Here are the updated commands:

~/code/zig/build-example-elfsymtab-backtrace> ../release/bin/zig build-exe main.zig -OReleaseSafe -fno-strip -fno-omit-frame-pointer
~/code/zig/build-example-elfsymtab-backtrace> objcopy --strip-debug main main_strip-debug
~/code/zig/build-example-elfsymtab-backtrace> ./main
thread 772168 panic: integer overflow
Unwind error at address `exe:0x101a803` (error.MissingFDE), trace may be incomplete

/home/geemili/code/zig/build-example-elfsymtab-backtrace/main.zig:4:14: 0x100be87 in foo (main)
    return x * x;
             ^
/home/geemili/code/zig/build-example-elfsymtab-backtrace/main.zig:8:15: 0x100bc98 in bar (main)
    return foo(std.math.maxInt(u32));
              ^
/home/geemili/code/zig/build-example-elfsymtab-backtrace/main.zig:12:32: 0x100bc88 in main (main)
    std.debug.print("{}", .{bar()});
                               ^
/home/geemili/code/zig/lib/std/start.zig:617:37: 0x100bbc2 in posixCallMainAndExit (main)
            const result = root.main() catch |err| {
                                    ^
/home/geemili/code/zig/lib/std/start.zig:248:5: 0x100b89d in _start (main)
    asm volatile (switch (native_arch) {
    ^
Error: nu::shell::core_dumped

  × External command core dumped
   ╭─[entry #148:1:1]
 1 │ ./main
   · ───┬──
   ·    ╰── core dumped with SIGABRT (6)
   ╰────
~/code/zig/build-example-elfsymtab-backtrace> ./main_strip-debug
thread 772213 panic: integer overflow
Unwind information for `exe:0x101a803` was not available, trace may be incomplete

???:?:?: 0x100be87 in main.foo (???)
???:?:?: 0x100bc98 in main.bar (???)
???:?:?: 0x100bc88 in main.main (???)
???:?:?: 0x100bbc2 in start.posixCallMainAndExit (???)
???:?:?: 0x100b89d in _start (???)
Error: nu::shell::core_dumped

  × External command core dumped
   ╭─[entry #149:1:1]
 1 │ ./main_strip-debug
   · ─────────┬────────
   ·          ╰── core dumped with SIGABRT (6)
   ╰────

@xdBronch
Contributor

not sure if this would cause too much duplicated code, but thoughts on omitting those ???s when the symbol table is being used? it's quite noisy imo

@leroycep
Contributor Author

leroycep commented Nov 26, 2024

The minimum change would be here:

return .{
    .name = symbol.name,
};

Setting the compile_unit_name to "" and source_location to .invalid:

zig/lib/std/debug.zig

Lines 44 to 48 in 3a34582

pub const Symbol = struct {
    name: []const u8 = "???",
    compile_unit_name: []const u8 = "???",
    source_location: ?SourceLocation = null,
};

And then it would produce something like:

:0:0: 0x100b89d in _start ()

Further changes would require some more thought
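
As one possible direction for those further changes, here is a minimal sketch of a formatter that drops the placeholder fields entirely when only a symbol name is known. This is not what the PR implements: `formatFrame` is a hypothetical helper, and the `Symbol` and `SourceLocation` types below are simplified local copies rather than the real `std.debug` definitions.

```zig
const std = @import("std");

// Simplified, hypothetical copies of the std.debug types quoted above.
const SourceLocation = struct {
    file_name: []const u8,
    line: u64,
    column: u64,
};

const Symbol = struct {
    name: []const u8 = "???",
    compile_unit_name: []const u8 = "???",
    source_location: ?SourceLocation = null,
};

/// Hypothetical helper: formats one stack frame into `buf`. When no source
/// location is available (symbol-table fallback), the file, line, column,
/// and compile unit placeholders are omitted instead of printed as "???".
fn formatFrame(buf: []u8, address: usize, symbol: Symbol) ![]u8 {
    if (symbol.source_location) |loc| {
        return std.fmt.bufPrint(buf, "{s}:{d}:{d}: 0x{x} in {s} ({s})", .{
            loc.file_name, loc.line,    loc.column,
            address,       symbol.name, symbol.compile_unit_name,
        });
    }
    return std.fmt.bufPrint(buf, "0x{x} in {s}", .{ address, symbol.name });
}

pub fn main() !void {
    var buf: [256]u8 = undefined;
    const line = try formatFrame(&buf, 0x100b89d, .{ .name = "_start" });
    std.debug.print("{s}\n", .{line});
}
```

Branching on `source_location` avoids having to encode "missing" as an empty string or an invalid location, at the cost of a second format string.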

@matklad
Contributor

matklad commented Nov 26, 2024

Are there any existing tests that could be extended to cover the new behavior?

@leroycep leroycep force-pushed the elf-symbol-debuginfo branch from 3a34582 to 958bf92 Compare November 26, 2024 20:32
@leroycep
Contributor Author

leroycep commented Nov 27, 2024

  • [ ] add tests to check for graceful degradation when debuginfo is missing
  • [ ] add some test that would catch the mistake I made which broke stack unwinding in release modes
  • [ ] make stack traces work when other Dwarf debuginfo is missing

On that last point, objcopy --strip-debug will leave in the eh_frame sections, but std.debug will not make use of them:

~/code/zig/build-example-elfsymtab-backtrace> try { ./main_strip-debug }
thread 98264 panic: integer overflow
Unwind information for `exe:0x101b443` was not available, trace may be incomplete

~/code/zig/build-example-elfsymtab-backtrace> readelf -S main_strip-debug | rg eh
  [ 2] .eh_frame_hdr     PROGBITS         0000000001007da4  00007da4
  [ 3] .eh_frame         PROGBITS         0000000001008448  00008448

Of course, that could be pushed off to another PR, as you can work around this by using -fno-omit-frame-pointer:

~/code/zig/build-example-elfsymtab-backtrace> ../release/bin/zig build-exe main.zig -OReleaseSafe -fno-strip -fno-omit-frame-pointer
~/code/zig/build-example-elfsymtab-backtrace> objcopy --strip-debug main main_strip-debug
~/code/zig/build-example-elfsymtab-backtrace> try { ./main_strip-debug }
thread 99117 panic: integer overflow
Unwind information for `exe:0x101a803` was not available, trace may be incomplete

???:?:?: 0x100be87 in main.foo (???)
???:?:?: 0x100bc98 in main.bar (???)
???:?:?: 0x100bc88 in main.main (???)
???:?:?: 0x100bbc2 in start.posixCallMainAndExit (???)
???:?:?: 0x100b89d in _start (???)

@leroycep
Contributor Author

leroycep commented Dec 6, 2024

I think this pull request is ready for review. Current state:

  • Adds support for falling back to the ELF symbol table when generating stack traces
  • Adds a new type of test case, DebugFormatStackTrace, which is a modified version of the StackTrace test

Some caveats:

@leroycep leroycep force-pushed the elf-symbol-debuginfo branch 3 times, most recently from f03fead to d97c614 Compare December 8, 2024 21:24
@leroycep
Contributor Author

leroycep commented Dec 8, 2024

I removed the commit that skipped testing on aarch64, as commit e62aac3 fixes it by making -fno-omit-frame-pointer the default.

@leroycep
Contributor Author

leroycep commented Dec 9, 2024

Now that CI has passed, my plan is to work on a draft PR for -fdebuginfo that builds on top of this PR while I'm waiting for a review.

@leroycep leroycep mentioned this pull request Dec 9, 2024
@leroycep leroycep force-pushed the elf-symbol-debuginfo branch from d97c614 to f497573 Compare December 16, 2024 20:17
@leroycep leroycep force-pushed the elf-symbol-debuginfo branch 4 times, most recently from 2b6c55e to c5a4dde Compare January 23, 2025 19:12
@leroycep leroycep force-pushed the elf-symbol-debuginfo branch from 4521fdf to bec2cc0 Compare February 25, 2025 22:15
@leroycep leroycep force-pushed the elf-symbol-debuginfo branch from bec2cc0 to 53838df Compare February 26, 2025 18:44
When reading `std.debug.Dwarf` fails, `std.debug.SelfInfo` will now try to load
function names from `symtab`.
This makes it possible for executables built with `-fstrip=debuginfo` to unwind
frames. No need to specify `-fno-omit-frame-pointer`! This is only possible
because the `eh_frame` and `eh_frame_hdr` sections are not stripped when
`-fstrip=debuginfo` is used.

Also, remove check in MachO code for `error.RequiresDWARFUnwind`. That
error was removed[1] way back in 2023, but wasn't cleaned up properly.

[1]: 97bda56
@leroycep leroycep force-pushed the elf-symbol-debuginfo branch from 53838df to de70e81 Compare February 27, 2025 00:36
@Khitiara

what's the status of this PR? considering pulling it into my osdev project's std fork/branch to use there and test, if it's at least mostly ready

@leroycep
Contributor Author

I haven't tested the backtrace code against freestanding targets. This code was developed for x86_64 Linux, so other targets may not work. I also haven't checked in a while whether the code still works on the master branch.

As I recall this PR only touched the standard library, so you shouldn't need to compile Zig from scratch to use it.

@Khitiara

well, I guess I'll test it on freestanding then. I've got a sparse checkout of just the std for my project that I use for osdev-specific changes and for getting fixes/features from PRs early

@Khitiara

the functionality of this PR is working fine for me on a freestanding target (once the hurdle of actually getting debug info loading on freestanding was handled)

@leroycep
Contributor Author

That's good to hear! Glad it worked 😄 .

@mlugg
Member

mlugg commented Sep 9, 2025

Thanks for this work, and sorry it's been left sitting for so long.

I'm currently working on a huge diff to std.debug which pretty much rewrites SelfInfo and has some major refactors in std.debug itself, std.debug.Dwarf, etc. That branch is nearing completion, and my next step is to integrate the work from this PR into it (I'll make sure to retain your authorship information!), so my PR will eventually supersede this one. I'll let you know if I have any questions, though what I would typically have left here as a PR review I'll probably just address myself when porting the work.

Thanks again! I'll let you know if I need any help, but hopefully it shouldn't be too bad---the change you're making here is a pretty obvious one now that I've been living inside std.debug for a week, and in fact the Mach-O implementation of SelfInfo already defers to symbol names(!).

mlugg added a commit to mlugg/zig that referenced this pull request Sep 12, 2025
This abstraction isn't really tied to DWARF at all! Really, we're just
loading some information from an ELF file which is useful for debugging.
That *includes* DWARF, but it also includes other information. For
instance, the other change here:

Now, if DWARF information is missing, `debug.SelfInfo.ElfModule` will
name symbols by finding a matching symtab entry. We actually already do
this on Mach-O, so it makes obvious sense to do the same on ELF! This
change is what motivated the restructuring to begin with.

The symtab work is derived from ziglang#22077.

Co-authored-by: geemili <opensource@geemili.xyz>
@mlugg
Member

mlugg commented Sep 13, 2025

Superseded by #25227. While it was far from a direct cherry-pick, this PR was very helpful in my implementation there, so thank you!

@mlugg mlugg closed this Sep 13, 2025