Description
Background: when linking macOS binaries with chained fixups, we need to transform initializers stored in __mod_init_func (an array of pointers rebased through the usual means at runtime) to __init_offsets (an array of 32-bit offsets to initializers).
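To make the transformation concrete, here is a minimal sketch of the arithmetic involved, assuming the offsets in __init_offsets are measured from the image base. The function name toInitOffsets and its signature are hypothetical and are not LLD's actual code; in the real linker this happens while the relocations of the input section are processed.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical helper (not part of LLD): turns the absolute initializer
// addresses that __mod_init_func would hold into the 32-bit offsets that
// __init_offsets stores, assuming offsets are relative to imageBase.
// Returns std::nullopt if an address cannot be encoded in 32 bits.
std::optional<std::vector<uint32_t>>
toInitOffsets(const std::vector<uint64_t> &initializerAddrs,
              uint64_t imageBase) {
  std::vector<uint32_t> offsets;
  offsets.reserve(initializerAddrs.size());
  for (uint64_t addr : initializerAddrs) {
    // An initializer below the image base or more than 4 GiB above it
    // has no valid 32-bit encoding.
    if (addr < imageBase || addr - imageBase > UINT32_MAX)
      return std::nullopt;
    offsets.push_back(static_cast<uint32_t>(addr - imageBase));
  }
  return offsets;
}
```

Note that the output array has a different element size than the input, which is why a symbol that pointed into the old pointer array no longer has a meaningful location.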
Normally, we only need to care about the relocations in the input __mod_init_func sections. A problem arises when there are also symbols defined inside them. We currently ignore these symbols in LLD -- that is, we don't add them to the symbol table (since the location they point to doesn't exist anymore). This doesn't happen in regular binaries created by Clang, swiftc, rustc, etc., but there have been a few instances where it led to crashes:
- In #94716, we see a Go-generated binary (the repro file is broken and doesn't include a bunch of Swift stuff from the SDK -- TODO!). Here, the symbol (__rt0_arm64_ios_lib.ptr) is defined inside __mod_init_func as a non-exported symbol; we crash when trying to add it to the symbol table (it has no corresponding output section, so we can't set n_sect).
  $ lipo -thin arm64 Users/snqu/Desktop/chromium/chromium/src/ios/chrome/vpn_extension/Tun2socks.framework/Tun2socks -o Tun2socks
  $ $(brew --prefix llvm)/bin/llvm-ar x Tun2socks go.o
  $ nm go.o | grep __rt0_arm64_ios_lib\.ptr
  0000000001af3a60 s __rt0_arm64_ios_lib.ptr
  Backtrace excerpt:
  * thread #6, stop reason = EXC_BAD_ACCESS (code=1, address=0x34)
    * frame #0: 0x00000001002a0894 ld64.lld`SymtabSectionImpl<lld::macho::LP64>::writeTo(this=<unavailable>, buf=<unavailable>) const at SyntheticSections.cpp:1415:50 [opt]
- This Chromium bug is related to the curl Rust crate, which deliberately defines a symbol among the initializers, apparently to sidestep an old linker/compiler dead-stripping issue. Here, the symbol (__RNvCsiLjxBhyzEAX_4curl9INIT_CTOR; curl::INIT_CTOR) is externally visible; we crash when we try to query its address while adding it to the exports trie.
  Backtrace excerpt:
    frame #3: 0x000000019cf14d20 libsystem_c.dylib`__assert_rtn + 284
    frame #4: 0x00000001003053e8 ld64.lld`lld::macho::Defined::getVA() const (.cold.2) at Symbols.cpp:97:5 [opt]
  * frame #5: 0x0000000100289ee0 ld64.lld`lld::macho::Defined::getVA(this=0x000000014b01ca58) const at Symbols.cpp:97:5 [opt]
    frame #6: 0x0000000100255fe4 ld64.lld`lld::macho::TrieBuilder::sortAndBuild(llvm::MutableArrayRef<lld::macho::Symbol const*>, lld::macho::TrieNode*, unsigned long, unsigned long) [inlined] (anonymous namespace)::ExportInfo::ExportInfo(this=<unavailable>, sym=0x000000014b01ca58, imageBase=4294967296) at ExportTrie.cpp:64:21 [opt]
  $ ar x Users/hwennborg/chromium/src/third_party/rust-src/build/x86_64-apple-darwin/stage2-tools/x86_64-apple-darwin/release/deps/libcurl-f5fa2775c6309a20.rlib curl-f5fa2775c6309a20.curl.da8bc63371d5233d-cgu.09.rcgu.o
  $ nm curl-f5fa2775c6309a20.curl.da8bc63371d5233d-cgu.09.rcgu.o -s __DATA __mod_init_func
  0000000000000d40 S __RNvCsiLjxBhyzEAX_4curl9INIT_CTOR
Fix ideas
- Completely remove __mod_init_func from the list of input sections
  I thought my original patch would have this effect: we do not create an OutputSection for it and don't even include it in the global inputSections list.
  In reality, this is not enough; the symbols are still added to the symbol table (as __mod_init_func is present in the symbols array, ObjFile::parseSymbols will reach it).
  (+) If we encounter a Defined symbol during the program's execution, we know for sure that it has an address; there is no need to check for a poison flag.
  (-) Sounds a bit hackish; currently there is a one-to-one correspondence between ObjFile::sections and the input file's contents.
- Create a "poisoned" state for Defined symbols
  (+) Least amount of modification to the existing code.
  (+) There will still be an entry (though poisoned) in the symbol table, so we'll be able to emit useful warnings if someone actually refers to the symbol.
  NOTE: this is basically the current workaround I ended up going for, except that I use the isLive() mechanism from dead-stripping.
- Some other idea?
  - Maybe just add a few extra checks to the known crashy places to see if the symbols refer to a deleted section? Sounds very fragile.
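The "poisoned" idea can be illustrated with a toy model. This is only a sketch: the Defined and Reloc structs, the poisoned field, and the emittable and checkRelocs functions below are all hypothetical and only loosely mirror LLD's real classes (the actual workaround reuses isLive() rather than a dedicated flag).

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Toy stand-in for a defined symbol; not LLD's lld::macho::Defined.
struct Defined {
  std::string name;
  uint64_t value;  // offset of the symbol within its section
  bool external;   // would the symbol go into the exports trie?
  bool poisoned;   // true if the containing input section was dropped

  // Mirrors the kind of assertion that fires in the second crash:
  // a poisoned symbol has no address to report.
  uint64_t getVA(uint64_t sectionAddr) const {
    assert(!poisoned && "poisoned symbols have no address");
    return sectionAddr + value;
  }
};

// Toy relocation that only records which symbol it targets.
struct Reloc {
  const Defined *target;
};

// Symbols that are safe to write to the symbol table or exports trie.
std::vector<const Defined *> emittable(const std::vector<Defined> &syms) {
  std::vector<const Defined *> out;
  for (const Defined &s : syms)
    if (!s.poisoned)
      out.push_back(&s);
  return out;
}

// Collects one diagnostic per relocation that targets a poisoned symbol.
std::vector<std::string> checkRelocs(const std::vector<Reloc> &relocs) {
  std::vector<std::string> warnings;
  for (const Reloc &r : relocs)
    if (r.target->poisoned)
      warnings.push_back("relocation refers to symbol '" + r.target->name +
                         "' in a discarded __mod_init_func section");
  return warnings;
}
```

Because the poisoned entry still exists, the relocation check has a name to report, which is the advantage listed for this approach.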
The workarounds mentioned above only work if the symbols are not actually referenced in relocations. If they are, we get different (but equally undesirable) behaviors with other linkers, as the following test case shows.
# test.s
.globl _main
.text
_main:
leaq _init_slot(%rip), %rax
.section __DATA,__mod_init_func,mod_init_funcs
_init_slot:
.quad _main
ld64 crash
❯ clang test.s -Wl,-ld_classic
ld: warning: alignment (1) of atom '_init_slot' is too small and may result in unaligned pointers
0 0x10659e807 __assert_rtn + 137
1 0x1065a79e3 ld::tool::OutputFile::addressOf(ld::Internal const&, ld::Fixup const*, ld::Atom const**) (.cold.1) + 35
2 0x1063fc5e4 ld::tool::OutputFile::addressOf(ld::Internal const&, ld::Fixup const*, ld::Atom const**) + 116
3 0x1063fd087 ld::tool::OutputFile::applyFixUps(ld::Internal&, unsigned long long, ld::Atom const*, unsigned char*) + 599
4 0x106405818 ___ZN2ld4tool10OutputFile10writeAtomsERNS_8InternalEPh_block_invoke + 504
5 0x7ff815881def _dispatch_client_callout2 + 8
6 0x7ff815893547 _dispatch_apply_invoke3 + 431
7 0x7ff815881dbc _dispatch_client_callout + 8
8 0x7ff81588304e _dispatch_once_callout + 20
9 0x7ff815892740 _dispatch_apply_invoke + 184
10 0x7ff815881dbc _dispatch_client_callout + 8
11 0x7ff8158912ca _dispatch_root_queue_drain + 871
12 0x7ff81589184f _dispatch_worker_thread2 + 152
13 0x7ff815a1fb43 _pthread_wqthread + 262
A linker snapshot was created at:
/tmp/a.out-2024-06-25-091629.ld-snapshot
ld: Assertion failed: (_mode == modeFinalAddress), function finalAddress, file ld.hpp, line 1413.
ld_prime broken (?) binary
❯ clang test.s -Wl,-ld_new
❯ objdump -d a.out
a.out: file format mach-o 64-bit x86-64
Disassembly of section __TEXT,__text:
0000000100000f9d <_main>:
100000f9d: 48 8d 05 5c f0 ff ff leaq -4004(%rip), %rax ## 0x100000000
# Relocation refers to the beginning of the file, `__mh_execute_header`???
Additional questions
Other similar transformations have been added recently: ObjC relative method lists (etc.?). Could we theoretically encounter a similar scenario there?