@mysterymath
Contributor

@mysterymath mysterymath commented Oct 24, 2025

This patch ensures that:

  1. New bitcode is not extracted for libfuncs after LTO occurs, and
  2. Extracted bitcode for libfuncs is considered external, since new
    calls to it may be emitted.

This is the patch referenced in @ilovepi's and my talk at the last LLVM devmeeting: "LT-Uh-Oh"
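For context on "new calls to it may be emitted": the lld test in this PR exercises a memcmp-to-bcmp rewrite. That rewrite only makes sense when the memcmp result is compared against zero, because bcmp reports equality and nothing about ordering. Below is a minimal self-contained sketch of that equivalence. `refBcmp`, `equalViaMemcmp`, and `equalViaBcmp` are hypothetical names for illustration, not libc or LLVM functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for bcmp: returns 0 when the first n bytes are equal
// and a nonzero value otherwise. It carries no ordering information.
int refBcmp(const void *a, const void *b, std::size_t n) {
  assert((n == 0 || (a && b)) && "non-null buffers required when n > 0");
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  for (std::size_t i = 0; i < n; ++i)
    if (pa[i] != pb[i])
      return 1;
  return 0;
}

// Equality-only use of memcmp, like `icmp eq i32 %cmp, 0` in the test's main.ll.
bool equalViaMemcmp(const void *a, const void *b, std::size_t n) {
  return std::memcmp(a, b, n) == 0;
}

// The same predicate written against the bcmp-style helper.
bool equalViaBcmp(const void *a, const void *b, std::size_t n) {
  return refBcmp(a, b, n) == 0;
}
```

The two predicates agree for every input, so an optimizer may swap one callee for the other. The link only succeeds afterwards if a definition of the new callee is actually available, which is the situation this patch addresses.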

@github-actions

github-actions bot commented Oct 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mysterymath mysterymath force-pushed the builtins-world-full branch 2 times, most recently from ca68c17 to 33aa80b Compare November 13, 2025 23:02
@mysterymath mysterymath force-pushed the builtins-world-full branch 3 times, most recently from 071e2cd to 4c644e8 Compare December 9, 2025 00:51
@mysterymath mysterymath changed the title from [DRAFT] "Builtins world" for LTO to [LTO][LLD] Prevent invalid LTO libfunc transforms, Dec 9, 2025
@github-actions

github-actions bot commented Dec 9, 2025

🐧 Linux x64 Test Results

  • 194004 tests passed
  • 6290 tests skipped

✅ The build succeeded and all tests passed.

@github-actions

github-actions bot commented Dec 9, 2025

🪟 Windows x64 Test Results

  • 130135 tests passed
  • 4020 tests skipped

✅ The build succeeded and all tests passed.

@mysterymath mysterymath marked this pull request as ready for review December 9, 2025 20:15
@mysterymath mysterymath requested a review from arsenm December 9, 2025 20:15
@llvmbot llvmbot added labels Dec 9, 2025: lld; clang:codegen (IR generation bugs: mangling, exceptions, etc.); lld:ELF; LTO (Link time optimization (regular/full LTO or ThinLTO)); llvm:binary-utilities
@llvmbot
Member

llvmbot commented Dec 9, 2025

@llvm/pr-subscribers-lld
@llvm/pr-subscribers-lld-elf

@llvm/pr-subscribers-lto

Author: Daniel Thornburgh (mysterymath)

Changes

This patch ensures that:

  1. New bitcode is not extracted for libfuncs after LTO occurs, and
  2. Extracted bitcode for libfuncs is considered external, since new
    calls to it may be emitted.

This is the patch referenced in @ilovepi's and my talk at the last LLVM devmeeting.
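One way to read the two rules, together with the doc comment on `setBitcodeLibFuncs` in this patch's LTO.h changes, is as a partition of the bitcode-implemented libfuncs by whether their definition was extracted into the link before LTO ran. The sketch below is only a model of that reading. `LibFuncPolicy` and `classifyBitcodeLibFuncs` are hypothetical names, and the backend changes that actually enforce the rules are in the truncated part of the diff.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of the two rules in the description; none of these
// names come from the patch.
struct LibFuncPolicy {
  // Rule 1: defined only in bitcode that was never extracted, so LTO must not
  // introduce a reference (it would force bitcode extraction after LTO).
  std::vector<std::string> doNotReference;
  // Rule 2: defined in extracted bitcode, so the definition must stay
  // externally visible because LTO may still emit new calls to it.
  std::vector<std::string> keepExternal;
};

LibFuncPolicy
classifyBitcodeLibFuncs(const std::vector<std::string> &bitcodeLibFuncs,
                        const std::set<std::string> &extractedDefs) {
  LibFuncPolicy policy;
  for (const std::string &name : bitcodeLibFuncs) {
    assert(!name.empty() && "libfunc names are expected to be non-empty");
    if (extractedDefs.count(name))
      policy.keepExternal.push_back(name);
    else
      policy.doNotReference.push_back(name);
  }
  return policy;
}
```

For the lld test in this PR, `bcmp` would land in `doNotReference`: it lives in a bitcode archive member that nothing pulled into the link, so the memcmp call in main.ll has to stay a memcmp call.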


Patch is 31.35 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/164916.diff

16 Files Affected:

  • (modified) clang/lib/CodeGen/BackendUtil.cpp (+5-5)
  • (modified) lld/ELF/Driver.cpp (+17-2)
  • (modified) lld/ELF/LTO.cpp (+4-1)
  • (modified) lld/ELF/LTO.h (+2-1)
  • (added) lld/test/ELF/lto/libcall-archive-bitcode.test (+41)
  • (modified) llvm/include/llvm/LTO/LTO.h (+21-3)
  • (modified) llvm/include/llvm/LTO/LTOBackend.h (+5-2)
  • (modified) llvm/lib/LTO/LTO.cpp (+44-18)
  • (modified) llvm/lib/LTO/LTOBackend.cpp (+36-7)
  • (modified) llvm/lib/LTO/LTOCodeGenerator.cpp (+2-2)
  • (modified) llvm/lib/Object/CMakeLists.txt (+1)
  • (modified) llvm/lib/Object/IRSymtab.cpp (+7-1)
  • (added) llvm/test/LTO/Resolution/X86/libcall-external-bitcode.ll (+20)
  • (added) llvm/test/LTO/Resolution/X86/libcall-external-not-bitcode.ll (+20)
  • (added) llvm/test/LTO/Resolution/X86/libcall-in-tu.ll (+34)
  • (modified) llvm/tools/llvm-lto2/llvm-lto2.cpp (+7)
diff --git a/clang/lib/CodeGen/BackendUtil.cpp b/clang/lib/CodeGen/BackendUtil.cpp
index 5590d217e96ff..8382ae0873adb 100644
--- a/clang/lib/CodeGen/BackendUtil.cpp
+++ b/clang/lib/CodeGen/BackendUtil.cpp
@@ -1451,11 +1451,11 @@ runThinLTOBackend(CompilerInstance &CI, ModuleSummaryIndex *CombinedIndex,
   // FIXME: Both ExecuteAction and thinBackend set up optimization remarks for
   // the same context.
   finalizeLLVMOptimizationRemarks(M->getContext());
-  if (Error E =
-          thinBackend(Conf, -1, AddStream, *M, *CombinedIndex, ImportList,
-                      ModuleToDefinedGVSummaries[M->getModuleIdentifier()],
-                      /*ModuleMap=*/nullptr, Conf.CodeGenOnly,
-                      /*IRAddStream=*/nullptr, CGOpts.CmdArgs)) {
+  if (Error E = thinBackend(
+          Conf, -1, AddStream, *M, *CombinedIndex, ImportList,
+          ModuleToDefinedGVSummaries[M->getModuleIdentifier()],
+          /*ModuleMap=*/nullptr, Conf.CodeGenOnly, /*BitcodeLibFuncs=*/{},
+          /*IRAddStream=*/nullptr, CGOpts.CmdArgs)) {
     handleAllErrors(std::move(E), [&](ErrorInfoBase &EIB) {
       errs() << "Error running ThinLTO backend: " << EIB.message() << '\n';
     });
diff --git a/lld/ELF/Driver.cpp b/lld/ELF/Driver.cpp
index 8647752be31fe..b0834e6c26b7a 100644
--- a/lld/ELF/Driver.cpp
+++ b/lld/ELF/Driver.cpp
@@ -2701,15 +2701,30 @@ static void markBuffersAsDontNeed(Ctx &ctx, bool skipLinkedOutput) {
 template <class ELFT>
 void LinkerDriver::compileBitcodeFiles(bool skipLinkedOutput) {
   llvm::TimeTraceScope timeScope("LTO");
+  // Capture the triple before moving the bitcode into the bitcode compiler.
+  std::optional<llvm::Triple> tt;
+  if (!ctx.bitcodeFiles.empty())
+    tt = llvm::Triple(ctx.bitcodeFiles.front()->obj->getTargetTriple());
   // Compile bitcode files and replace bitcode symbols.
   lto.reset(new BitcodeCompiler(ctx));
   for (BitcodeFile *file : ctx.bitcodeFiles)
     lto->add(*file);
 
-  if (!ctx.bitcodeFiles.empty())
+  llvm::BumpPtrAllocator alloc;
+  llvm::StringSaver saver(alloc);
+  SmallVector<StringRef> bitcodeLibFuncs;
+  if (!ctx.bitcodeFiles.empty()) {
     markBuffersAsDontNeed(ctx, skipLinkedOutput);
+    for (StringRef libFunc : lto::LTO::getLibFuncSymbols(*tt, saver)) {
+      Symbol *sym = ctx.symtab->find(libFunc);
+      if (!sym)
+        continue;
+      if (isa<BitcodeFile>(sym->file))
+        bitcodeLibFuncs.push_back(libFunc);
+    }
+  }
 
-  ltoObjectFiles = lto->compile();
+  ltoObjectFiles = lto->compile(bitcodeLibFuncs);
   for (auto &file : ltoObjectFiles) {
     auto *obj = cast<ObjFile<ELFT>>(file.get());
     obj->parse(/*ignoreComdats=*/true);
diff --git a/lld/ELF/LTO.cpp b/lld/ELF/LTO.cpp
index 80c6d2482f9fa..839eed9956d3a 100644
--- a/lld/ELF/LTO.cpp
+++ b/lld/ELF/LTO.cpp
@@ -311,7 +311,10 @@ static void thinLTOCreateEmptyIndexFiles(Ctx &ctx) {
 
 // Merge all the bitcode files we have seen, codegen the result
 // and return the resulting ObjectFile(s).
-SmallVector<std::unique_ptr<InputFile>, 0> BitcodeCompiler::compile() {
+SmallVector<std::unique_ptr<InputFile>, 0>
+BitcodeCompiler::compile(const SmallVector<StringRef> &bitcodeLibFuncs) {
+  ltoObj->setBitcodeLibFuncs(bitcodeLibFuncs);
+
   unsigned maxTasks = ltoObj->getMaxTasks();
   buf.resize(maxTasks);
   files.resize(maxTasks);
diff --git a/lld/ELF/LTO.h b/lld/ELF/LTO.h
index acf3bcff7f2f1..8207e91460785 100644
--- a/lld/ELF/LTO.h
+++ b/lld/ELF/LTO.h
@@ -42,7 +42,8 @@ class BitcodeCompiler {
   ~BitcodeCompiler();
 
   void add(BitcodeFile &f);
-  SmallVector<std::unique_ptr<InputFile>, 0> compile();
+  SmallVector<std::unique_ptr<InputFile>, 0>
+  compile(const SmallVector<StringRef> &bitcodeLibFuncs);
 
 private:
   Ctx &ctx;
diff --git a/lld/test/ELF/lto/libcall-archive-bitcode.test b/lld/test/ELF/lto/libcall-archive-bitcode.test
new file mode 100644
index 0000000000000..20735b5c89c99
--- /dev/null
+++ b/lld/test/ELF/lto/libcall-archive-bitcode.test
@@ -0,0 +1,41 @@
+; REQUIRES: x86
+
+; RUN: rm -rf %t && split-file %s %t && cd %t
+; RUN: llvm-as main.ll -o main.o
+; RUN: llvm-as bcmp.ll -o bcmp.o
+; RUN: llvm-mc -filetype=obj -triple=x86_64-unknown-linux-gnu memcmp.s -o memcmp.o
+; RUN: llvm-ar rc libc.a bcmp.o memcmp.o
+
+;; Ensure that no memcmp->bcmp translation occurs during LTO because bcmp is in
+;; bitcode, but was not brought into the link. This would fail the link by
+;; extracting bitcode after LTO.
+; RUN: ld.lld -o out main.o -L. -lc
+; RUN: llvm-nm out | FileCheck %s
+
+;--- bcmp.ll
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define i32 @bcmp(ptr %0, ptr %1, i64 %2) {
+  ret i32 0
+}
+
+;--- memcmp.s
+.globl memcmp
+memcmp:
+  ret
+
+;--- main.ll
+target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-i128:128-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-unknown-linux-gnu"
+
+define i1 @_start(ptr %0, ptr %1, i64 %2) {
+  %cmp = call i32 @memcmp(ptr %0, ptr %1, i64 %2)
+  %eq = icmp eq i32 %cmp, 0
+  ret i1 %eq
+}
+
+; CHECK-NOT: bcmp
+; CHECK: memcmp
+declare i32 @memcmp(ptr, ptr, i64)
+
diff --git a/llvm/include/llvm/LTO/LTO.h b/llvm/include/llvm/LTO/LTO.h
index 3a4dc5a3dfcf8..3ee5d455774b3 100644
--- a/llvm/include/llvm/LTO/LTO.h
+++ b/llvm/include/llvm/LTO/LTO.h
@@ -264,7 +264,8 @@ class ThinBackendProc {
 using ThinBackendFunction = std::function<std::unique_ptr<ThinBackendProc>(
     const Config &C, ModuleSummaryIndex &CombinedIndex,
     const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-    AddStreamFn AddStream, FileCache Cache)>;
+    AddStreamFn AddStream, FileCache Cache,
+    const SmallVector<StringRef> &BitcodeLibFuncs)>;
 
 /// This type defines the behavior following the thin-link phase during ThinLTO.
 /// It encapsulates a backend function and a strategy for thread pool
@@ -279,10 +280,11 @@ struct ThinBackend {
   std::unique_ptr<ThinBackendProc> operator()(
       const Config &Conf, ModuleSummaryIndex &CombinedIndex,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-      AddStreamFn AddStream, FileCache Cache) {
+      AddStreamFn AddStream, FileCache Cache,
+      const SmallVector<StringRef> &BitcodeLibFuncs) {
     assert(isValid() && "Invalid backend function");
     return Func(Conf, CombinedIndex, ModuleToDefinedGVSummaries,
-                std::move(AddStream), std::move(Cache));
+                std::move(AddStream), std::move(Cache), BitcodeLibFuncs);
   }
   ThreadPoolStrategy getParallelism() const { return Parallelism; }
   bool isValid() const { return static_cast<bool>(Func); }
@@ -400,6 +402,12 @@ class LTO {
   LLVM_ABI Error add(std::unique_ptr<InputFile> Obj,
                      ArrayRef<SymbolResolution> Res);
 
+  /// Set the list of functions implemented in bitcode across the link, whether
+  /// extracted or not. Such functions may not be referenced if they were not
+  /// extracted by the time LTO occurs.
+  LLVM_ABI void
+  setBitcodeLibFuncs(const SmallVector<StringRef> &BitcodeLibFuncs);
+
   /// Returns an upper bound on the number of tasks that the client may expect.
   /// This may only be called after all IR object files have been added. For a
   /// full description of tasks see LTOBackend.h.
@@ -420,6 +428,14 @@ class LTO {
   LLVM_ABI static SmallVector<const char *>
   getRuntimeLibcallSymbols(const Triple &TT);
 
+  /// Static method that returns a list of library function symbols that can be
+  /// generated by LTO but might not be visible from bitcode symbol table.
+  /// Unlike the runtime libcalls, the linker can report to the code generator
+  /// which of these are actually available in the link, and the code generator
+  /// can then only reference that set of symbols.
+  LLVM_ABI static SmallVector<StringRef>
+  getLibFuncSymbols(const Triple &TT, llvm::StringSaver &Saver);
+
 private:
   Config Conf;
 
@@ -591,6 +607,8 @@ class LTO {
 
   // Diagnostic optimization remarks file
   LLVMRemarkFileHandle DiagnosticOutputFile;
+
+  SmallVector<StringRef> BitcodeLibFuncs;
 };
 
 /// The resolution for a symbol. The linker must provide a SymbolResolution for
diff --git a/llvm/include/llvm/LTO/LTOBackend.h b/llvm/include/llvm/LTO/LTOBackend.h
index 48ad5aa64f61f..6a7d7e0d87ac9 100644
--- a/llvm/include/llvm/LTO/LTOBackend.h
+++ b/llvm/include/llvm/LTO/LTOBackend.h
@@ -39,13 +39,15 @@ LLVM_ABI bool opt(const Config &Conf, TargetMachine *TM, unsigned Task,
                   Module &Mod, bool IsThinLTO,
                   ModuleSummaryIndex *ExportSummary,
                   const ModuleSummaryIndex *ImportSummary,
-                  const std::vector<uint8_t> &CmdArgs);
+                  const std::vector<uint8_t> &CmdArgs,
+                  const SmallVector<StringRef> &BitcodeLibFuncs);
 
 /// Runs a regular LTO backend. The regular LTO backend can also act as the
 /// regular LTO phase of ThinLTO, which may need to access the combined index.
 LLVM_ABI Error backend(const Config &C, AddStreamFn AddStream,
                        unsigned ParallelCodeGenParallelismLevel, Module &M,
-                       ModuleSummaryIndex &CombinedIndex);
+                       ModuleSummaryIndex &CombinedIndex,
+                       const SmallVector<StringRef> &BitcodeLibFuncs);
 
 /// Runs a ThinLTO backend.
 /// If \p ModuleMap is not nullptr, all the module files to be imported have
@@ -62,6 +64,7 @@ thinBackend(const Config &C, unsigned Task, AddStreamFn AddStream, Module &M,
             const FunctionImporter::ImportMapTy &ImportList,
             const GVSummaryMapTy &DefinedGlobals,
             MapVector<StringRef, BitcodeModule> *ModuleMap, bool CodeGenOnly,
+            const SmallVector<StringRef> &BitcodeLibFuncs,
             AddStreamFn IRAddStream = nullptr,
             const std::vector<uint8_t> &CmdArgs = std::vector<uint8_t>());
 
diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index a02af59600c44..97d3952b05d06 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -763,6 +763,10 @@ Error LTO::add(std::unique_ptr<InputFile> Input,
   return Error::success();
 }
 
+void LTO::setBitcodeLibFuncs(const SmallVector<StringRef> &BitcodeLibFuncs) {
+  this->BitcodeLibFuncs = BitcodeLibFuncs;
+}
+
 Expected<ArrayRef<SymbolResolution>>
 LTO::addModule(InputFile &Input, ArrayRef<SymbolResolution> InputRes,
                unsigned ModI, ArrayRef<SymbolResolution> Res) {
@@ -1385,9 +1389,9 @@ Error LTO::runRegularLTO(AddStreamFn AddStream) {
   }
 
   if (!RegularLTO.EmptyCombinedModule || Conf.AlwaysEmitRegularLTOObj) {
-    if (Error Err =
-            backend(Conf, AddStream, RegularLTO.ParallelCodeGenParallelismLevel,
-                    *RegularLTO.CombinedModule, ThinLTO.CombinedIndex))
+    if (Error Err = backend(
+            Conf, AddStream, RegularLTO.ParallelCodeGenParallelismLevel,
+            *RegularLTO.CombinedModule, ThinLTO.CombinedIndex, BitcodeLibFuncs))
       return Err;
   }
 
@@ -1407,6 +1411,21 @@ SmallVector<const char *> LTO::getRuntimeLibcallSymbols(const Triple &TT) {
   return LibcallSymbols;
 }
 
+SmallVector<StringRef> LTO::getLibFuncSymbols(const Triple &TT,
+                                              StringSaver &Saver) {
+  auto TLII = std::make_unique<TargetLibraryInfoImpl>(TT);
+  TargetLibraryInfo TLI(*TLII);
+  SmallVector<StringRef> LibFuncSymbols;
+  LibFuncSymbols.reserve(LibFunc::NumLibFuncs);
+  for (unsigned I = 0, E = static_cast<unsigned>(LibFunc::NumLibFuncs); I != E;
+       ++I) {
+    LibFunc F = static_cast<LibFunc>(I);
+    if (TLI.has(F))
+      LibFuncSymbols.push_back(Saver.save(TLI.getName(F)).data());
+  }
+  return LibFuncSymbols;
+}
+
 Error ThinBackendProc::emitFiles(
     const FunctionImporter::ImportMapTy &ImportList, llvm::StringRef ModulePath,
     const std::string &NewModulePath) const {
@@ -1484,6 +1503,7 @@ class CGThinBackend : public ThinBackendProc {
 class InProcessThinBackend : public CGThinBackend {
 protected:
   FileCache Cache;
+  const SmallVector<StringRef> &BitcodeLibFuncs;
 
 public:
   InProcessThinBackend(
@@ -1491,11 +1511,12 @@ class InProcessThinBackend : public CGThinBackend {
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
       AddStreamFn AddStream, FileCache Cache, lto::IndexWriteCallback OnWrite,
-      bool ShouldEmitIndexFiles, bool ShouldEmitImportsFiles)
+      bool ShouldEmitIndexFiles, bool ShouldEmitImportsFiles,
+      const SmallVector<StringRef> &BitcodeLibFuncs)
       : CGThinBackend(Conf, CombinedIndex, ModuleToDefinedGVSummaries,
                       AddStream, OnWrite, ShouldEmitIndexFiles,
                       ShouldEmitImportsFiles, ThinLTOParallelism),
-        Cache(std::move(Cache)) {}
+        Cache(std::move(Cache)), BitcodeLibFuncs(BitcodeLibFuncs) {}
 
   virtual Error runThinLTOBackendThread(
       AddStreamFn AddStream, FileCache Cache, unsigned Task, BitcodeModule BM,
@@ -1516,7 +1537,7 @@ class InProcessThinBackend : public CGThinBackend {
 
       return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         Conf.CodeGenOnly);
+                         Conf.CodeGenOnly, BitcodeLibFuncs);
     };
     if (ShouldEmitIndexFiles) {
       if (auto E = emitFiles(ImportList, ModuleID, ModuleID.str()))
@@ -1601,13 +1622,14 @@ class FirstRoundThinBackend : public InProcessThinBackend {
       const Config &Conf, ModuleSummaryIndex &CombinedIndex,
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-      AddStreamFn CGAddStream, FileCache CGCache, AddStreamFn IRAddStream,
+      AddStreamFn CGAddStream, FileCache CGCache,
+      const SmallVector<StringRef> &BitcodeLibFuncs, AddStreamFn IRAddStream,
       FileCache IRCache)
       : InProcessThinBackend(Conf, CombinedIndex, ThinLTOParallelism,
                              ModuleToDefinedGVSummaries, std::move(CGAddStream),
                              std::move(CGCache), /*OnWrite=*/nullptr,
                              /*ShouldEmitIndexFiles=*/false,
-                             /*ShouldEmitImportsFiles=*/false),
+                             /*ShouldEmitImportsFiles=*/false, BitcodeLibFuncs),
         IRAddStream(std::move(IRAddStream)), IRCache(std::move(IRCache)) {}
 
   Error runThinLTOBackendThread(
@@ -1630,7 +1652,7 @@ class FirstRoundThinBackend : public InProcessThinBackend {
 
       return thinBackend(Conf, Task, CGAddStream, **MOrErr, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         Conf.CodeGenOnly, IRAddStream);
+                         Conf.CodeGenOnly, BitcodeLibFuncs, IRAddStream);
     };
     // Like InProcessThinBackend, we produce index files as needed for
     // FirstRoundThinBackend. However, these files are not generated for
@@ -1697,6 +1719,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
       AddStreamFn AddStream, FileCache Cache,
+      const SmallVector<StringRef> &BitcodeLibFuncs,
       std::unique_ptr<SmallVector<StringRef>> IRFiles,
       stable_hash CombinedCGDataHash)
       : InProcessThinBackend(Conf, CombinedIndex, ThinLTOParallelism,
@@ -1704,7 +1727,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
                              std::move(Cache),
                              /*OnWrite=*/nullptr,
                              /*ShouldEmitIndexFiles=*/false,
-                             /*ShouldEmitImportsFiles=*/false),
+                             /*ShouldEmitImportsFiles=*/false, BitcodeLibFuncs),
         IRFiles(std::move(IRFiles)), CombinedCGDataHash(CombinedCGDataHash) {}
 
   Error runThinLTOBackendThread(
@@ -1725,7 +1748,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
 
       return thinBackend(Conf, Task, AddStream, *LoadedModule, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         /*CodeGenOnly=*/true);
+                         /*CodeGenOnly=*/true, BitcodeLibFuncs);
     };
     if (!Cache.isValid() || !CombinedIndex.modulePaths().count(ModuleID) ||
         all_of(CombinedIndex.getModuleHash(ModuleID),
@@ -1764,11 +1787,12 @@ ThinBackend lto::createInProcessThinBackend(ThreadPoolStrategy Parallelism,
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<InProcessThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             AddStream, Cache, OnWrite, ShouldEmitIndexFiles,
-            ShouldEmitImportsFiles);
+            ShouldEmitImportsFiles, BitcodeLibFuncs);
       };
   return ThinBackend(Func, Parallelism);
 }
@@ -1885,7 +1909,8 @@ ThinBackend lto::createWriteIndexesThinBackend(
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<WriteIndexesThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             OldPrefix, NewPrefix, NativeObjectPrefix, ShouldEmitImportsFiles,
@@ -2103,7 +2128,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   if (!CodeGenDataThinLTOTwoRounds) {
     std::unique_ptr<ThinBackendProc> BackendProc =
         ThinLTO.Backend(Conf, ThinLTO.CombinedIndex, ModuleToDefinedGVSummaries,
-                        AddStream, Cache);
+                        AddStream, Cache, BitcodeLibFuncs);
     return RunBackends(BackendProc.get());
   }
 
@@ -2126,7 +2151,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   LLVM_DEBUG(dbgs() << "[TwoRounds] Running the first round of codegen\n");
   auto FirstRoundLTO = std::make_unique<FirstRoundThinBackend>(
       Conf, ThinLTO.CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
-      CG.AddStream, CG.Cache, IR.AddStream, IR.Cache);
+      CG.AddStream, CG.Cache, BitcodeLibFuncs, IR.AddStream, IR.Cache);
   if (Error E = RunBackends(FirstRoundLTO.get()))
     return E;
 
@@ -2142,7 +2167,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   LLVM_DEBUG(dbgs() << "[TwoRounds] Running the second round of codegen\n");
   auto SecondRoundLTO = std::make_unique<SecondRoundThinBackend>(
       Conf, ThinLTO.CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
-      AddStream, Cache, IR.getResult(), CombinedHash);
+      AddStream, Cache, BitcodeLibFuncs, IR.getResult(), CombinedHash);
   return RunBackends(SecondRoundLTO.get());
 }
 
@@ -2620,7 +2645,8 @@ ThinBackend lto::createOutOfProcessThinBackend(
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<OutOfProcessThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             AddStream, Cache, OnWrite, ShouldEmitIndexFiles,
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 93118becedbac..bd91249db50ed 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -239,7 +239,8 @@ createTargetMachine(const Config &Conf, const Target *TheTarget, Module &M) {
 static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
                            unsigned OptLevel, bool IsThinLTO,
                            ModuleSummaryIndex *ExportSummary,
-                           const ModuleSummaryIndex *ImportSummary) {
+                           const ModuleSummaryIndex *ImportSummary,
+    ...
[truncated]
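The lld/ELF/Driver.cpp hunk above filters the target's libfunc names down to those whose current symbol-table entry comes from a bitcode file, and passes that list to `BitcodeCompiler::compile`. A self-contained sketch of the same filtering step follows, assuming a simplified symbol table. `FileKind`, `SymbolTable`, and `collectBitcodeLibFuncs` are hypothetical stand-ins, not lld types.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for lld's symbol table; not lld's real types.
enum class FileKind { Bitcode, Native };
using SymbolTable = std::map<std::string, FileKind>;

// Keep only the libfunc names whose symbol is provided by a bitcode file.
// Names with no symbol-table entry are skipped, mirroring the hunk's
// `if (!sym) continue;`.
std::vector<std::string>
collectBitcodeLibFuncs(const std::vector<std::string> &libFuncs,
                       const SymbolTable &symtab) {
  std::vector<std::string> bitcodeLibFuncs;
  for (const std::string &name : libFuncs) {
    auto it = symtab.find(name);
    if (it == symtab.end())
      continue;
    if (it->second == FileKind::Bitcode)
      bitcodeLibFuncs.push_back(name);
  }
  assert(bitcodeLibFuncs.size() <= libFuncs.size());
  return bitcodeLibFuncs;
}
```

With a table that maps `memcmp` to a native object and `bcmp` to bitcode, the candidates `memcmp`, `bcmp`, and `strlen` reduce to just `bcmp`, which matches the archive layout in libcall-archive-bitcode.test.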

@llvmbot
Member

llvmbot commented Dec 9, 2025

@llvm/pr-subscribers-llvm-binary-utilities

Author: Daniel Thornburgh (mysterymath)

+                  const std::vector<uint8_t> &CmdArgs,
+                  const SmallVector<StringRef> &BitcodeLibFuncs);
 
 /// Runs a regular LTO backend. The regular LTO backend can also act as the
 /// regular LTO phase of ThinLTO, which may need to access the combined index.
 LLVM_ABI Error backend(const Config &C, AddStreamFn AddStream,
                        unsigned ParallelCodeGenParallelismLevel, Module &M,
-                       ModuleSummaryIndex &CombinedIndex);
+                       ModuleSummaryIndex &CombinedIndex,
+                       const SmallVector<StringRef> &BitcodeLibFuncs);
 
 /// Runs a ThinLTO backend.
 /// If \p ModuleMap is not nullptr, all the module files to be imported have
@@ -62,6 +64,7 @@ thinBackend(const Config &C, unsigned Task, AddStreamFn AddStream, Module &M,
             const FunctionImporter::ImportMapTy &ImportList,
             const GVSummaryMapTy &DefinedGlobals,
             MapVector<StringRef, BitcodeModule> *ModuleMap, bool CodeGenOnly,
+            const SmallVector<StringRef> &BitcodeLibFuncs,
             AddStreamFn IRAddStream = nullptr,
             const std::vector<uint8_t> &CmdArgs = std::vector<uint8_t>());
 
diff --git a/llvm/lib/LTO/LTO.cpp b/llvm/lib/LTO/LTO.cpp
index a02af59600c44..97d3952b05d06 100644
--- a/llvm/lib/LTO/LTO.cpp
+++ b/llvm/lib/LTO/LTO.cpp
@@ -763,6 +763,10 @@ Error LTO::add(std::unique_ptr<InputFile> Input,
   return Error::success();
 }
 
+void LTO::setBitcodeLibFuncs(const SmallVector<StringRef> &BitcodeLibFuncs) {
+  this->BitcodeLibFuncs = BitcodeLibFuncs;
+}
+
 Expected<ArrayRef<SymbolResolution>>
 LTO::addModule(InputFile &Input, ArrayRef<SymbolResolution> InputRes,
                unsigned ModI, ArrayRef<SymbolResolution> Res) {
@@ -1385,9 +1389,9 @@ Error LTO::runRegularLTO(AddStreamFn AddStream) {
   }
 
   if (!RegularLTO.EmptyCombinedModule || Conf.AlwaysEmitRegularLTOObj) {
-    if (Error Err =
-            backend(Conf, AddStream, RegularLTO.ParallelCodeGenParallelismLevel,
-                    *RegularLTO.CombinedModule, ThinLTO.CombinedIndex))
+    if (Error Err = backend(
+            Conf, AddStream, RegularLTO.ParallelCodeGenParallelismLevel,
+            *RegularLTO.CombinedModule, ThinLTO.CombinedIndex, BitcodeLibFuncs))
       return Err;
   }
 
@@ -1407,6 +1411,21 @@ SmallVector<const char *> LTO::getRuntimeLibcallSymbols(const Triple &TT) {
   return LibcallSymbols;
 }
 
+SmallVector<StringRef> LTO::getLibFuncSymbols(const Triple &TT,
+                                              StringSaver &Saver) {
+  auto TLII = std::make_unique<TargetLibraryInfoImpl>(TT);
+  TargetLibraryInfo TLI(*TLII);
+  SmallVector<StringRef> LibFuncSymbols;
+  LibFuncSymbols.reserve(LibFunc::NumLibFuncs);
+  for (unsigned I = 0, E = static_cast<unsigned>(LibFunc::NumLibFuncs); I != E;
+       ++I) {
+    LibFunc F = static_cast<LibFunc>(I);
+    if (TLI.has(F))
+      LibFuncSymbols.push_back(Saver.save(TLI.getName(F)).data());
+  }
+  return LibFuncSymbols;
+}
+
 Error ThinBackendProc::emitFiles(
     const FunctionImporter::ImportMapTy &ImportList, llvm::StringRef ModulePath,
     const std::string &NewModulePath) const {
@@ -1484,6 +1503,7 @@ class CGThinBackend : public ThinBackendProc {
 class InProcessThinBackend : public CGThinBackend {
 protected:
   FileCache Cache;
+  const SmallVector<StringRef> &BitcodeLibFuncs;
 
 public:
   InProcessThinBackend(
@@ -1491,11 +1511,12 @@ class InProcessThinBackend : public CGThinBackend {
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
       AddStreamFn AddStream, FileCache Cache, lto::IndexWriteCallback OnWrite,
-      bool ShouldEmitIndexFiles, bool ShouldEmitImportsFiles)
+      bool ShouldEmitIndexFiles, bool ShouldEmitImportsFiles,
+      const SmallVector<StringRef> &BitcodeLibFuncs)
       : CGThinBackend(Conf, CombinedIndex, ModuleToDefinedGVSummaries,
                       AddStream, OnWrite, ShouldEmitIndexFiles,
                       ShouldEmitImportsFiles, ThinLTOParallelism),
-        Cache(std::move(Cache)) {}
+        Cache(std::move(Cache)), BitcodeLibFuncs(BitcodeLibFuncs) {}
 
   virtual Error runThinLTOBackendThread(
       AddStreamFn AddStream, FileCache Cache, unsigned Task, BitcodeModule BM,
@@ -1516,7 +1537,7 @@ class InProcessThinBackend : public CGThinBackend {
 
       return thinBackend(Conf, Task, AddStream, **MOrErr, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         Conf.CodeGenOnly);
+                         Conf.CodeGenOnly, BitcodeLibFuncs);
     };
     if (ShouldEmitIndexFiles) {
       if (auto E = emitFiles(ImportList, ModuleID, ModuleID.str()))
@@ -1601,13 +1622,14 @@ class FirstRoundThinBackend : public InProcessThinBackend {
       const Config &Conf, ModuleSummaryIndex &CombinedIndex,
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-      AddStreamFn CGAddStream, FileCache CGCache, AddStreamFn IRAddStream,
+      AddStreamFn CGAddStream, FileCache CGCache,
+      const SmallVector<StringRef> &BitcodeLibFuncs, AddStreamFn IRAddStream,
       FileCache IRCache)
       : InProcessThinBackend(Conf, CombinedIndex, ThinLTOParallelism,
                              ModuleToDefinedGVSummaries, std::move(CGAddStream),
                              std::move(CGCache), /*OnWrite=*/nullptr,
                              /*ShouldEmitIndexFiles=*/false,
-                             /*ShouldEmitImportsFiles=*/false),
+                             /*ShouldEmitImportsFiles=*/false, BitcodeLibFuncs),
         IRAddStream(std::move(IRAddStream)), IRCache(std::move(IRCache)) {}
 
   Error runThinLTOBackendThread(
@@ -1630,7 +1652,7 @@ class FirstRoundThinBackend : public InProcessThinBackend {
 
       return thinBackend(Conf, Task, CGAddStream, **MOrErr, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         Conf.CodeGenOnly, IRAddStream);
+                         Conf.CodeGenOnly, BitcodeLibFuncs, IRAddStream);
     };
     // Like InProcessThinBackend, we produce index files as needed for
     // FirstRoundThinBackend. However, these files are not generated for
@@ -1697,6 +1719,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
       ThreadPoolStrategy ThinLTOParallelism,
       const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
       AddStreamFn AddStream, FileCache Cache,
+      const SmallVector<StringRef> &BitcodeLibFuncs,
       std::unique_ptr<SmallVector<StringRef>> IRFiles,
       stable_hash CombinedCGDataHash)
       : InProcessThinBackend(Conf, CombinedIndex, ThinLTOParallelism,
@@ -1704,7 +1727,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
                              std::move(Cache),
                              /*OnWrite=*/nullptr,
                              /*ShouldEmitIndexFiles=*/false,
-                             /*ShouldEmitImportsFiles=*/false),
+                             /*ShouldEmitImportsFiles=*/false, BitcodeLibFuncs),
         IRFiles(std::move(IRFiles)), CombinedCGDataHash(CombinedCGDataHash) {}
 
   Error runThinLTOBackendThread(
@@ -1725,7 +1748,7 @@ class SecondRoundThinBackend : public InProcessThinBackend {
 
       return thinBackend(Conf, Task, AddStream, *LoadedModule, CombinedIndex,
                          ImportList, DefinedGlobals, &ModuleMap,
-                         /*CodeGenOnly=*/true);
+                         /*CodeGenOnly=*/true, BitcodeLibFuncs);
     };
     if (!Cache.isValid() || !CombinedIndex.modulePaths().count(ModuleID) ||
         all_of(CombinedIndex.getModuleHash(ModuleID),
@@ -1764,11 +1787,12 @@ ThinBackend lto::createInProcessThinBackend(ThreadPoolStrategy Parallelism,
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<InProcessThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             AddStream, Cache, OnWrite, ShouldEmitIndexFiles,
-            ShouldEmitImportsFiles);
+            ShouldEmitImportsFiles, BitcodeLibFuncs);
       };
   return ThinBackend(Func, Parallelism);
 }
@@ -1885,7 +1909,8 @@ ThinBackend lto::createWriteIndexesThinBackend(
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<WriteIndexesThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             OldPrefix, NewPrefix, NativeObjectPrefix, ShouldEmitImportsFiles,
@@ -2103,7 +2128,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   if (!CodeGenDataThinLTOTwoRounds) {
     std::unique_ptr<ThinBackendProc> BackendProc =
         ThinLTO.Backend(Conf, ThinLTO.CombinedIndex, ModuleToDefinedGVSummaries,
-                        AddStream, Cache);
+                        AddStream, Cache, BitcodeLibFuncs);
     return RunBackends(BackendProc.get());
   }
 
@@ -2126,7 +2151,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   LLVM_DEBUG(dbgs() << "[TwoRounds] Running the first round of codegen\n");
   auto FirstRoundLTO = std::make_unique<FirstRoundThinBackend>(
       Conf, ThinLTO.CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
-      CG.AddStream, CG.Cache, IR.AddStream, IR.Cache);
+      CG.AddStream, CG.Cache, BitcodeLibFuncs, IR.AddStream, IR.Cache);
   if (Error E = RunBackends(FirstRoundLTO.get()))
     return E;
 
@@ -2142,7 +2167,7 @@ Error LTO::runThinLTO(AddStreamFn AddStream, FileCache Cache,
   LLVM_DEBUG(dbgs() << "[TwoRounds] Running the second round of codegen\n");
   auto SecondRoundLTO = std::make_unique<SecondRoundThinBackend>(
       Conf, ThinLTO.CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
-      AddStream, Cache, IR.getResult(), CombinedHash);
+      AddStream, Cache, BitcodeLibFuncs, IR.getResult(), CombinedHash);
   return RunBackends(SecondRoundLTO.get());
 }
 
@@ -2620,7 +2645,8 @@ ThinBackend lto::createOutOfProcessThinBackend(
   auto Func =
       [=](const Config &Conf, ModuleSummaryIndex &CombinedIndex,
           const DenseMap<StringRef, GVSummaryMapTy> &ModuleToDefinedGVSummaries,
-          AddStreamFn AddStream, FileCache Cache) {
+          AddStreamFn AddStream, FileCache Cache,
+          const SmallVector<StringRef> &BitcodeLibFuncs) {
         return std::make_unique<OutOfProcessThinBackend>(
             Conf, CombinedIndex, Parallelism, ModuleToDefinedGVSummaries,
             AddStream, Cache, OnWrite, ShouldEmitIndexFiles,
diff --git a/llvm/lib/LTO/LTOBackend.cpp b/llvm/lib/LTO/LTOBackend.cpp
index 93118becedbac..bd91249db50ed 100644
--- a/llvm/lib/LTO/LTOBackend.cpp
+++ b/llvm/lib/LTO/LTOBackend.cpp
@@ -239,7 +239,8 @@ createTargetMachine(const Config &Conf, const Target *TheTarget, Module &M) {
 static void runNewPMPasses(const Config &Conf, Module &Mod, TargetMachine *TM,
                            unsigned OptLevel, bool IsThinLTO,
                            ModuleSummaryIndex *ExportSummary,
-                           const ModuleSummaryIndex *ImportSummary) {
+                           const ModuleSummaryIndex *ImportSummary,
+    ...
[truncated]

@Andarwinux
Contributor

In order to fix LTO+llvm-libc, is it only the ELF backend of LLD that needs special handling? Or only ELF is considered as a prototype for now? I experimented with LTO+overlay llvm-libc+-fbuiltin on Windows/COFF and everything seems to work fine, even without these changes.

@mysterymath
Contributor Author

mysterymath commented Dec 9, 2025

> In order to fix LTO+llvm-libc, is it only the ELF backend of LLD that needs special handling? Or only ELF is considered as a prototype for now? I experimented with LTO+overlay llvm-libc+-fbuiltin on Windows/COFF and everything seems to work fine, even without these changes.

I'll admit; I haven't spent much time looking at the other file formats. I suspect that all of them would have this issue, but I'd need to spend some time on them, and I didn't want that to block getting this out for review. This change should still partially fix the semantics, since effect (2) in the description doesn't need any linker support.

It's worth noting that LTO-ing in libc often works without encountering this issue. Most libcall->libcall transforms occur in compilation before LTO; these are fine. The trouble only occurs when libcall->libcall transforms occur during LTO, and even then only for a subset of these transforms.
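
For readers unfamiliar with libcall->libcall transforms, here is a minimal sketch of the kind exercised by the `memcmp`/`bcmp` test in this PR. It assumes the only use of the `memcmp` result is a comparison against zero. `equalityOnlyCompare` is a hypothetical stand-in for `bcmp` written for this illustration; it is not the libc symbol and not code from the patch.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for bcmp: reports only whether the buffers differ,
// never their ordering. Illustration only, not the real libc symbol.
int equalityOnlyCompare(const void *a, const void *b, std::size_t n) {
  const unsigned char *pa = static_cast<const unsigned char *>(a);
  const unsigned char *pb = static_cast<const unsigned char *>(b);
  for (std::size_t i = 0; i < n; ++i)
    if (pa[i] != pb[i])
      return 1; // any nonzero value means "different"
  return 0;
}

// Original form: only the "== 0" result of memcmp is observed.
bool equalViaMemcmp(const void *a, const void *b, std::size_t n) {
  return std::memcmp(a, b, n) == 0;
}

// Rewritten form: what a memcmp -> bcmp style transform would produce.
bool equalViaBcmpLike(const void *a, const void *b, std::size_t n) {
  return equalityOnlyCompare(a, b, n) == 0;
}
```

Both forms return the same answer, so the rewrite itself is sound. The hazard described in this PR is that the new callee may be defined only in bitcode that was never extracted before LTO ran.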

Collaborator

@efriedma-quic efriedma-quic left a comment

This still leaves open what I guess I'll refer to as the "malloc problem", although it isn't entirely specific to malloc. LLVM makes assumptions about libc functions which aren't valid if you can see the implementation. Like, BasicAAResult::getModRefInfo assumes that malloc doesn't modify memory... but that's only correct if the optimizer can't see the global variables used to represent the heap.

That said, I think this is a step in the right direction.
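
To make the "malloc problem" concrete, here is a rough sketch using a hypothetical bump allocator (`bumpAlloc`, `heapUsed`, and `bytesConsumedBy` are invented for this illustration and are not llvm-libc's implementation). The allocator writes to an ordinary global, so a caller that reads that global around the call must see two different values.

```cpp
#include <cstddef>

// Hypothetical bump allocator, for illustration only.
static unsigned char heapStorage[1024];
std::size_t heapUsed = 0; // global heap state, visible to the optimizer

void *bumpAlloc(std::size_t size) {
  if (size > sizeof(heapStorage) - heapUsed)
    return nullptr; // out of space
  void *p = heapStorage + heapUsed;
  heapUsed += size; // the allocator modifies memory the caller can observe
  return p;         // alignment is ignored for brevity
}

// Reads the heap state before and after an allocation. If an optimizer
// assumed the allocation call never writes visible memory, it could reuse
// the first load for the second and wrongly conclude the result is 0.
std::size_t bytesConsumedBy(std::size_t size) {
  std::size_t before = heapUsed;
  bumpAlloc(size);
  std::size_t after = heapUsed;
  return after - before;
}
```

With an external libc the heap globals are invisible and the assumption is harmless; once the implementation is linked in as bitcode, the same assumption could forward the stale load.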


// FIXME: Functions that are somewhere in a ThinLTO link (just not imported
// in this module) should not be disabled, as they have already been
// extracted.
Collaborator

I guess you need some additional datastructure for this, to track the complete list of functions?

Contributor Author

Yeah, something like that. I punted on this because I mostly care about the FullLTO case in practice for baremetal libc LTO usage, but I do want to get back to this. I don't think it'll be horrible, but I wasn't sure about the right distribution of responsibility between the linker and LTO library. This seemed like an easy cop-out in the meantime.

Contributor

maybe this is something we should think about filling in as part of the symbol resolution step? I wonder if we could make it work by just setting some bits in the symtab?

@mysterymath
Contributor Author

> This still leaves open what I guess I'll refer to as the "malloc problem", although it isn't entirely specific to malloc. LLVM makes assumptions about libc functions which aren't valid if you can see the implementation. Like, BasicAAResult::getModRefInfo assumes that malloc doesn't modify memory... but that's only correct if the optimizer can't see the global variables used to represent the heap.

Yeah, I think we'd need to go through piecemeal looking for optimizations that assume a fully-external libc in this way and relax the assumption. The bitcode libfuncs list here may assist with this. I think that's been the shift in our model somewhat: treating LTO-ing libc as something that should work, but is currently buggy, then looking at what specifically the bugs are. As the commenter above noted, it does already kinda work. This patch just eliminates a particular class of "bugs".

Collaborator

@smithp35 smithp35 left a comment

The LLD changes look good to me; I have a few small suggestions. I know less about the LTO side, so I'll leave that for other reviewers.

}

- ltoObjectFiles = lto->compile();
+ ltoObjectFiles = lto->compile(bitcodeLibFuncs);
Collaborator

From the callsite alone this could be interpreted as just compiling the bitcodeLibFuncs.

Suggestions:

  • Does the ltoObj->setBitcodeLibFuncs(bitcodeLibFuncs); need to be done in compile? Perhaps add a lto->setBitcodeLibFuncs(bitcodeLibFuncs) as it looks like ltoObj is private.
  • Rename to compileObjectsAndLibFuncs(bitcodeLibFuncs).

Contributor Author

The former seems pretty okay; done.

template <class ELFT>
void LinkerDriver::compileBitcodeFiles(bool skipLinkedOutput) {
llvm::TimeTraceScope timeScope("LTO");
// Capture the triple before moving the bitcode into the bitcode compiler.
Collaborator

We are making an assumption that the triple is homogeneous (modulo libcall differences) across all the bitcode files. Ideally this would be the case, but build systems being build systems, there could be some differences; I'm thinking about normalised triples where -march might differ.

Looking at getLibFuncSymbols it seems like most if not all the variation in libcall availability is in the environment and I wouldn't expect that to vary across the build system.

I don't think we need to change anything here, but could be worth stating our assumptions in a comment.

Contributor Author

Good point; we may need to relax this later, but we can do that with a union if need be. Added a comment.

if (!ctx.bitcodeFiles.empty()) {
markBuffersAsDontNeed(ctx, skipLinkedOutput);
for (StringRef libFunc : lto::LTO::getLibFuncSymbols(*tt, saver)) {
Symbol *sym = ctx.symtab->find(libFunc);
Collaborator

Could the test for sym == nullptr be combined with the find?

if (Symbol *sym = ctx.symtab->find(libFunc))
  if (isa<BitcodeFile>(sym->file))
        bitcodeLibFuncs.push_back(libFunc);

Contributor Author

Nice! I was able to smash it down a bit more even; done.
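
The filtering step discussed in this thread can be sketched outside of LLD with standard containers. Everything below is a simplified, hypothetical model: `FileKind`, `Symbol`, `SymbolTable`, and `collectBitcodeLibFuncs` are made-up names, not LLD's types or the patch's final code. It only shows the shape of the logic: look each libfunc name up once, and keep it if its definition comes from a bitcode file.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Simplified, hypothetical stand-ins for the linker's data structures.
enum class FileKind { Bitcode, Native };
struct Symbol {
  FileKind definedIn;
};
using SymbolTable = std::unordered_map<std::string, Symbol>;

// Returns the subset of libFuncs whose symbol is defined in a bitcode file.
std::vector<std::string>
collectBitcodeLibFuncs(const SymbolTable &symtab,
                       const std::vector<std::string> &libFuncs) {
  std::vector<std::string> result;
  for (const std::string &name : libFuncs) {
    // Combine the lookup and the "was it found" test, as suggested above.
    auto it = symtab.find(name);
    if (it != symtab.end() && it->second.definedIn == FileKind::Bitcode)
      result.push_back(name);
  }
  return result;
}
```

Names that are absent from the table or defined in native objects are skipped, so only bitcode-defined libfuncs are reported.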

Comment on lines +2707 to +2709
std::optional<llvm::Triple> tt;
if (!ctx.bitcodeFiles.empty())
tt = llvm::Triple(ctx.bitcodeFiles.front()->obj->getTargetTriple());
Contributor

can't all of this just move into the block below? I think you also can drop the optional, right, since if you only do this in that block there has to be a triple of some kind...

@@ -365,7 +387,8 @@ static bool isEmptyModule(const Module &Mod) {
bool lto::opt(const Config &Conf, TargetMachine *TM, unsigned Task, Module &Mod,
bool IsThinLTO, ModuleSummaryIndex *ExportSummary,
const ModuleSummaryIndex *ImportSummary,
- const std::vector<uint8_t> &CmdArgs) {
+ const std::vector<uint8_t> &CmdArgs,
+ const SmallVector<StringRef> &BitcodeLibFuncs) {
Contributor

If this is never modified, it can be an ArrayRef<StringRef>, right?
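
The idea behind this suggestion, sketched without LLVM headers: a read-only parameter can be a non-owning view, so callers are not tied to one container type. `NameList` below is a hypothetical, hand-rolled view that plays the role `ArrayRef<StringRef>` would; it is not LLVM's class, and `containsName` is an invented helper.

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical minimal non-owning view over contiguous string_views.
struct NameList {
  const std::string_view *data;
  std::size_t size;
  const std::string_view *begin() const { return data; }
  const std::string_view *end() const { return data + size; }
};

// Takes the view by value: cheap to copy, and it cannot modify the names.
bool containsName(NameList names, std::string_view wanted) {
  for (std::string_view name : names)
    if (name == wanted)
      return true;
  return false;
}
```

A caller holding a `std::vector<std::string_view> v` would pass `NameList{v.data(), v.size()}`, and a caller holding a plain array would pass the array and its length, with no copy in either case.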

Labels

clang:codegen IR generation bugs: mangling, exceptions, etc. lld:ELF lld llvm:binary-utilities LTO Link time optimization (regular/full LTO or ThinLTO)

7 participants