Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[lld][WebAssembly] Match the ELF linker in transitioning away from archive indexes. #78658

Merged
merged 1 commit into from
Jan 20, 2024

Conversation

sbc100
Copy link
Collaborator

@sbc100 sbc100 commented Jan 19, 2024

The ELF linker transitioned away from archive indexes in https://reviews.llvm.org/D117284.

This paves the way for supporting --start-lib/--end-lib (See #77960)

The ELF linker unified library handling with --start-lib/--end-lib and removed
the ArchiveFile class in https://reviews.llvm.org/D119074.

@sbc100 sbc100 changed the title [lld][WebAssembly] Match the ELF linker in transitioning away from ar… [lld][WebAssembly] Match the ELF linker in transitioning away from archive indexes. Jan 19, 2024
@sbc100 sbc100 requested a review from dschuff January 19, 2024 02:01
@llvmbot
Copy link
Member

llvmbot commented Jan 19, 2024

@llvm/pr-subscribers-lld

@llvm/pr-subscribers-lld-wasm

Author: Sam Clegg (sbc100)

Changes

The ELF linker transitioned away from archive indexes in https://reviews.llvm.org/D117284.

This paves the way for supporting --start-lib/--end-lib (See #77960)


Patch is 25.24 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/78658.diff

9 Files Affected:

  • (removed) lld/test/wasm/archive-no-index.s (-14)
  • (modified) lld/test/wasm/bad-archive-member.s (+1-1)
  • (modified) lld/wasm/Driver.cpp (+25-24)
  • (modified) lld/wasm/InputFiles.cpp (+42-52)
  • (modified) lld/wasm/InputFiles.h (+16-24)
  • (modified) lld/wasm/SymbolTable.cpp (+19-17)
  • (modified) lld/wasm/SymbolTable.h (+1-1)
  • (modified) lld/wasm/Symbols.cpp (+10-16)
  • (modified) lld/wasm/Symbols.h (+10-15)
diff --git a/lld/test/wasm/archive-no-index.s b/lld/test/wasm/archive-no-index.s
deleted file mode 100644
index 99ca5a367d3c6d4..000000000000000
--- a/lld/test/wasm/archive-no-index.s
+++ /dev/null
@@ -1,14 +0,0 @@
-# Tests error on archive file without a symbol table
-# RUN: llvm-mc -filetype=obj -triple=wasm32-unknown-unknown -o %t.o %s
-# RUN: llvm-as -o %t.archive.o %S/Inputs/archive1.ll
-# RUN: rm -f %t.a
-# RUN: llvm-ar crS %t.a %t.archive.o
-
-# RUN: not wasm-ld -o out.wasm %t.o %t.a 2>&1 | FileCheck %s
-
-  .globl  _start
-_start:
-  .functype _start () -> ()
-  end_function
-
-# CHECK: archive has no index; run ranlib to add one
diff --git a/lld/test/wasm/bad-archive-member.s b/lld/test/wasm/bad-archive-member.s
index 029027a8517a364..77bf16871ca5b58 100644
--- a/lld/test/wasm/bad-archive-member.s
+++ b/lld/test/wasm/bad-archive-member.s
@@ -5,7 +5,7 @@
 # RUN: llvm-mc -filetype=obj -triple=x86_64-pc-linux -o %t.dir/elf.o %s
 # RUN: llvm-ar rcs %t.dir/libfoo.a %t.dir/elf.o
 # RUN: not wasm-ld %t.dir/libfoo.a -o /dev/null 2>&1 | FileCheck %s
-# CHECK: error: unknown file type: {{.*}}libfoo.a(elf.o)
+# CHECK: warning: {{.*}}libfoo.a: archive member 'elf.o' is neither Wasm object file nor LLVM bitcode
 
 .globl _start
 _start:
diff --git a/lld/wasm/Driver.cpp b/lld/wasm/Driver.cpp
index 4a4f9a96227946d..9e038b6db652818 100644
--- a/lld/wasm/Driver.cpp
+++ b/lld/wasm/Driver.cpp
@@ -218,19 +218,20 @@ static void readImportFile(StringRef filename) {
 
 // Returns slices of MB by parsing MB as an archive file.
 // Each slice consists of a member file in the archive.
-std::vector<MemoryBufferRef> static getArchiveMembers(MemoryBufferRef mb) {
+std::vector<std::pair<MemoryBufferRef, uint64_t>> static getArchiveMembers(
+    MemoryBufferRef mb) {
   std::unique_ptr<Archive> file =
       CHECK(Archive::create(mb),
             mb.getBufferIdentifier() + ": failed to parse archive");
 
-  std::vector<MemoryBufferRef> v;
+  std::vector<std::pair<MemoryBufferRef, uint64_t>> v;
   Error err = Error::success();
   for (const Archive::Child &c : file->children(err)) {
     MemoryBufferRef mbref =
         CHECK(c.getMemoryBufferRef(),
               mb.getBufferIdentifier() +
                   ": could not get the buffer for a child of the archive");
-    v.push_back(mbref);
+    v.push_back(std::make_pair(mbref, c.getChildOffset()));
   }
   if (err)
     fatal(mb.getBufferIdentifier() +
@@ -256,10 +257,12 @@ void LinkerDriver::addFile(StringRef path) {
     if (fs::exists(importFile))
       readImportFile(importFile.str());
 
+    auto members = getArchiveMembers(mbref);
+
     // Handle -whole-archive.
     if (inWholeArchive) {
-      for (MemoryBufferRef &m : getArchiveMembers(mbref)) {
-        auto *object = createObjectFile(m, path);
+      for (const std::pair<MemoryBufferRef, uint64_t> &p : members) {
+        auto *object = createObjectFile(p.first, path);
         // Mark object as live; object members are normally not
         // live by default but -whole-archive is designed to treat
         // them as such.
@@ -273,17 +276,22 @@ void LinkerDriver::addFile(StringRef path) {
     std::unique_ptr<Archive> file =
         CHECK(Archive::create(mbref), path + ": failed to parse archive");
 
-    if (!file->isEmpty() && !file->hasSymbolTable()) {
-      error(mbref.getBufferIdentifier() +
-            ": archive has no index; run ranlib to add one");
+    for (const std::pair<MemoryBufferRef, uint64_t> &p : members) {
+      auto magic = identify_magic(p.first.getBuffer());
+      if (magic == file_magic::wasm_object) {
+        files.push_back(createObjectFile(p.first, path, 0, true));
+      } else if (magic == file_magic::bitcode)
+        files.push_back(make<BitcodeFile>(p.first, path, p.second, true));
+      else
+        warn(path + ": archive member '" + p.first.getBufferIdentifier() +
+             "' is neither Wasm object file nor LLVM bitcode");
     }
 
-    files.push_back(make<ArchiveFile>(mbref));
     return;
   }
   case file_magic::bitcode:
   case file_magic::wasm_object:
-    files.push_back(createObjectFile(mbref));
+    files.push_back(createObjectFile(mbref, "", 0));
     break;
   case file_magic::unknown:
     if (mbref.getBuffer().starts_with("#STUB")) {
@@ -705,7 +713,7 @@ static Symbol *handleUndefined(StringRef name, const char *option) {
   // eliminate it. Mark the symbol as "used" to prevent it.
   sym->isUsedInRegularObj = true;
 
-  if (auto *lazySym = dyn_cast<LazySymbol>(sym)) {
+  if (auto *lazySym = dyn_cast<LazyObject>(sym)) {
     lazySym->extract();
     if (!config->whyExtract.empty())
       ctx.whyExtractRecords.emplace_back(option, sym->getFile(), *sym);
@@ -716,17 +724,10 @@ static Symbol *handleUndefined(StringRef name, const char *option) {
 
 static void handleLibcall(StringRef name) {
   Symbol *sym = symtab->find(name);
-  if (!sym)
-    return;
-
-  if (auto *lazySym = dyn_cast<LazySymbol>(sym)) {
-    MemoryBufferRef mb = lazySym->getMemberBuffer();
-    if (isBitcode(mb)) {
-      if (!config->whyExtract.empty())
-        ctx.whyExtractRecords.emplace_back("<libcall>", sym->getFile(), *sym);
-      lazySym->extract();
-    }
-  }
+  if (sym)
+    if (auto lazySym = dyn_cast<LazyObject>(sym))
+      if (isa<BitcodeFile>(lazySym->getFile()))
+        lazySym->extract();
 }
 
 static void writeWhyExtract() {
@@ -751,7 +752,7 @@ static void writeWhyExtract() {
 // Equivalent of demote demoteSharedAndLazySymbols() in the ELF linker
 static void demoteLazySymbols() {
   for (Symbol *sym : symtab->symbols()) {
-    if (auto* s = dyn_cast<LazySymbol>(sym)) {
+    if (auto* s = dyn_cast<LazyObject>(sym)) {
       if (s->signature) {
         LLVM_DEBUG(llvm::dbgs()
                    << "demoting lazy func: " << s->getName() << "\n");
@@ -963,7 +964,7 @@ static void processStubLibraries() {
               LLVM_DEBUG(llvm::dbgs()
                          << "force export: " << toString(*needed) << "\n");
             needed->forceExport = true;
-            if (auto *lazy = dyn_cast<LazySymbol>(needed)) {
+            if (auto *lazy = dyn_cast<LazyObject>(needed)) {
               depsAdded = true;
               lazy->extract();
               if (!config->whyExtract.empty())
diff --git a/lld/wasm/InputFiles.cpp b/lld/wasm/InputFiles.cpp
index 19c76e490278962..92a1694ae050676 100644
--- a/lld/wasm/InputFiles.cpp
+++ b/lld/wasm/InputFiles.cpp
@@ -75,7 +75,8 @@ std::optional<MemoryBufferRef> readFile(StringRef path) {
 }
 
 InputFile *createObjectFile(MemoryBufferRef mb, StringRef archiveName,
-                            uint64_t offsetInArchive) {
+                            uint64_t offsetInArchive, bool lazy) {
+  log("createObjectFile " + archiveName + " lazy=" + Twine(lazy));
   file_magic magic = identify_magic(mb.getBuffer());
   if (magic == file_magic::wasm_object) {
     std::unique_ptr<Binary> bin =
@@ -83,18 +84,11 @@ InputFile *createObjectFile(MemoryBufferRef mb, StringRef archiveName,
     auto *obj = cast<WasmObjectFile>(bin.get());
     if (obj->isSharedObject())
       return make<SharedFile>(mb);
-    return make<ObjFile>(mb, archiveName);
+    return make<ObjFile>(mb, archiveName, lazy);
   }
 
-  if (magic == file_magic::bitcode)
-    return make<BitcodeFile>(mb, archiveName, offsetInArchive);
-
-  std::string name = mb.getBufferIdentifier().str();
-  if (!archiveName.empty()) {
-    name = archiveName.str() + "(" + name + ")";
-  }
-
-  fatal("unknown file type: " + name);
+  assert(magic == file_magic::bitcode);
+  return make<BitcodeFile>(mb, archiveName, offsetInArchive, lazy);
 }
 
 // Relocations contain either symbol or type indices.  This function takes a
@@ -391,9 +385,22 @@ static bool shouldMerge(const WasmSegment &seg) {
   return true;
 }
 
-void ObjFile::parse(bool ignoreComdats) {
-  // Parse a memory buffer as a wasm file.
-  LLVM_DEBUG(dbgs() << "Parsing object: " << toString(this) << "\n");
+void ObjFile::parseLazy() {
+  LLVM_DEBUG(dbgs() << "ObjFile::parseLazy: " << toString(this) << "\n");
+  for (const SymbolRef &sym : wasmObj->symbols()) {
+    const WasmSymbol &wasmSym = wasmObj->getWasmSymbol(sym.getRawDataRefImpl());
+    if (!wasmSym.isDefined())
+      continue;
+    symtab->addLazy(wasmSym.Info.Name, this);
+    // addLazy() may trigger this->extract() if an existing symbol is an undefined
+    // symbol. If that happens, this function has served its purpose, and we can
+    // exit from the loop early.
+    if (!lazy)
+      break;
+  }
+}
+
+void ObjFile::init() {
   std::unique_ptr<Binary> bin = CHECK(createBinary(mb), toString(this));
 
   auto *obj = dyn_cast<WasmObjectFile>(bin.get());
@@ -406,6 +413,11 @@ void ObjFile::parse(bool ignoreComdats) {
   wasmObj.reset(obj);
 
   checkArch(obj->getArch());
+}
+
+void ObjFile::parse(bool ignoreComdats) {
+  // Parse a memory buffer as a wasm file.
+  LLVM_DEBUG(dbgs() << "ObjFile::parse: " << toString(this) << "\n");
 
   // Build up a map of function indices to table indices for use when
   // verifying the existing table index relocations
@@ -717,43 +729,6 @@ void StubFile::parse() {
   }
 }
 
-void ArchiveFile::parse() {
-  // Parse a MemoryBufferRef as an archive file.
-  LLVM_DEBUG(dbgs() << "Parsing library: " << toString(this) << "\n");
-  file = CHECK(Archive::create(mb), toString(this));
-
-  // Read the symbol table to construct Lazy symbols.
-  int count = 0;
-  for (const Archive::Symbol &sym : file->symbols()) {
-    symtab->addLazy(this, &sym);
-    ++count;
-  }
-  LLVM_DEBUG(dbgs() << "Read " << count << " symbols\n");
-  (void) count;
-}
-
-void ArchiveFile::addMember(const Archive::Symbol *sym) {
-  const Archive::Child &c =
-      CHECK(sym->getMember(),
-            "could not get the member for symbol " + sym->getName());
-
-  // Don't try to load the same member twice (this can happen when members
-  // mutually reference each other).
-  if (!seen.insert(c.getChildOffset()).second)
-    return;
-
-  LLVM_DEBUG(dbgs() << "loading lazy: " << sym->getName() << "\n");
-  LLVM_DEBUG(dbgs() << "from archive: " << toString(this) << "\n");
-
-  MemoryBufferRef mb =
-      CHECK(c.getMemoryBufferRef(),
-            "could not get the buffer for the member defining symbol " +
-                sym->getName());
-
-  InputFile *obj = createObjectFile(mb, getName(), c.getChildOffset());
-  symtab->addFile(obj, sym->getName());
-}
-
 static uint8_t mapVisibility(GlobalValue::VisibilityTypes gvVisibility) {
   switch (gvVisibility) {
   case GlobalValue::DefaultVisibility:
@@ -790,8 +765,9 @@ static Symbol *createBitcodeSymbol(const std::vector<bool> &keptComdats,
 }
 
 BitcodeFile::BitcodeFile(MemoryBufferRef m, StringRef archiveName,
-                         uint64_t offsetInArchive)
+                         uint64_t offsetInArchive, bool lazy)
     : InputFile(BitcodeKind, m) {
+  this->lazy = lazy;
   this->archiveName = std::string(archiveName);
 
   std::string path = mb.getBufferIdentifier().str();
@@ -817,6 +793,20 @@ BitcodeFile::BitcodeFile(MemoryBufferRef m, StringRef archiveName,
 
 bool BitcodeFile::doneLTO = false;
 
+void BitcodeFile::parseLazy() {
+  for (auto [i, irSym] : llvm::enumerate(obj->symbols())) {
+    if (irSym.isUndefined())
+      continue;
+    StringRef name = saver().save(irSym.getName());
+    symtab->addLazy(name, this);
+    // addLazy() may trigger this->extract() if an existing symbol is an
+    // undefined symbol. If that happens, this function has served its purpose,
+    // and we can exit from the loop early.
+    if (!lazy)
+      break;
+  }
+}
+
 void BitcodeFile::parse(StringRef symName) {
   if (doneLTO) {
     error(toString(this) + ": attempt to add bitcode file after LTO (" + symName + ")");
diff --git a/lld/wasm/InputFiles.h b/lld/wasm/InputFiles.h
index d9a8b5306603249..4963a7bd95bc24e 100644
--- a/lld/wasm/InputFiles.h
+++ b/lld/wasm/InputFiles.h
@@ -14,7 +14,6 @@
 #include "llvm/ADT/DenseMap.h"
 #include "llvm/ADT/DenseSet.h"
 #include "llvm/LTO/LTO.h"
-#include "llvm/Object/Archive.h"
 #include "llvm/Object/Wasm.h"
 #include "llvm/Support/MemoryBuffer.h"
 #include "llvm/TargetParser/Triple.h"
@@ -36,6 +35,9 @@ class InputTag;
 class InputTable;
 class InputSection;
 
+// Add symbols in File to the symbol table.
+void parseFile(InputFile *file);
+
 // If --reproduce option is given, all input files are written
 // to this tar archive.
 extern std::unique_ptr<llvm::TarWriter> tar;
@@ -45,7 +47,6 @@ class InputFile {
   enum Kind {
     ObjectKind,
     SharedKind,
-    ArchiveKind,
     BitcodeKind,
     StubKind,
   };
@@ -69,6 +70,10 @@ class InputFile {
   void markLive() { live = true; }
   bool isLive() const { return live; }
 
+  // True if this is a relocatable object file/bitcode file between --start-lib
+  // and --end-lib.
+  bool lazy = false;
+
 protected:
   InputFile(Kind k, MemoryBufferRef m)
       : mb(m), fileKind(k), live(!config->gcSections) {}
@@ -85,35 +90,24 @@ class InputFile {
   bool live;
 };
 
-// .a file (ar archive)
-class ArchiveFile : public InputFile {
-public:
-  explicit ArchiveFile(MemoryBufferRef m) : InputFile(ArchiveKind, m) {}
-  static bool classof(const InputFile *f) { return f->kind() == ArchiveKind; }
-
-  void addMember(const llvm::object::Archive::Symbol *sym);
-
-  void parse();
-
-private:
-  std::unique_ptr<llvm::object::Archive> file;
-  llvm::DenseSet<uint64_t> seen;
-};
-
 // .o file (wasm object file)
 class ObjFile : public InputFile {
 public:
-  explicit ObjFile(MemoryBufferRef m, StringRef archiveName)
+  explicit ObjFile(MemoryBufferRef m, StringRef archiveName, bool lazy = false)
       : InputFile(ObjectKind, m) {
+    this->lazy = lazy;
     this->archiveName = std::string(archiveName);
 
     // If this isn't part of an archive, it's eagerly linked, so mark it live.
     if (archiveName.empty())
       markLive();
+
+    init();
   }
   static bool classof(const InputFile *f) { return f->kind() == ObjectKind; }
 
   void parse(bool ignoreComdats = false);
+  void parseLazy();
 
   // Returns the underlying wasm file.
   const WasmObjectFile *getWasmObj() const { return wasmObj.get(); }
@@ -153,6 +147,7 @@ class ObjFile : public InputFile {
   TableSymbol *getTableSymbol(uint32_t index) const;
 
 private:
+  void init();
   Symbol *createDefined(const WasmSymbol &sym);
   Symbol *createUndefined(const WasmSymbol &sym, bool isCalledDirectly);
 
@@ -173,10 +168,11 @@ class SharedFile : public InputFile {
 class BitcodeFile : public InputFile {
 public:
   BitcodeFile(MemoryBufferRef m, StringRef archiveName,
-              uint64_t offsetInArchive);
+              uint64_t offsetInArchive, bool lazy);
   static bool classof(const InputFile *f) { return f->kind() == BitcodeKind; }
 
   void parse(StringRef symName);
+  void parseLazy();
   std::unique_ptr<llvm::lto::InputFile> obj;
 
   // Set to true once LTO is complete in order prevent further bitcode objects
@@ -196,14 +192,10 @@ class StubFile : public InputFile {
   llvm::DenseMap<StringRef, std::vector<StringRef>> symbolDependencies;
 };
 
-inline bool isBitcode(MemoryBufferRef mb) {
-  return identify_magic(mb.getBuffer()) == llvm::file_magic::bitcode;
-}
-
 // Will report a fatal() error if the input buffer is not a valid bitcode
 // or wasm object file.
 InputFile *createObjectFile(MemoryBufferRef mb, StringRef archiveName = "",
-                            uint64_t offsetInArchive = 0);
+                            uint64_t offsetInArchive = 0, bool lazy = false);
 
 // Opens a given file.
 std::optional<MemoryBufferRef> readFile(StringRef path);
diff --git a/lld/wasm/SymbolTable.cpp b/lld/wasm/SymbolTable.cpp
index 9988490e14b0bc1..bc2685564c8dacb 100644
--- a/lld/wasm/SymbolTable.cpp
+++ b/lld/wasm/SymbolTable.cpp
@@ -24,11 +24,15 @@ namespace lld::wasm {
 SymbolTable *symtab;
 
 void SymbolTable::addFile(InputFile *file, StringRef symName) {
-  log("Processing: " + toString(file));
+  log("Processing " + toString(file) + " (lazy=" + Twine(file->lazy) + " symName=" + symName + ")");
 
-  // .a file
-  if (auto *f = dyn_cast<ArchiveFile>(file)) {
-    f->parse();
+  // Lazy object file
+  if (file->lazy) {
+    if (auto *f = dyn_cast<BitcodeFile>(file)) {
+      f->parseLazy();
+    } else {
+      cast<ObjFile>(file)->parseLazy();
+    }
     return;
   }
 
@@ -528,7 +532,7 @@ Symbol *SymbolTable::addUndefinedFunction(StringRef name,
 
   if (wasInserted) {
     replaceSym();
-  } else if (auto *lazy = dyn_cast<LazySymbol>(s)) {
+  } else if (auto *lazy = dyn_cast<LazyObject>(s)) {
     if ((flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK) {
       lazy->setWeak();
       lazy->signature = sig;
@@ -581,7 +585,7 @@ Symbol *SymbolTable::addUndefinedData(StringRef name, uint32_t flags,
 
   if (wasInserted) {
     replaceSymbol<UndefinedData>(s, name, flags, file);
-  } else if (auto *lazy = dyn_cast<LazySymbol>(s)) {
+  } else if (auto *lazy = dyn_cast<LazyObject>(s)) {
     if ((flags & WASM_SYMBOL_BINDING_MASK) == WASM_SYMBOL_BINDING_WEAK)
       lazy->setWeak();
     else
@@ -611,7 +615,7 @@ Symbol *SymbolTable::addUndefinedGlobal(StringRef name,
   if (wasInserted)
     replaceSymbol<UndefinedGlobal>(s, name, importName, importModule, flags,
                                    file, type);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  else if (auto *lazy = dyn_cast<LazyObject>(s))
     lazy->extract();
   else if (s->isDefined())
     checkGlobalType(s, file, type);
@@ -637,7 +641,7 @@ Symbol *SymbolTable::addUndefinedTable(StringRef name,
   if (wasInserted)
     replaceSymbol<UndefinedTable>(s, name, importName, importModule, flags,
                                   file, type);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  else if (auto *lazy = dyn_cast<LazyObject>(s))
     lazy->extract();
   else if (s->isDefined())
     checkTableType(s, file, type);
@@ -663,7 +667,7 @@ Symbol *SymbolTable::addUndefinedTag(StringRef name,
   if (wasInserted)
     replaceSymbol<UndefinedTag>(s, name, importName, importModule, flags, file,
                                 sig);
-  else if (auto *lazy = dyn_cast<LazySymbol>(s))
+  else if (auto *lazy = dyn_cast<LazyObject>(s))
     lazy->extract();
   else if (s->isDefined())
     checkTagType(s, file, sig);
@@ -737,16 +741,15 @@ TableSymbol *SymbolTable::resolveIndirectFunctionTable(bool required) {
   return nullptr;
 }
 
-void SymbolTable::addLazy(ArchiveFile *file, const Archive::Symbol *sym) {
-  LLVM_DEBUG(dbgs() << "addLazy: " << sym->getName() << "\n");
-  StringRef name = sym->getName();
+void SymbolTable::addLazy(StringRef name, InputFile *file) {
+  LLVM_DEBUG(dbgs() << "addLazy: " << name << "\n");
 
   Symbol *s;
   bool wasInserted;
   std::tie(s, wasInserted) = insertName(name);
 
   if (wasInserted) {
-    replaceSymbol<LazySymbol>(s, name, 0, file, *sym);
+    replaceSymbol<LazyObject>(s, name, 0, file);
     return;
   }
 
@@ -755,7 +758,7 @@ void SymbolTable::addLazy(ArchiveFile *file, const Archive::Symbol *sym) {
 
   // The existing symbol is undefined, load a new one from the archive,
   // unless the existing symbol is weak in which case replace the undefined
-  // symbols with a LazySymbol.
+  // symbols with a LazyObject.
   if (s->isWeak()) {
     const WasmSignature *oldSig = nullptr;
     // In the case of an UndefinedFunction we need to preserve the expected
@@ -763,15 +766,14 @@ void SymbolTable::addLazy(ArchiveFile *file, const Archive::Symbol *sym) {
     if (auto *f = dyn_cast<UndefinedFunction>(s))
       oldSig = f->signature;
     LLVM_DEBUG(dbgs() << "replacing existing weak undefined symbol\n");
-    auto newSym = replaceSymbol<LazySymbol>(s, name, WASM_SYMBOL_BINDING_WEAK,
-                                            file, *sym);
+    auto newSym = replaceSymbol<LazyObject>(s, name, WASM_SYMBOL_BINDING_WEAK, file);
     newSym->signature = oldSig;
     return;
   }
 
   LLVM_DEBUG(dbgs() << "replacing existing undefined\n");
   const InputFile *oldFile = s->getFile();
-  file->addMember(sym);
+  replaceSymbol<LazyObject>(s, name, 0, file)->extract();
   if (!config->whyExtract.empty())
     ctx.whyExtractRecords.emplace_back(toString(oldFile), s->getFile(), *s);
 }
diff --git a/lld/wasm/SymbolTable.h b/lld/wasm/SymbolTable.h
index c5518ee23da26da..42ebb8be8eb3f8e 100644
--- a/lld/wasm/SymbolTable.h
+++ b/lld/wasm/SymbolTable.h
@@ -83,7 +83,7 @@ class SymbolTable {
 
   TableSymbol *resolveIndirectFunctionTable(bool required);
 
-  void addLazy(Ar...
[truncated]

@sbc100 sbc100 requested a review from MaskRay January 19, 2024 02:01
Copy link

github-actions bot commented Jan 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@sbc100 sbc100 force-pushed the ignore_archive_indexes branch 2 times, most recently from 5f9776e to 32d9c12 Compare January 19, 2024 02:11
sbc100 added a commit to sbc100/llvm-project that referenced this pull request Jan 19, 2024
I noticed this while working on llvm#78658
Copy link
Member

@dschuff dschuff left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a bigger change than just the allowing index-less archives (presumably the same code/logic from ELF); can you update the commit message to include the other changes too?

lld/wasm/Driver.cpp Outdated Show resolved Hide resolved
lazySym->extract();
}
}
if (sym)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I prefer the original's early-return style over having the entire logic in a then-clause (especially non-braced, that seems error-prone). Is there a particular reason for that change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about now.. I refactored to match #78659 which seems clearly better I hope?

lld/wasm/Symbols.h Outdated Show resolved Hide resolved
@MaskRay
Copy link
Member

MaskRay commented Jan 19, 2024

The ELF linker transitioned away from archive indexes in reviews.llvm.org/D117284.

This patch also removes ArchiveFile. The ELF reference probably should be https://reviews.llvm.org/D119074 ([ELF] Parse archives as --start-lib object files).

sbc100 added a commit that referenced this pull request Jan 19, 2024
@sbc100 sbc100 force-pushed the ignore_archive_indexes branch 7 times, most recently from b073693 to c959c60 Compare January 19, 2024 23:03
@sbc100 sbc100 requested a review from dschuff January 19, 2024 23:04
@sbc100
Copy link
Collaborator Author

sbc100 commented Jan 19, 2024

The ELF linker transitioned away from archive indexes in reviews.llvm.org/D117284.

This patch also removes ArchiveFile. The ELF reference probably should be https://reviews.llvm.org/D119074 ([ELF] Parse archives as --start-lib object files).

Done

lld/wasm/Symbols.h Outdated Show resolved Hide resolved
@sbc100 sbc100 force-pushed the ignore_archive_indexes branch 3 times, most recently from 3131757 to 2e00302 Compare January 19, 2024 23:48
…chive indexes.

The ELF linker transitioned away from archive indexes in
https://reviews.llvm.org/D117284.

This paves the way for supporting `--start-lib`/`--end-lib` (See llvm#77960)

The ELF linker unified library handling with `--start-lib`/`--end-lib`
handling in https://reviews.llvm.org/D119074.
@sbc100
Copy link
Collaborator Author

sbc100 commented Jan 30, 2024

We got a report of this issue in emscripten too: emscripten-core/emscripten#21212

@alexcrichton
Copy link
Contributor

I don't know about the ELF side of things, but yes --fatal-warnings has now been removed from wasm targets

@sbc100
Copy link
Collaborator Author

sbc100 commented Jan 30, 2024

I just did some investigation and found out why this is not and issue for the ELF rust toolchain: Their rlib.meta files are valid ELF objects:

$ objdump -h lib.rmeta 

lib.rmeta:	file format elf64-x86-64

Sections:
Idx Name               Size     VMA              Type
  0                    00000000 0000000000000000 
  1 .note.gnu.property 00000020 0000000000000000 
  2 .rmeta             00546bfa 0000000000000000 
  3 .symtab            00000018 0000000000000000 
  4 .strtab            00000001 0000000000000000 
  5 .shstrtab          00000035 0000000000000000 

If we want to fix this properly I guess we would need to do that same for wasm.. but the removal of --fatal-warnings should work too. The downside is that since (IIUC) rustc hides the linker stderr by default that means rust users will no longer see the signature mismatch warnings that wasm-ld produces. Not a huge deal perhaps..

@alexcrichton
Copy link
Contributor

Aha it appears my memory has definitely failed me! IIRC we didn't use the wasm object file format because we couldn't figure out at the time how to encode the metadata somewhere such that wasm-ld wouldn't include it in the output. In other backends flags are set such as SHF_EXCLUDE to keep the .rmeta section out of the final output but we couldn't figure out the equivalent for wasm.

Do you know of a way that metadata could be placed into a wasm-looking object file in a way that doesn't end up in the final output?

@sbc100
Copy link
Collaborator Author

sbc100 commented Jan 30, 2024

Aha it appears my memory has definitely failed me! IIRC we didn't use the wasm object file format because we couldn't figure out at the time how to encode the metadata somewhere such that wasm-ld wouldn't include it in the output. In other backends flags are set such as SHF_EXCLUDE to keep the .rmeta section out of the final output but we couldn't figure out the equivalent for wasm.

Do you know of a way that metadata could be placed into a wasm-looking object file in a way that doesn't end up in the final output?

Since these object would never be referenced via relocations they should never be included at all, right?

I suppose the exception would be if --whole-archive was used. If you are worried about that case then I think we should be able to just use custom sections (IIRC the default behaviour is to ignore custom sections.. but I would need to check).

@alexcrichton
Copy link
Contributor

Last I tested even unferenced files with custom sections still had their custom sections included in the output. Testing now though that appears to at least no longer be the case so putting the metadata in a custom section would work, except for --whole-archive like you mention. However I don't believe rustc ever uses --whole-archive for rlibs, only other user-provided things, so I believe that using custom sections for this should suffice well. I'll work on getting that into rustc!

@sbc100
Copy link
Collaborator Author

sbc100 commented Feb 2, 2024

Thanks alex!

Just to confirm, when a file is referenced (i.e. pulled in from the archive) the custom section is included? (IIRC maybe it just defaults to concatenation?)

@dschuff
Copy link
Member

dschuff commented Feb 2, 2024

Yes, this is definitely the case, because it's also how debug info works.

@MaskRay
Copy link
Member

MaskRay commented Feb 2, 2024

Thanks for removing --fatal-warnings. I agree that switching to an object file for wasm as well is the right move.

The ELF port warning about a non-object file is related to: traditional linkers ignore these files, and these files if textual, cannot be differentiated with linker scripts. A warning informs the user that the text file is not interpreted as a linker script. Non-object-files as archive members are weird anyway.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 2, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. To avoid bringing
in any new dependencies I've opted to hand-code this encoding at this
time. If the object gets more complicated though it'd probably be best
to pull in `wasmparser` and `wasm-encoder`. For now though there's two
adjacent functions reading/writing wasm.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
@alexcrichton
Copy link
Contributor

when a file is referenced (i.e. pulled in from the archive) the custom section is included?

Can confirm what @dschuff said as well. We rely on that behavior as well for the component model to smuggle component type information at the Rust source level through LLVM out to the final binary to get slurped up by the component tooling. (that's totally unrelated to everything said on this so far but if you're curious that's a case we found for it)

I've got a PR for rustc now -- rust-lang/rust#120588 -- to use the wasm object file format for rustc's metadata. It also needed an empty linking section to appease wasm-ld, but I've confirmed that when using a tip-of-tree wasm-ld that no warnings are emitted and no compiler-generated-metadata makes its way into the final artifact.

alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 9, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. To avoid bringing
in any new dependencies I've opted to hand-code this encoding at this
time. If the object gets more complicated though it'd probably be best
to pull in `wasmparser` and `wasm-encoder`. For now though there's two
adjacent functions reading/writing wasm.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 9, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 9, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 9, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 10, 2024
…wesleywiser

wasm: Store rlib metadata in wasm object files

The goal of this commit is to remove warnings using LLVM tip-of-tree `wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer looks at archive indices and instead looks at all the objects in archives. Previously `lib.rmeta` files were simply raw rustc metadata bytes, not wasm objects, meaning that `wasm-ld` would emit a warning indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by default which meant that if Rust were to update to LLVM 18 then all wasm targets would not work. This immediate blocker was resolved in rust-lang#120278 which removed `--fatal-warnings` which enabled a theoretical update to LLVM 18 for wasm targets. This current state is ok-enough for now because rustc squashes all linker output by default if it doesn't fail. This means, for example, that rustc squashes all the linker warnings coming out of `wasm-ld` about `lib.rmeta` files with LLVM 18. This again isn't a pressing issue because the information is all hidden, but it runs the risk of being annoying if another linker error were to happen and then the output would have all these unrelated warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve these warnings by using the WebAssembly object file format on wasm targets instead of using raw rustc metadata. When I first implemented the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that `wasm-ld` would either include the metadata in the output or I thought we didn't have to do anything there at all. I think I was wrong on both counts as `wasm-ld` does not include the metadata in the final output unless the object is referenced and we do actually need to do something to resolve these warnings.

This PR updates the object file format containing rustc metadata on WebAssembly targets to be an actual WebAssembly file. To avoid bringing in any new dependencies I've opted to hand-code this encoding at this time. If the object gets more complicated though it'd probably be best to pull in `wasmparser` and `wasm-encoder`. For now though there's two adjacent functions reading/writing wasm.

The only caveat I know of with this is that if `wasm-ld` does indeed look at the object file then the metadata will be included in the final output. I believe the only thing that could cause that at this time is `--whole-archive` which I don't think is passed for rlibs. I would clarify that I'm not 100% certain about this, however.
matthiaskrgr added a commit to matthiaskrgr/rust that referenced this pull request Feb 10, 2024
…wesleywiser

wasm: Store rlib metadata in wasm object files

The goal of this commit is to remove warnings using LLVM tip-of-tree `wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer looks at archive indices and instead looks at all the objects in archives. Previously `lib.rmeta` files were simply raw rustc metadata bytes, not wasm objects, meaning that `wasm-ld` would emit a warning indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by default which meant that if Rust were to update to LLVM 18 then all wasm targets would not work. This immediate blocker was resolved in rust-lang#120278 which removed `--fatal-warnings` which enabled a theoretical update to LLVM 18 for wasm targets. This current state is ok-enough for now because rustc squashes all linker output by default if it doesn't fail. This means, for example, that rustc squashes all the linker warnings coming out of `wasm-ld` about `lib.rmeta` files with LLVM 18. This again isn't a pressing issue because the information is all hidden, but it runs the risk of being annoying if another linker error were to happen and then the output would have all these unrelated warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve these warnings by using the WebAssembly object file format on wasm targets instead of using raw rustc metadata. When I first implemented the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that `wasm-ld` would either include the metadata in the output or I thought we didn't have to do anything there at all. I think I was wrong on both counts as `wasm-ld` does not include the metadata in the final output unless the object is referenced and we do actually need to do something to resolve these warnings.

This PR updates the object file format containing rustc metadata on WebAssembly targets to be an actual WebAssembly file. To avoid bringing in any new dependencies I've opted to hand-code this encoding at this time. If the object gets more complicated though it'd probably be best to pull in `wasmparser` and `wasm-encoder`. For now though there's two adjacent functions reading/writing wasm.

The only caveat I know of with this is that if `wasm-ld` does indeed look at the object file then the metadata will be included in the final output. I believe the only thing that could cause that at this time is `--whole-archive` which I don't think is passed for rlibs. I would clarify that I'm not 100% certain about this, however.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 12, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
alexcrichton added a commit to alexcrichton/rust that referenced this pull request Feb 20, 2024
The goal of this commit is to remove warnings using LLVM tip-of-tree
`wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer
looks at archive indices and instead looks at all the objects in
archives. Previously `lib.rmeta` files were simply raw rustc metadata
bytes, not wasm objects, meaning that `wasm-ld` would emit a warning
indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by
default which meant that if Rust were to update to LLVM 18 then all wasm
targets would not work. This immediate blocker was resolved in
rust-lang#120278 which removed `--fatal-warnings` which enabled a
theoretical update to LLVM 18 for wasm targets. This current state is
ok-enough for now because rustc squashes all linker output by default if
it doesn't fail. This means, for example, that rustc squashes all the
linker warnings coming out of `wasm-ld` about `lib.rmeta` files with
LLVM 18. This again isn't a pressing issue because the information is
all hidden, but it runs the risk of being annoying if another linker
error were to happen and then the output would have all these unrelated
warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve
these warnings by using the WebAssembly object file format on wasm
targets instead of using raw rustc metadata. When I first implemented
the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that
`wasm-ld` would either include the metadata in the output or I thought
we didn't have to do anything there at all. I think I was wrong on both
counts as `wasm-ld` does not include the metadata in the final output
unless the object is referenced and we do actually need to do something
to resolve these warnings.

This PR updates the object file format containing rustc metadata on
WebAssembly targets to be an actual WebAssembly file. This enables the
`wasm` feature of the `object` crate to be able to read the custom
section in the same manner as other platforms, but currently `object`
doesn't support writing wasm object files so a handwritten encoder is
used instead.

The only caveat I know of with this is that if `wasm-ld` does indeed
look at the object file then the metadata will be included in the final
output. I believe the only thing that could cause that at this time is
`--whole-archive` which I don't think is passed for rlibs. I would
clarify that I'm not 100% certain about this, however.
bors added a commit to rust-lang-ci/rust that referenced this pull request Feb 21, 2024
…sleywiser,bjorn3

wasm: Store rlib metadata in wasm object files

The goal of this commit is to remove warnings using LLVM tip-of-tree `wasm-ld`. In llvm/llvm-project#78658 the `wasm-ld` LLD driver no longer looks at archive indices and instead looks at all the objects in archives. Previously `lib.rmeta` files were simply raw rustc metadata bytes, not wasm objects, meaning that `wasm-ld` would emit a warning indicating so.

WebAssembly targets previously passed `--fatal-warnings` to `wasm-ld` by default which meant that if Rust were to update to LLVM 18 then all wasm targets would not work. This immediate blocker was resolved in rust-lang#120278 which removed `--fatal-warnings` which enabled a theoretical update to LLVM 18 for wasm targets. This current state is ok-enough for now because rustc squashes all linker output by default if it doesn't fail. This means, for example, that rustc squashes all the linker warnings coming out of `wasm-ld` about `lib.rmeta` files with LLVM 18. This again isn't a pressing issue because the information is all hidden, but it runs the risk of being annoying if another linker error were to happen and then the output would have all these unrelated warnings that couldn't be fixed.

Thus, this PR comes into the picture. The goal of this PR is to resolve these warnings by using the WebAssembly object file format on wasm targets instead of using raw rustc metadata. When I first implemented the rlib-in-objects scheme in rust-lang#84449 I remember either concluding that `wasm-ld` would either include the metadata in the output or I thought we didn't have to do anything there at all. I think I was wrong on both counts as `wasm-ld` does not include the metadata in the final output unless the object is referenced and we do actually need to do something to resolve these warnings.

This PR updates the object file format containing rustc metadata on WebAssembly targets to be an actual WebAssembly file. To avoid bringing in any new dependencies I've opted to hand-code this encoding at this time. If the object gets more complicated though it'd probably be best to pull in `wasmparser` and `wasm-encoder`. For now though there's two adjacent functions reading/writing wasm.

The only caveat I know of with this is that if `wasm-ld` does indeed look at the object file then the metadata will be included in the final output. I believe the only thing that could cause that at this time is `--whole-archive` which I don't think is passed for rlibs. I would clarify that I'm not 100% certain about this, however.
sbc100 added a commit to sbc100/llvm-project that referenced this pull request Aug 19, 2024
This was broken back in llvm#78658 when we transitioned away from archive
indexes to parsing lazy object files.
sbc100 added a commit to sbc100/llvm-project that referenced this pull request Aug 19, 2024
This was broken back in llvm#78658 when we transitioned away from archive
indexes to parsing lazy object files.

Fixes: llvm#94077
Fixes: emscripten-core/emscripten#22008
sbc100 added a commit to sbc100/llvm-project that referenced this pull request Aug 19, 2024
This was broken back in llvm#78658 when we transitioned away from archive
indexes to parsing lazy object files.

Fixes: llvm#94077
Fixes: emscripten-core/emscripten#22008
sbc100 added a commit that referenced this pull request Aug 20, 2024
…s. (#104876)

This was broken back in #78658 when we transitioned away from archive
indexes to parsing lazy object files.

Fixes: #94077
Fixes: emscripten-core/emscripten#22008
radekdoulik pushed a commit to dotnet/llvm-project that referenced this pull request Nov 15, 2024
…s. (llvm#104876)

This was broken back in llvm#78658 when we transitioned away from archive
indexes to parsing lazy object files.

Fixes: llvm#94077
Fixes: emscripten-core/emscripten#22008
radekdoulik pushed a commit to radekdoulik/llvm-project that referenced this pull request Nov 22, 2024
…s. (llvm#104876)

This was broken back in llvm#78658 when we transitioned away from archive
indexes to parsing lazy object files.

Fixes: llvm#94077
Fixes: emscripten-core/emscripten#22008
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants