[lld][MachO] Tail merge strings #161262

ellishg · 2025-09-29T19:22:50Z

Add the flag --tail-merge-strings to enable tail merging of cstrings. For example, if we have strings mystring\0 and ring\0, we could place mystring\0 at address 0x1000 and ring\0 at address 0x1004 and have them share the same underlying data.

It turns out that many ObjC method names can be tail merged. For example, error: and doFoo:error:. On a large iOS binary, we saw nearly a 15% size improvement in the __TEXT__objc_methname section and negligible impact on link time.

$ bloaty --domain=vm merged.o.stripped -- base.o.stripped
     VM SIZE
 --------------
   +95% +5.85Ki    [__TEXT]
  -2.4%  -239Ki    __TEXT,__cstring
 -14.5%  -710Ki    __TEXT,__objc_methname
  -1.0%  -944Ki    TOTAL

Tail merging for MachO was originally removed in 7c269db. The previous implementation used StringTableBuilder, but that was removed in 4308f03 to ensure deduplicated strings are aligned correctly. This implementation ensures that tail merged strings are also aligned correctly.

Special thanks to nocchijiang for pointing this out in #158720 (comment).

Depends on #161253.

llvmbot · 2025-09-29T23:12:22Z

@llvm/pr-subscribers-lld-macho

Author: Ellis Hoag (ellishg)

Changes

Tail merge cstrings. For example, if we have strings mystring\0 and ring\0, we could place mystring\0 at address 0x1000 and ring\0 at address 0x1004 and have them share the same underlying data.

It turns out that many ObjC method names can be tail merged. For example, error: and doFoo:error:. On a large iOS binary, we saw nearly a 15% size improvement in the __TEXT__objc_methname section and negligible impact on link time.

$ bloaty --domain=vm merged.o.stripped -- base.o.stripped
     VM SIZE
 --------------
   +95% +5.85Ki    [__TEXT]
  -2.4%  -239Ki    __TEXT,__cstring
 -14.5%  -710Ki    __TEXT,__objc_methname
  -1.0%  -944Ki    TOTAL

Tail merging for MachO was originally removed in 7c269db. The previous implementation used StringTableBuilder, but that was removed in 4308f03 to ensure deduplicated strings are aligned correctly. This implementation ensures that tail merged strings are also aligned correctly.

Special thanks to nocchijiang for pointing this out in #158720 (comment).

Depends on #161253.

Full diff: https://github.com/llvm/llvm-project/pull/161262.diff

4 Files Affected:

(modified) lld/MachO/SyntheticSections.cpp (+58-1)
(modified) lld/docs/ReleaseNotes.rst (+3)
(modified) lld/test/MachO/cstring-dedup.s (+1-2)
(added) lld/test/MachO/cstring-tailmerge.s (+85)

diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
index 903ba78a27c75..edb90fe7fcbcc 100644
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -1746,6 +1746,7 @@ void CStringSection::finalizeContents() {
 void DeduplicatedCStringSection::finalizeContents() {
   // Find the largest alignment required for each string.
   DenseMap<CachedHashStringRef, Align> strToAlignment;
+  std::vector<CachedHashStringRef> deduplicatedStrs;
   for (const CStringInputSection *isec : inputs) {
     for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
       if (!piece.live)
@@ -1754,17 +1755,57 @@ void DeduplicatedCStringSection::finalizeContents() {
       assert(isec->align != 0);
       auto align = getStringPieceAlignment(isec, piece);
       auto [it, wasInserted] = strToAlignment.try_emplace(s, align);
+      if (wasInserted)
+        deduplicatedStrs.push_back(s);
       if (!wasInserted && it->second < align)
         it->second = align;
     }
   }
 
+  // Like lexigraphical sort, except we read strings in reverse and take the
+  // longest string first
+  // TODO: We could improve performance by implementing our own sort that avoids
+  // comparing characters we know to be the same. See
+  // StringTableBuilder::multikeySort() for details
+  llvm::sort(deduplicatedStrs, [](const auto &left, const auto &right) {
+    for (const auto &[leftChar, rightChar] :
+         llvm::zip(llvm::reverse(left.val()), llvm::reverse(right.val()))) {
+      if (leftChar == rightChar)
+        continue;
+      return leftChar < rightChar;
+    }
+    return left.size() > right.size();
+  });
+  std::optional<CachedHashStringRef> mergeCandidate;
+  DenseMap<CachedHashStringRef, std::pair<CachedHashStringRef, uint64_t>>
+      tailMergeMap;
+  for (auto &s : deduplicatedStrs) {
+    if (!mergeCandidate || !mergeCandidate->val().ends_with(s.val())) {
+      mergeCandidate = s;
+      continue;
+    }
+    uint64_t tailOffset = mergeCandidate->size() - s.size();
+    // TODO: If the tail offset is incompatible with this string's alignment, we
+    // might be able to find another superstring with a compatible tail offset.
+    // The difficulty is how to do this efficiently
+    const auto &align = strToAlignment.at(s);
+    if (!isAligned(align, tailOffset))
+      continue;
+    auto &mergeCandidateAlign = strToAlignment[*mergeCandidate];
+    if (align > mergeCandidateAlign)
+      mergeCandidateAlign = align;
+    tailMergeMap.try_emplace(s, *mergeCandidate, tailOffset);
+  }
+
   // Sort the strings for performance and compression size win, and then
   // assign an offset for each string and save it to the corresponding
   // StringPieces for easy access.
   for (auto &[isec, i] : priorityBuilder.buildCStringPriorities(inputs)) {
     auto &piece = isec->pieces[i];
     auto s = isec->getCachedHashStringRef(i);
+    // Skip tail merged strings until their superstring offsets are resolved
+    if (tailMergeMap.count(s))
+      continue;
     auto [it, wasInserted] = stringOffsetMap.try_emplace(s, /*placeholder*/ 0);
     if (wasInserted) {
       // Avoid computing the offset until we are sure we will need to
@@ -1776,8 +1817,24 @@ void DeduplicatedCStringSection::finalizeContents() {
     // only need to assign the offset.
     piece.outSecOff = it->second;
   }
-  for (CStringInputSection *isec : inputs)
+  for (CStringInputSection *isec : inputs) {
+    for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
+      if (!piece.live)
+        continue;
+      auto s = isec->getCachedHashStringRef(i);
+      auto it = tailMergeMap.find(s);
+      if (it == tailMergeMap.end())
+        continue;
+      const auto &[superString, tailOffset] = it->second;
+      assert(superString.val().ends_with(s.val()));
+      assert(!tailMergeMap.count(superString));
+      auto &outSecOff = stringOffsetMap[s];
+      outSecOff = stringOffsetMap.at(superString) + tailOffset;
+      piece.outSecOff = outSecOff;
+      assert(isAligned(strToAlignment.at(s), piece.outSecOff));
+    }
     isec->isFinal = true;
+  }
 }
 
 void DeduplicatedCStringSection::writeTo(uint8_t *buf) const {
diff --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 6ea1ea0fd6c2f..fa2247d64b690 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -44,6 +44,9 @@ MinGW Improvements
 MachO Improvements
 ------------------
 
+* cstrings sections are now tail merged
+  (`#161262 <https://github.com/llvm/llvm-project/pull/161262>`_)
+
 WebAssembly Improvements
 ------------------------
 
diff --git a/lld/test/MachO/cstring-dedup.s b/lld/test/MachO/cstring-dedup.s
index a4b15f26afff0..0a42b3d6fcff3 100644
--- a/lld/test/MachO/cstring-dedup.s
+++ b/lld/test/MachO/cstring-dedup.s
@@ -8,11 +8,10 @@
 # RUN: llvm-objdump --macho --section="__DATA,ptrs" --syms %t/test | FileCheck %s
 # RUN: llvm-readobj --section-headers %t/test | FileCheck %s --check-prefix=HEADER
 
-## Make sure we only have 3 deduplicated strings in __cstring.
+## Make sure we only have 2 deduplicated strings in __cstring.
 # STR: Contents of (__TEXT,__cstring) section
 # STR: {{[[:xdigit:]]+}} foo
 # STR: {{[[:xdigit:]]+}} barbaz
-# STR: {{[[:xdigit:]]+}} {{$}}
 
 ## Make sure both symbol and section relocations point to the right thing.
 # CHECK:      Contents of (__DATA,ptrs) section
diff --git a/lld/test/MachO/cstring-tailmerge.s b/lld/test/MachO/cstring-tailmerge.s
new file mode 100644
index 0000000000000..83d2810a78139
--- /dev/null
+++ b/lld/test/MachO/cstring-tailmerge.s
@@ -0,0 +1,85 @@
+# REQUIRES: x86
+# RUN: rm -rf %t; split-file %s %t
+
+# RUN: sed "s/<ALIGN>/0/g" %t/align.s.template > %t/align-1.s
+# RUN: sed "s/<ALIGN>/1/g" %t/align.s.template > %t/align-2.s
+# RUN: sed "s/<ALIGN>/2/g" %t/align.s.template > %t/align-4.s
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/first.s -o %t/first.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-1.s -o %t/align-1.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-2.s -o %t/align-2.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-4.s -o %t/align-4.o
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-1.o -o %t/align-1
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-1 | FileCheck %s --check-prefixes=CHECK,ALIGN1
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-2.o -o %t/align-2
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-2 | FileCheck %s --check-prefixes=CHECK,ALIGN2
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-4.o -o %t/align-4
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-4 | FileCheck %s --check-prefixes=CHECK,ALIGN4
+
+# CHECK: Contents of (__TEXT,__cstring) section
+# CHECK: [[#%.16x,START:]] get awkward offset{{$}}
+
+# ALIGN1: [[#%.16x,START+19]] myotherlongstr{{$}}
+# ALIGN1: [[#%.16x,START+19+15]] otherstr{{$}}
+
+# ALIGN2: [[#%.16x,START+20]] myotherlongstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16]] longstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16+8]] otherstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16+8+10]] str{{$}}
+
+# ALIGN4: [[#%.16x,START+20]] myotherlongstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16]] otherlongstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16]] longstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16+8]] otherstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16+8+12]] str{{$}}
+
+# CHECK: SYMBOL TABLE:
+
+# ALIGN1: [[#%.16x,START+19]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN1: [[#%.16x,START+21]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN1: [[#%.16x,START+26]] l     O __TEXT,__cstring _longstr
+# ALIGN1: [[#%.16x,START+34]] l     O __TEXT,__cstring _otherstr
+# ALIGN1: [[#%.16x,START+39]] l     O __TEXT,__cstring _str
+
+# ALIGN2: [[#%.16x,START+20]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN2: [[#%.16x,START+20+2]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN2: [[#%.16x,START+20+16]] l     O __TEXT,__cstring _longstr
+# ALIGN2: [[#%.16x,START+20+16+8]] l     O __TEXT,__cstring _otherstr
+# ALIGN2: [[#%.16x,START+20+16+8+10]] l     O __TEXT,__cstring _str
+
+# ALIGN4: [[#%.16x,START+20]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN4: [[#%.16x,START+20+16]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN4: [[#%.16x,START+20+16+16]] l     O __TEXT,__cstring _longstr
+# ALIGN4: [[#%.16x,START+20+16+16+8]] l     O __TEXT,__cstring _otherstr
+# ALIGN4: [[#%.16x,START+20+16+16+8+12]] l     O __TEXT,__cstring _str
+
+#--- first.s
+.cstring
+.p2align 2
+.asciz "get awkward offset"  # length = 19
+
+#--- align.s.template
+.cstring
+
+.p2align <ALIGN>
+  _myotherlongstr:
+.asciz "myotherlongstr"      # length = 15
+
+.p2align <ALIGN>
+  _otherlongstr:
+.asciz   "otherlongstr"      # length = 13, tail offset = 2
+
+.p2align <ALIGN>
+  _longstr:
+.asciz        "longstr"      # length = 8, tail offset = 7
+
+.p2align <ALIGN>
+  _otherstr:
+.asciz       "otherstr"      # length = 9
+
+.p2align <ALIGN>
+  _str:
+.asciz            "str"      # length = 4, tail offset = 5

llvmbot · 2025-09-29T23:12:23Z

@llvm/pr-subscribers-lld

Author: Ellis Hoag (ellishg)

Changes

Tail merge cstrings. For example, if we have strings mystring\0 and ring\0, we could place mystring\0 at address 0x1000 and ring\0 at address 0x1004 and have them share the same underlying data.

It turns out that many ObjC method names can be tail merged. For example, error: and doFoo:error:. On a large iOS binary, we saw nearly a 15% size improvement in the __TEXT__objc_methname section and negligible impact on link time.

$ bloaty --domain=vm merged.o.stripped -- base.o.stripped
     VM SIZE
 --------------
   +95% +5.85Ki    [__TEXT]
  -2.4%  -239Ki    __TEXT,__cstring
 -14.5%  -710Ki    __TEXT,__objc_methname
  -1.0%  -944Ki    TOTAL

Tail merging for MachO was originally removed in 7c269db. The previous implementation used StringTableBuilder, but that was removed in 4308f03 to ensure deduplicated strings are aligned correctly. This implementation ensures that tail merged strings are also aligned correctly.

Special thanks to nocchijiang for pointing this out in #158720 (comment).

Depends on #161253.

Full diff: https://github.com/llvm/llvm-project/pull/161262.diff

4 Files Affected:

(modified) lld/MachO/SyntheticSections.cpp (+58-1)
(modified) lld/docs/ReleaseNotes.rst (+3)
(modified) lld/test/MachO/cstring-dedup.s (+1-2)
(added) lld/test/MachO/cstring-tailmerge.s (+85)

diff --git a/lld/MachO/SyntheticSections.cpp b/lld/MachO/SyntheticSections.cpp
index 903ba78a27c75..edb90fe7fcbcc 100644
--- a/lld/MachO/SyntheticSections.cpp
+++ b/lld/MachO/SyntheticSections.cpp
@@ -1746,6 +1746,7 @@ void CStringSection::finalizeContents() {
 void DeduplicatedCStringSection::finalizeContents() {
   // Find the largest alignment required for each string.
   DenseMap<CachedHashStringRef, Align> strToAlignment;
+  std::vector<CachedHashStringRef> deduplicatedStrs;
   for (const CStringInputSection *isec : inputs) {
     for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
       if (!piece.live)
@@ -1754,17 +1755,57 @@ void DeduplicatedCStringSection::finalizeContents() {
       assert(isec->align != 0);
       auto align = getStringPieceAlignment(isec, piece);
       auto [it, wasInserted] = strToAlignment.try_emplace(s, align);
+      if (wasInserted)
+        deduplicatedStrs.push_back(s);
       if (!wasInserted && it->second < align)
         it->second = align;
     }
   }
 
+  // Like lexigraphical sort, except we read strings in reverse and take the
+  // longest string first
+  // TODO: We could improve performance by implementing our own sort that avoids
+  // comparing characters we know to be the same. See
+  // StringTableBuilder::multikeySort() for details
+  llvm::sort(deduplicatedStrs, [](const auto &left, const auto &right) {
+    for (const auto &[leftChar, rightChar] :
+         llvm::zip(llvm::reverse(left.val()), llvm::reverse(right.val()))) {
+      if (leftChar == rightChar)
+        continue;
+      return leftChar < rightChar;
+    }
+    return left.size() > right.size();
+  });
+  std::optional<CachedHashStringRef> mergeCandidate;
+  DenseMap<CachedHashStringRef, std::pair<CachedHashStringRef, uint64_t>>
+      tailMergeMap;
+  for (auto &s : deduplicatedStrs) {
+    if (!mergeCandidate || !mergeCandidate->val().ends_with(s.val())) {
+      mergeCandidate = s;
+      continue;
+    }
+    uint64_t tailOffset = mergeCandidate->size() - s.size();
+    // TODO: If the tail offset is incompatible with this string's alignment, we
+    // might be able to find another superstring with a compatible tail offset.
+    // The difficulty is how to do this efficiently
+    const auto &align = strToAlignment.at(s);
+    if (!isAligned(align, tailOffset))
+      continue;
+    auto &mergeCandidateAlign = strToAlignment[*mergeCandidate];
+    if (align > mergeCandidateAlign)
+      mergeCandidateAlign = align;
+    tailMergeMap.try_emplace(s, *mergeCandidate, tailOffset);
+  }
+
   // Sort the strings for performance and compression size win, and then
   // assign an offset for each string and save it to the corresponding
   // StringPieces for easy access.
   for (auto &[isec, i] : priorityBuilder.buildCStringPriorities(inputs)) {
     auto &piece = isec->pieces[i];
     auto s = isec->getCachedHashStringRef(i);
+    // Skip tail merged strings until their superstring offsets are resolved
+    if (tailMergeMap.count(s))
+      continue;
     auto [it, wasInserted] = stringOffsetMap.try_emplace(s, /*placeholder*/ 0);
     if (wasInserted) {
       // Avoid computing the offset until we are sure we will need to
@@ -1776,8 +1817,24 @@ void DeduplicatedCStringSection::finalizeContents() {
     // only need to assign the offset.
     piece.outSecOff = it->second;
   }
-  for (CStringInputSection *isec : inputs)
+  for (CStringInputSection *isec : inputs) {
+    for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
+      if (!piece.live)
+        continue;
+      auto s = isec->getCachedHashStringRef(i);
+      auto it = tailMergeMap.find(s);
+      if (it == tailMergeMap.end())
+        continue;
+      const auto &[superString, tailOffset] = it->second;
+      assert(superString.val().ends_with(s.val()));
+      assert(!tailMergeMap.count(superString));
+      auto &outSecOff = stringOffsetMap[s];
+      outSecOff = stringOffsetMap.at(superString) + tailOffset;
+      piece.outSecOff = outSecOff;
+      assert(isAligned(strToAlignment.at(s), piece.outSecOff));
+    }
     isec->isFinal = true;
+  }
 }
 
 void DeduplicatedCStringSection::writeTo(uint8_t *buf) const {
diff --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 6ea1ea0fd6c2f..fa2247d64b690 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -44,6 +44,9 @@ MinGW Improvements
 MachO Improvements
 ------------------
 
+* cstrings sections are now tail merged
+  (`#161262 <https://github.com/llvm/llvm-project/pull/161262>`_)
+
 WebAssembly Improvements
 ------------------------
 
diff --git a/lld/test/MachO/cstring-dedup.s b/lld/test/MachO/cstring-dedup.s
index a4b15f26afff0..0a42b3d6fcff3 100644
--- a/lld/test/MachO/cstring-dedup.s
+++ b/lld/test/MachO/cstring-dedup.s
@@ -8,11 +8,10 @@
 # RUN: llvm-objdump --macho --section="__DATA,ptrs" --syms %t/test | FileCheck %s
 # RUN: llvm-readobj --section-headers %t/test | FileCheck %s --check-prefix=HEADER
 
-## Make sure we only have 3 deduplicated strings in __cstring.
+## Make sure we only have 2 deduplicated strings in __cstring.
 # STR: Contents of (__TEXT,__cstring) section
 # STR: {{[[:xdigit:]]+}} foo
 # STR: {{[[:xdigit:]]+}} barbaz
-# STR: {{[[:xdigit:]]+}} {{$}}
 
 ## Make sure both symbol and section relocations point to the right thing.
 # CHECK:      Contents of (__DATA,ptrs) section
diff --git a/lld/test/MachO/cstring-tailmerge.s b/lld/test/MachO/cstring-tailmerge.s
new file mode 100644
index 0000000000000..83d2810a78139
--- /dev/null
+++ b/lld/test/MachO/cstring-tailmerge.s
@@ -0,0 +1,85 @@
+# REQUIRES: x86
+# RUN: rm -rf %t; split-file %s %t
+
+# RUN: sed "s/<ALIGN>/0/g" %t/align.s.template > %t/align-1.s
+# RUN: sed "s/<ALIGN>/1/g" %t/align.s.template > %t/align-2.s
+# RUN: sed "s/<ALIGN>/2/g" %t/align.s.template > %t/align-4.s
+
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/first.s -o %t/first.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-1.s -o %t/align-1.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-2.s -o %t/align-2.o
+# RUN: llvm-mc -filetype=obj -triple=x86_64-apple-darwin %t/align-4.s -o %t/align-4.o
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-1.o -o %t/align-1
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-1 | FileCheck %s --check-prefixes=CHECK,ALIGN1
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-2.o -o %t/align-2
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-2 | FileCheck %s --check-prefixes=CHECK,ALIGN2
+
+# RUN: %lld -dylib --deduplicate-strings %t/first.o %t/align-4.o -o %t/align-4
+# RUN: llvm-objdump --macho --section="__TEXT,__cstring" --syms %t/align-4 | FileCheck %s --check-prefixes=CHECK,ALIGN4
+
+# CHECK: Contents of (__TEXT,__cstring) section
+# CHECK: [[#%.16x,START:]] get awkward offset{{$}}
+
+# ALIGN1: [[#%.16x,START+19]] myotherlongstr{{$}}
+# ALIGN1: [[#%.16x,START+19+15]] otherstr{{$}}
+
+# ALIGN2: [[#%.16x,START+20]] myotherlongstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16]] longstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16+8]] otherstr{{$}}
+# ALIGN2: [[#%.16x,START+20+16+8+10]] str{{$}}
+
+# ALIGN4: [[#%.16x,START+20]] myotherlongstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16]] otherlongstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16]] longstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16+8]] otherstr{{$}}
+# ALIGN4: [[#%.16x,START+20+16+16+8+12]] str{{$}}
+
+# CHECK: SYMBOL TABLE:
+
+# ALIGN1: [[#%.16x,START+19]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN1: [[#%.16x,START+21]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN1: [[#%.16x,START+26]] l     O __TEXT,__cstring _longstr
+# ALIGN1: [[#%.16x,START+34]] l     O __TEXT,__cstring _otherstr
+# ALIGN1: [[#%.16x,START+39]] l     O __TEXT,__cstring _str
+
+# ALIGN2: [[#%.16x,START+20]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN2: [[#%.16x,START+20+2]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN2: [[#%.16x,START+20+16]] l     O __TEXT,__cstring _longstr
+# ALIGN2: [[#%.16x,START+20+16+8]] l     O __TEXT,__cstring _otherstr
+# ALIGN2: [[#%.16x,START+20+16+8+10]] l     O __TEXT,__cstring _str
+
+# ALIGN4: [[#%.16x,START+20]] l     O __TEXT,__cstring _myotherlongstr
+# ALIGN4: [[#%.16x,START+20+16]] l     O __TEXT,__cstring _otherlongstr
+# ALIGN4: [[#%.16x,START+20+16+16]] l     O __TEXT,__cstring _longstr
+# ALIGN4: [[#%.16x,START+20+16+16+8]] l     O __TEXT,__cstring _otherstr
+# ALIGN4: [[#%.16x,START+20+16+16+8+12]] l     O __TEXT,__cstring _str
+
+#--- first.s
+.cstring
+.p2align 2
+.asciz "get awkward offset"  # length = 19
+
+#--- align.s.template
+.cstring
+
+.p2align <ALIGN>
+  _myotherlongstr:
+.asciz "myotherlongstr"      # length = 15
+
+.p2align <ALIGN>
+  _otherlongstr:
+.asciz   "otherlongstr"      # length = 13, tail offset = 2
+
+.p2align <ALIGN>
+  _longstr:
+.asciz        "longstr"      # length = 8, tail offset = 7
+
+.p2align <ALIGN>
+  _otherstr:
+.asciz       "otherstr"      # length = 9
+
+.p2align <ALIGN>
+  _str:
+.asciz            "str"      # length = 4, tail offset = 5

kyulee-com

Thanks for the great work!
Overall it looks good to me

kyulee-com · 2025-10-01T06:48:48Z

lld/MachO/SyntheticSections.cpp

-  for (CStringInputSection *isec : inputs)
+  for (CStringInputSection *isec : inputs) {
+    for (const auto &[i, piece] : llvm::enumerate(isec->pieces)) {
+      if (!piece.live)


As we've already skipped this for tailMergeMap in the prior loop, we can drop this check.

That's true. If the string is in tailMergeMap then it will be live. However, we can skip the lookup if the string is dead, so this is more for performance than correctness.

kyulee-com · 2025-10-01T06:58:56Z

lld/MachO/SyntheticSections.cpp

    piece.outSecOff = it->second;
  }
-  for (CStringInputSection *isec : inputs)
+  for (CStringInputSection *isec : inputs) {


Can you add some comment on this loop? It appears the previous loop layouts the substrings while the loop resolves the substring offsets.

kyulee-com · 2025-10-01T07:09:55Z

lld/MachO/SyntheticSections.cpp

+    // might be able to find another superstring with a compatible tail offset.
+    // The difficulty is how to do this efficiently
+    const auto &align = strToAlignment.at(s);
+    if (!isAligned(align, tailOffset))


I was a bit confused initially. Because the alignment of superstring will be promoted by the following code, here we can just check if the tailOffset is aligned and bail-out the failing case early.

The isAligned(align, tailOffset) check assumes that the superstring is laid out at address zero. If this check fails, it's impossible to tail merge with this candidate correctly.

But if the check passes, then we also need to make sure the superstring does not have a weaker alignment than the substring.

Suppose the superstring is at address 0x1004 and the substring could tail merge at address 0x1004 + 8, but requires an alignment of 8. That would break alignment, so instead we need to force the superstring to have alignment 8 so it is laid out at address 0x1008.

ellishg · 2025-10-01T15:38:39Z

Thanks for the accept! I think I will guard this behind a flag. One reason is for safety, but it also impacts the layout order which could impact performance. The order of the tail merged strings will effectively be ignored, and I want to think about how I could mitigate that.

ellishg · 2025-10-01T21:26:58Z

I've added the flag --tail-merge-strings. Once I've verified this is safe and doesn't regress performance, I'd like to eventually turn this on by default. Although we may need an RFC for that.

ellishg added 7 commits September 29, 2025 10:20

[lld][macho][NFC] Factor count zeros into helper function

2e997db

fix format

ee87e10

[lld][MachO] Use llvm::Align and remove StringOffset type

d47f1b9

init size to zero

4e0bde2

[lld][MachO] Tail merge strings

4742cca

add to release notes

73693d6

add assert

6a1da5f

ellishg requested review from SharonXSharon, int3, kyulee-com and thevinster and removed request for thevinster September 29, 2025 23:11

ellishg marked this pull request as ready for review September 29, 2025 23:11

llvmbot added lld lld:MachO labels Sep 29, 2025

ellishg mentioned this pull request Sep 29, 2025

[LLD][MachO] Option to emit separate cstring sections #158720

Merged

ellishg force-pushed the users/ellishg/lld-align branch from 4e0bde2 to 72bc6a8 Compare September 30, 2025 16:49

Base automatically changed from users/ellishg/lld-align to main September 30, 2025 16:57

Merge branch 'main' into users/ellishg/lld-macho-tail-merge

3c3233c

kyulee-com approved these changes Oct 1, 2025

View reviewed changes

ellishg added 2 commits October 1, 2025 14:15

order correctly and add more tests

fdaca77

fix release notes

4fbcb74

ellishg and others added 2 commits October 3, 2025 09:24

Merge branch 'main' into users/ellishg/lld-macho-tail-merge

37bcd0d

remove --deduplicate-strings from tests which is the default already

43ac021

ellishg enabled auto-merge (squash) October 3, 2025 16:31

ellishg merged commit d0e9890 into main Oct 3, 2025
8 of 9 checks passed

ellishg deleted the users/ellishg/lld-macho-tail-merge branch October 3, 2025 16:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[lld][MachO] Tail merge strings #161262

[lld][MachO] Tail merge strings #161262

Uh oh!

ellishg commented Sep 29, 2025 •

edited

Loading

Uh oh!

llvmbot commented Sep 29, 2025

Uh oh!

llvmbot commented Sep 29, 2025

Uh oh!

kyulee-com left a comment

Uh oh!

kyulee-com Oct 1, 2025

Uh oh!

ellishg Oct 1, 2025

Uh oh!

kyulee-com Oct 1, 2025

Uh oh!

kyulee-com Oct 1, 2025

Uh oh!

ellishg Oct 1, 2025

Uh oh!

ellishg commented Oct 1, 2025

Uh oh!

ellishg commented Oct 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

[lld][MachO] Tail merge strings #161262

[lld][MachO] Tail merge strings #161262

Uh oh!

Conversation

ellishg commented Sep 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Sep 29, 2025

Uh oh!

llvmbot commented Sep 29, 2025

Uh oh!

kyulee-com left a comment

Choose a reason for hiding this comment

Uh oh!

kyulee-com Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

ellishg Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

kyulee-com Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

kyulee-com Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

ellishg Oct 1, 2025

Choose a reason for hiding this comment

Uh oh!

ellishg commented Oct 1, 2025

Uh oh!

ellishg commented Oct 1, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

ellishg commented Sep 29, 2025 •

edited

Loading