-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WebAssembly] Handle symbols in .init_array
sections
#119127
Conversation
@llvm/pr-subscribers-backend-webassembly @llvm/pr-subscribers-mc Author: George Stagg (georgestagg) ChangesFollow on from #111008. Consider the following C code, which supplies a function void init() {}
typedef void (*fn)(void);
__attribute__((section(".init_array"))) fn p_init[1] = { &init };
int main() { return 0; } Note that a label In any case, compile this code to find:
What is happening is the array to be written to llvm-project/llvm/lib/MC/WasmObjectWriter.cpp Lines 1485 to 1487 in 4e0ba80
This existing code skips writing out the array in a data segment. Later this is indeed dealt with by writing each function in the array into a Wasm custom linking section, in the Unfortunately, the symbol In #111008 I suggested a change to simply not write the In this patch I instead remove the skip, so that data segments are also written for Some tests related to the change have been modified to reflect the additional emitted data, and a new test Full diff: https://github.com/llvm/llvm-project/pull/119127.diff 4 Files Affected:
diff --git a/llvm/lib/MC/WasmObjectWriter.cpp b/llvm/lib/MC/WasmObjectWriter.cpp
index a66c5713ff8a6e..04466ae0fddf13 100644
--- a/llvm/lib/MC/WasmObjectWriter.cpp
+++ b/llvm/lib/MC/WasmObjectWriter.cpp
@@ -1482,10 +1482,6 @@ uint64_t WasmObjectWriter::writeOneObject(MCAssembler &Asm,
LLVM_DEBUG(dbgs() << "Processing Section " << SectionName << " group "
<< Section.getGroup() << "\n";);
- // .init_array sections are handled specially elsewhere.
- if (SectionName.starts_with(".init_array"))
- continue;
-
// Code is handled separately
if (Section.isText())
continue;
diff --git a/llvm/test/MC/WebAssembly/global-ctor-dtor.ll b/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
index f1ec71da1ebb64..b1b39ce02cfe85 100644
--- a/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
+++ b/llvm/test/MC/WebAssembly/global-ctor-dtor.ll
@@ -63,7 +63,7 @@ declare void @func3()
; CHECK-NEXT: Value: 1
; CHECK-NEXT: Functions: [ 5, 7 ]
; CHECK-NEXT: - Type: DATACOUNT
-; CHECK-NEXT: Count: 1
+; CHECK-NEXT: Count: 3
; CHECK-NEXT: - Type: CODE
; CHECK-NEXT: Relocations:
; CHECK-NEXT: - Type: R_WASM_FUNCTION_INDEX_LEB
@@ -111,6 +111,18 @@ declare void @func3()
; CHECK-NEXT: Opcode: I32_CONST
; CHECK-NEXT: Value: 0
; CHECK-NEXT: Content: '01040000'
+; CHECK-NEXT: - SectionOffset: 15
+; CHECK-NEXT: InitFlags: 0
+; CHECK-NEXT: Offset:
+; CHECK-NEXT: Opcode: I32_CONST
+; CHECK-NEXT: Value: 4
+; CHECK-NEXT: Content: '0000000000000000'
+; CHECK-NEXT: - SectionOffset: 28
+; CHECK-NEXT: InitFlags: 0
+; CHECK-NEXT: Offset:
+; CHECK-NEXT: Opcode: I32_CONST
+; CHECK-NEXT: Value: 12
+; CHECK-NEXT: Content: '0000000000000000'
; CHECK-NEXT: - Type: CUSTOM
; CHECK-NEXT: Name: linking
; CHECK-NEXT: Version: 2
@@ -174,7 +186,15 @@ declare void @func3()
; CHECK-NEXT: - Index: 0
; CHECK-NEXT: Name: .data.global1
; CHECK-NEXT: Alignment: 3
-; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: - Index: 1
+; CHECK-NEXT: Name: .init_array.42
+; CHECK-NEXT: Alignment: 2
+; CHECK-NEXT: Flags: [ ]
+; CHECK-NEXT: - Index: 2
+; CHECK-NEXT: Name: .init_array
+; CHECK-NEXT: Alignment: 2
+; CHECK-NEXT: Flags: [ ]
; CHECK-NEXT: InitFunctions:
; CHECK-NEXT: - Priority: 42
; CHECK-NEXT: Symbol: 9
diff --git a/llvm/test/MC/WebAssembly/init-array-label.s b/llvm/test/MC/WebAssembly/init-array-label.s
new file mode 100644
index 00000000000000..4fe0b64b3bc512
--- /dev/null
+++ b/llvm/test/MC/WebAssembly/init-array-label.s
@@ -0,0 +1,86 @@
+# RUN: llvm-mc -triple=wasm32-unknown-unknown -filetype=obj < %s | obj2yaml | FileCheck %s
+
+init1:
+ .functype init1 () -> ()
+ end_function
+
+init2:
+ .functype init2 () -> ()
+ end_function
+
+ .section .init_array,"",@
+ .globl p_init1
+ .p2align 2, 0x0
+p_init1:
+ .section .init_array,"",@
+ .p2align 2, 0
+ .int32 init1
+ .size p_init1, 4
+
+ .section .init_array,"",@
+ .globl p_init2
+ .p2align 2, 0x0
+p_init2:
+ .section .init_array,"",@
+ .p2align 2
+ .int32 init2
+ .size p_init2, 4
+
+# CHECK: - Type: FUNCTION
+# CHECK-NEXT: FunctionTypes: [ 0, 0 ]
+# CHECK-NEXT: - Type: DATACOUNT
+# CHECK-NEXT: Count: 1
+# CHECK-NEXT: - Type: CODE
+# CHECK-NEXT: Functions:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Locals: []
+# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Index: 1
+# CHECK-NEXT: Locals: []
+# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Type: DATA
+# CHECK-NEXT: Segments:
+# CHECK-NEXT: - SectionOffset: 6
+# CHECK-NEXT: InitFlags: 0
+# CHECK-NEXT: Offset:
+# CHECK-NEXT: Opcode: I32_CONST
+# CHECK-NEXT: Value: 0
+# CHECK-NEXT: Content: '0000000000000000'
+# CHECK-NEXT: - Type: CUSTOM
+# CHECK-NEXT: Name: linking
+# CHECK-NEXT: Version: 2
+# CHECK-NEXT: SymbolTable:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Kind: FUNCTION
+# CHECK-NEXT: Name: init1
+# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
+# CHECK-NEXT: Function: 0
+# CHECK-NEXT: - Index: 1
+# CHECK-NEXT: Kind: FUNCTION
+# CHECK-NEXT: Name: init2
+# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
+# CHECK-NEXT: Function: 1
+# CHECK-NEXT: - Index: 2
+# CHECK-NEXT: Kind: DATA
+# CHECK-NEXT: Name: p_init1
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: Segment: 0
+# CHECK-NEXT: Size: 4
+# CHECK-NEXT: - Index: 3
+# CHECK-NEXT: Kind: DATA
+# CHECK-NEXT: Name: p_init2
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: Segment: 0
+# CHECK-NEXT: Offset: 4
+# CHECK-NEXT: Size: 4
+# CHECK-NEXT: SegmentInfo:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Name: .init_array
+# CHECK-NEXT: Alignment: 2
+# CHECK-NEXT: Flags: [ ]
+# CHECK-NEXT: InitFunctions:
+# CHECK-NEXT: - Priority: 65535
+# CHECK-NEXT: Symbol: 0
+# CHECK-NEXT: - Priority: 65535
+# CHECK-NEXT: Symbol: 1
+# CHECK-NEXT: ...
diff --git a/llvm/test/MC/WebAssembly/init-array.s b/llvm/test/MC/WebAssembly/init-array.s
index e79fb453ec12a3..ab8be11b0ff4a2 100644
--- a/llvm/test/MC/WebAssembly/init-array.s
+++ b/llvm/test/MC/WebAssembly/init-array.s
@@ -18,6 +18,8 @@ init2:
# CHECK: - Type: FUNCTION
# CHECK-NEXT: FunctionTypes: [ 0, 0 ]
+# CHECK-NEXT: - Type: DATACOUNT
+# CHECK-NEXT: Count: 1
# CHECK-NEXT: - Type: CODE
# CHECK-NEXT: Functions:
# CHECK-NEXT: - Index: 0
@@ -26,6 +28,14 @@ init2:
# CHECK-NEXT: - Index: 1
# CHECK-NEXT: Locals: []
# CHECK-NEXT: Body: 0B
+# CHECK-NEXT: - Type: DATA
+# CHECK-NEXT: Segments:
+# CHECK-NEXT: - SectionOffset: 6
+# CHECK-NEXT: InitFlags: 0
+# CHECK-NEXT: Offset:
+# CHECK-NEXT: Opcode: I32_CONST
+# CHECK-NEXT: Value: 0
+# CHECK-NEXT: Content: '0000000000000000'
# CHECK-NEXT: - Type: CUSTOM
# CHECK-NEXT: Name: linking
# CHECK-NEXT: Version: 2
@@ -40,6 +50,11 @@ init2:
# CHECK-NEXT: Name: init2
# CHECK-NEXT: Flags: [ BINDING_LOCAL ]
# CHECK-NEXT: Function: 1
+# CHECK-NEXT: SegmentInfo:
+# CHECK-NEXT: - Index: 0
+# CHECK-NEXT: Name: .init_array
+# CHECK-NEXT: Alignment: 2
+# CHECK-NEXT: Flags: [ ]
# CHECK-NEXT: InitFunctions:
# CHECK-NEXT: - Priority: 65535
# CHECK-NEXT: Symbol: 0
|
0426b8d
to
f048af2
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, having asked you to add this code, I am now wondering if it is needed after all.
Since the linker runs with --gc-sections
enabled by default I think the presence of these unreferenced sections in the object file should be harmless. However keeping the object files free of unused sections also makes some sense. I'll let you decided if you want to keep this code.
lgtm either way.
llvm/lib/MC/WasmObjectWriter.cpp
Outdated
break; | ||
} | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we split this out into a helper function?
e.g:
if (!isSectionReferenced(SectionName))
continue;
f048af2
to
741f3bb
Compare
Now that it's been written, let's keep it. Keeping the objects clean is better.
Yes, but I've used a slightly different calling form since we want to check the name too. if (SectionName.starts_with(".init_array") &&
!isSectionReferenced(Asm, Section))
continue; The latest version of the commit has the logic in a separate function. If all looks good, could you hit the merge button for me? I don't have write access to the repo. Also, if you don't mind, would you help me set up the llvmbot so as to request a backport of both #111008 and this PR to the 19.x release branch, please? Thanks again for taking the time to review and feedback. |
It looks like you can request backpatches by commenting on an issue or PR: https://llvm.org/docs/GitHub.html#backporting-fixes-to-the-release-branches |
Follow on from llvm#111008. (cherry picked from commit ed91843)
Follow on from #111008.
Consider the following C code, which supplies a function
init()
and an array written to the.init_array
section so that the function runs beforemain()
.Note that a label
p_init
is also provided for the array. I'm not sure if there exists a way to do this in C without additionally creating thep_init
symbol.In any case, compile this code to find:
What is happening is the array to be written to
.init_array
is being handled specially byWasmObjectWriter.cpp
:llvm-project/llvm/lib/MC/WasmObjectWriter.cpp
Lines 1485 to 1487 in 4e0ba80
This existing code skips writing out the array in a data segment. Later this is indeed dealt with by writing each function in the array into a Wasm custom linking section, in the
InitFunctions
subsection.Unfortunately, the symbol
p_init
is also written to the Wasm custom linking section as awasm::WASM_SYMBOL_TYPE_DATA
, pointing to a data segment that now does not exist. As far as I can tell, we don't really indend to ever reference this symbol, the pattern is just used in C (and a similar construct can be used in Rust) to ensure that the.init_array
array exists. So, while we could just throw an error here it's not ideal as it breaks this useful pattern for life-before-main.In #111008 I suggested a change to simply not write the
p_init
symbol to the custom linking section of the Wasm object. But, after some prompting by @sbc100 it seems this is not a good idea, since it breaks code that does happen to reference thep_init
symbol for whatever reason.In this patch I instead remove the skip, so that data segments are also written for
.init_array
sections. This works around the problem above, since thep_init
entry now references a data section that exists. I don't think the data written there is useful, but the change at least avoids the error message above. It also fixes related issues when you have multiple.init_array
sections.Some tests related to the change have been modified to reflect the additional emitted data, and a new test
init-array-label.s
has been added to test the specific problem above.