-
Notifications
You must be signed in to change notification settings - Fork 15.2k
Aarch64: Emit a minimal SEH prologue when needed #158173
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Aarch64: Emit a minimal SEH prologue when needed #158173
Conversation
✅ With the latest revision this PR passed the C/C++ code formatter. |
I think we only want the .seh_endprologue in the cases where we actually have an epilogue. If we don't need a prologue or epilogue, we can completely omit the .pdata record as an optimization. (https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170#pdata-records) Strictly speaking, I think we something with end_c to correctly express callee-pop calls, so the computed stack pointer on entry is consistent. I'm not sure how much it matters in practice, though; the caller will usually have a frame pointer. |
The issue is happening with Swift where we end up with an assembler error, see swiftlang#11377. This is probably not the right way to fix it, though. These changes did solve the issue I was seeing but clearly break other things. I am not sure what the right thing do to is, maybe the generate LLVM IR by the Swift compiler is incorrect? |
I think you're pretty close here; just need to narrow the condition where the seh_endprologue is emitted. |
In some cases, with very simple thunks, it is possible that the `.seh_endprologue` is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended.
546dee6
to
8ae1e3a
Compare
@llvm/pr-subscribers-backend-aarch64 Author: Fabrice de Gans (Steelskin) ChangesIn some cases, with very simple thunks, it is possible that the Full diff: https://github.com/llvm/llvm-project/pull/158173.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp b/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
index 7947469b6c04f..63b2e2d67f4b1 100644
--- a/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
@@ -541,6 +541,9 @@ void AArch64PrologueEmitter::emitPrologue() {
// to determine the end of the prologue.
DebugLoc DL;
+ if (AFI->getArgumentStackToRestore())
+ HasWinCFI = true;
+
if (AFI->shouldSignReturnAddress(MF)) {
// If pac-ret+leaf is in effect, PAUTH_PROLOGUE pseudo instructions
// are inserted by emitPacRetPlusLeafHardening().
diff --git a/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll b/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll
new file mode 100644
index 0000000000000..e495b25690744
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll
@@ -0,0 +1,85 @@
+; RUN: llc -mtriple=aarch64-windows %s -o - | FileCheck %s
+
+; This test verifies that functions requiring Windows CFI that have minimal
+; or no prologue instructions still emit proper SEH directives, specifically
+; ensuring .seh_endprologue is emitted before .seh_startepilogue.
+;
+; This reproduces the issue where Swift async functions with swifttailcc
+; calling convention would fail with:
+; "error: starting epilogue (.seh_startepilogue) before prologue has ended (.seh_endprologue)"
+
+; Test 1: Swift-style tail call function with minimal prologue
+define swifttailcc void @test_swifttailcc_minimal(ptr %async_ctx, ptr %arg1, ptr %arg2) {
+; CHECK-LABEL: test_swifttailcc_minimal:
+; CHECK-NOT: .seh_proc test_swifttailcc_minimal
+; CHECK-NOT: .seh_endprologue
+; CHECK-NOT: .seh_startepilogue
+; CHECK-NOT: .seh_endepilogue
+; CHECK-NOT: .seh_endproc
+entry:
+ %ptr1 = getelementptr inbounds i8, ptr %async_ctx, i64 16
+ %ptr2 = getelementptr inbounds i8, ptr %async_ctx, i64 24
+ store ptr %arg1, ptr %ptr1, align 8
+ store ptr %arg2, ptr %ptr2, align 8
+ musttail call swifttailcc void @external_swift_function(ptr %async_ctx, ptr %arg1)
+ ret void
+}
+
+; Test 2: Regular function with no stack frame but needs epilogue
+define void @test_no_stack_frame() {
+; CHECK-LABEL: test_no_stack_frame:
+; CHECK-NEXT: .seh_proc test_no_stack_frame
+; CHECK: .seh_endprologue
+; CHECK: .seh_startepilogue
+; CHECK: .seh_endepilogue
+; CHECK: .seh_endproc
+entry:
+ call void @external_function()
+ ret void
+}
+
+; Test 3: Function with minimal stack adjustment only in epilogue
+define void @test_minimal_stack_adjust(ptr %ptr) {
+; CHECK-LABEL: test_minimal_stack_adjust:
+; CHECK-NEXT: .seh_proc test_minimal_stack_adjust
+; CHECK: .seh_endprologue
+; CHECK: .seh_startepilogue
+; CHECK: add sp, sp, #16
+; CHECK: .seh_stackalloc 16
+; CHECK: .seh_endepilogue
+; CHECK: .seh_endproc
+entry:
+ %local = alloca i64, align 8
+ store i64 42, ptr %local, align 8
+ %value = load i64, ptr %local, align 8
+ store i64 %value, ptr %ptr, align 8
+ ret void
+}
+
+; Test 4: Function similar to the original failing case
+define linkonce_odr hidden swifttailcc void @test_linkonce_swifttailcc(ptr swiftasync %async_ctx, ptr %arg1, ptr noalias dereferenceable(40) %arg2, ptr %arg3, i64 %value, ptr %arg4, ptr %arg5, ptr %arg6, i1 %flag, ptr %arg7, ptr noalias dereferenceable(40) %arg8) {
+; CHECK-LABEL: test_linkonce_swifttailcc:
+; CHECK-NEXT: .seh_proc
+; CHECK: .seh_endprologue
+; CHECK: .seh_startepilogue
+; CHECK: .seh_endepilogue
+; CHECK: .seh_endproc
+entry:
+ %frame_ptr = getelementptr inbounds nuw i8, ptr %async_ctx, i64 16
+ %ctx1 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 400
+ %ctx2 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 1168
+ %spill1 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2392
+ store ptr %arg8, ptr %spill1, align 8
+ %spill2 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2384
+ store ptr %arg7, ptr %spill2, align 8
+ %spill3 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2225
+ store i1 %flag, ptr %spill3, align 1
+ %spill4 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2376
+ store ptr %arg6, ptr %spill4, align 8
+ musttail call swifttailcc void @external_swift_continuation(ptr swiftasync %async_ctx, i64 0, i64 0)
+ ret void
+}
+
+declare swifttailcc void @external_swift_function(ptr, ptr)
+declare swifttailcc void @external_swift_continuation(ptr swiftasync, i64, i64)
+declare void @external_function()
diff --git a/llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll b/llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
similarity index 78%
rename from llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll
rename to llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
index 7daceae3dd4c0..8308108b84f08 100644
--- a/llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll
+++ b/llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
@@ -5,8 +5,9 @@ entry:
ret void
}
-; Check that there is no .seh_endprologue but there is seh_startepilogue/seh_endepilogue.
-; CHECK-NOT: .seh_endprologue
+; Check that there is a minimal SEH prologue with seh_startepilogue/seh_endepilogue.
+; CHECK: .seh_proc test
+; CHECK: .seh_endprologue
; CHECK: .seh_startepilogue
; CHECK: add sp, sp, #48
; CHECK: .seh_stackalloc 48
|
My one concern here is whether we need to emit the stack adjustment in the prologue somehow. I mentioned this before, but thinking about it a bit more, the caller might not have a frame pointer in some cases. In particular, if a C calling-convention function tail-calls a swifttailcc function, the caller doesn't strictly need a frame pointer. This could impact Swift code if you're not enabling frame pointers by default. The effect would be that stack unwinding doesn't work: the unwinder will compute the wrong offset for the stack pointer, which would corrupt the stack. The way to encode the stack adjustment, in that case, should be something like the following:
seh_endc currently doesn't exist. It shouldn't be hard to implement, although the interaction with epilogue splitting is maybe a little tricky. (See also https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170). I won't block this patch on implementing that, but if you need the functionality for Swift, you might want to add it to your roadmap. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM with one minor comment.
Thanks for that! Definitely something to consider. However, I believe that at least x64 requires that the frame pointer is present as per the ABI? If so, we should be including the frame pointer always in Swift. |
Thanks for the review, I addressed the comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that we can be more specific in the comment, but that isn't as important. It specifically is when the tail call has an airty mismatch and the resumption needs additional stack space for additional arguments.
In some cases, with very simple thunks, it is possible that the `.seh_endprologue` is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended. Bug: swiftlang#11377
In some cases, with very simple thunks, it is possible that the `.seh_endprologue` is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended. Bug: swiftlang#11377
…ys-emit-seh-end-prologue 🍒 Aarch64: Emit a minimal SEH prologue when needed (llvm#158173)
In some cases, with very simple thunks, it is possible that the
.seh_endprologue
is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended.Bug: swiftlang#11377