Skip to content

Conversation

Steelskin
Copy link
Contributor

@Steelskin Steelskin commented Sep 12, 2025

In some cases, with very simple thunks, it is possible that the .seh_endprologue is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended.

Bug: swiftlang#11377

Copy link

github-actions bot commented Sep 12, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Sep 15, 2025

I think we only want the .seh_endprologue in the cases where we actually have an epilogue. If we don't need a prologue or epilogue, we can completely omit the .pdata record as an optimization. (https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170#pdata-records)

Strictly speaking, I think we something with end_c to correctly express callee-pop calls, so the computed stack pointer on entry is consistent. I'm not sure how much it matters in practice, though; the caller will usually have a frame pointer.

@Steelskin
Copy link
Contributor Author

I think we only want the .seh_endprologue in the cases where we actually have an epilogue. If we don't need a prologue or epilogue, we can completely omit the .pdata record as an optimization. (https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170#pdata-records)

Strictly speaking, I think we something with end_c to correctly express callee-pop calls, so the computed stack pointer on entry is consistent. I'm not sure how much it matters in practice, though; the caller will usually have a frame pointer.

The issue is happening with Swift where we end up with an assembler error, see swiftlang#11377. This is probably not the right way to fix it, though. These changes did solve the issue I was seeing but clearly break other things. I am not sure what the right thing do to is, maybe the generate LLVM IR by the Swift compiler is incorrect?

@efriedma-quic
Copy link
Collaborator

I think you're pretty close here; just need to narrow the condition where the seh_endprologue is emitted.

In some cases, with very simple thunks, it is possible that the
`.seh_endprologue` is not emitted. This causes issues in the assembler
because the epilogue ends up starting before the prologue has ended.
@Steelskin Steelskin force-pushed the fabrice/win-arm64-always-emit-seh-end-prologue branch from 546dee6 to 8ae1e3a Compare September 26, 2025 21:04
@Steelskin Steelskin changed the title [WIP] Aarch64: Always emit .seh_endprologue Aarch64: Emit a minimal SEH prologue when needed Sep 26, 2025
@Steelskin Steelskin marked this pull request as ready for review September 27, 2025 18:41
@llvmbot
Copy link
Member

llvmbot commented Sep 27, 2025

@llvm/pr-subscribers-backend-aarch64

Author: Fabrice de Gans (Steelskin)

Changes

In some cases, with very simple thunks, it is possible that the .seh_endprologue is not emitted. This causes issues in the assembler because the epilogue ends up starting before the prologue has ended.


Full diff: https://github.com/llvm/llvm-project/pull/158173.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp (+3)
  • (added) llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll (+85)
  • (renamed) llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll (+3-2)
diff --git a/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp b/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
index 7947469b6c04f..63b2e2d67f4b1 100644
--- a/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
+++ b/llvm/lib/Target/AArch64/AArch64PrologueEpilogue.cpp
@@ -541,6 +541,9 @@ void AArch64PrologueEmitter::emitPrologue() {
   // to determine the end of the prologue.
   DebugLoc DL;
 
+  if (AFI->getArgumentStackToRestore())
+    HasWinCFI = true;
+
   if (AFI->shouldSignReturnAddress(MF)) {
     // If pac-ret+leaf is in effect, PAUTH_PROLOGUE pseudo instructions
     // are inserted by emitPacRetPlusLeafHardening().
diff --git a/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll b/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll
new file mode 100644
index 0000000000000..e495b25690744
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/seh-minimal-prologue-epilogue.ll
@@ -0,0 +1,85 @@
+; RUN: llc -mtriple=aarch64-windows %s -o - | FileCheck %s
+
+; This test verifies that functions requiring Windows CFI that have minimal
+; or no prologue instructions still emit proper SEH directives, specifically
+; ensuring .seh_endprologue is emitted before .seh_startepilogue.
+;
+; This reproduces the issue where Swift async functions with swifttailcc
+; calling convention would fail with:
+; "error: starting epilogue (.seh_startepilogue) before prologue has ended (.seh_endprologue)"
+
+; Test 1: Swift-style tail call function with minimal prologue
+define swifttailcc void @test_swifttailcc_minimal(ptr %async_ctx, ptr %arg1, ptr %arg2) {
+; CHECK-LABEL: test_swifttailcc_minimal:
+; CHECK-NOT:   .seh_proc test_swifttailcc_minimal
+; CHECK-NOT:   .seh_endprologue
+; CHECK-NOT:   .seh_startepilogue
+; CHECK-NOT:   .seh_endepilogue
+; CHECK-NOT:   .seh_endproc
+entry:
+  %ptr1 = getelementptr inbounds i8, ptr %async_ctx, i64 16
+  %ptr2 = getelementptr inbounds i8, ptr %async_ctx, i64 24
+  store ptr %arg1, ptr %ptr1, align 8
+  store ptr %arg2, ptr %ptr2, align 8
+  musttail call swifttailcc void @external_swift_function(ptr %async_ctx, ptr %arg1)
+  ret void
+}
+
+; Test 2: Regular function with no stack frame but needs epilogue
+define void @test_no_stack_frame() {
+; CHECK-LABEL: test_no_stack_frame:
+; CHECK-NEXT:  .seh_proc test_no_stack_frame
+; CHECK:       .seh_endprologue
+; CHECK:       .seh_startepilogue
+; CHECK:       .seh_endepilogue
+; CHECK:       .seh_endproc
+entry:
+  call void @external_function()
+  ret void
+}
+
+; Test 3: Function with minimal stack adjustment only in epilogue
+define void @test_minimal_stack_adjust(ptr %ptr) {
+; CHECK-LABEL: test_minimal_stack_adjust:
+; CHECK-NEXT:  .seh_proc test_minimal_stack_adjust
+; CHECK:       .seh_endprologue
+; CHECK:       .seh_startepilogue
+; CHECK:       add sp, sp, #16
+; CHECK:       .seh_stackalloc 16
+; CHECK:       .seh_endepilogue
+; CHECK:       .seh_endproc
+entry:
+  %local = alloca i64, align 8
+  store i64 42, ptr %local, align 8
+  %value = load i64, ptr %local, align 8
+  store i64 %value, ptr %ptr, align 8
+  ret void
+}
+
+; Test 4: Function similar to the original failing case
+define linkonce_odr hidden swifttailcc void @test_linkonce_swifttailcc(ptr swiftasync %async_ctx, ptr %arg1, ptr noalias dereferenceable(40) %arg2, ptr %arg3, i64 %value, ptr %arg4, ptr %arg5, ptr %arg6, i1 %flag, ptr %arg7, ptr noalias dereferenceable(40) %arg8) {
+; CHECK-LABEL: test_linkonce_swifttailcc:
+; CHECK-NEXT:  .seh_proc
+; CHECK:       .seh_endprologue
+; CHECK:       .seh_startepilogue
+; CHECK:       .seh_endepilogue
+; CHECK:       .seh_endproc
+entry:
+  %frame_ptr = getelementptr inbounds nuw i8, ptr %async_ctx, i64 16
+  %ctx1 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 400
+  %ctx2 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 1168
+  %spill1 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2392
+  store ptr %arg8, ptr %spill1, align 8
+  %spill2 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2384
+  store ptr %arg7, ptr %spill2, align 8
+  %spill3 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2225
+  store i1 %flag, ptr %spill3, align 1
+  %spill4 = getelementptr inbounds nuw i8, ptr %async_ctx, i64 2376
+  store ptr %arg6, ptr %spill4, align 8
+  musttail call swifttailcc void @external_swift_continuation(ptr swiftasync %async_ctx, i64 0, i64 0)
+  ret void
+}
+
+declare swifttailcc void @external_swift_function(ptr, ptr)
+declare swifttailcc void @external_swift_continuation(ptr swiftasync, i64, i64)
+declare void @external_function()
diff --git a/llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll b/llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
similarity index 78%
rename from llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll
rename to llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
index 7daceae3dd4c0..8308108b84f08 100644
--- a/llvm/test/CodeGen/AArch64/wincfi-seh-only-in-epilogue.ll
+++ b/llvm/test/CodeGen/AArch64/wincfi-minimal-seh-prologue.ll
@@ -5,8 +5,9 @@ entry:
   ret void
 }
 
-; Check that there is no .seh_endprologue but there is seh_startepilogue/seh_endepilogue.
-; CHECK-NOT: .seh_endprologue
+; Check that there is a minimal SEH prologue with seh_startepilogue/seh_endepilogue.
+; CHECK:     .seh_proc test
+; CHECK:     .seh_endprologue
 ; CHECK:     .seh_startepilogue
 ; CHECK:     add sp, sp, #48
 ; CHECK:     .seh_stackalloc 48

@efriedma-quic
Copy link
Collaborator

My one concern here is whether we need to emit the stack adjustment in the prologue somehow. I mentioned this before, but thinking about it a bit more, the caller might not have a frame pointer in some cases. In particular, if a C calling-convention function tail-calls a swifttailcc function, the caller doesn't strictly need a frame pointer. This could impact Swift code if you're not enabling frame pointers by default.

The effect would be that stack unwinding doesn't work: the unwinder will compute the wrong offset for the stack pointer, which would corrupt the stack.

The way to encode the stack adjustment, in that case, should be something like the following:

.seh_stackalloc #16
.seh_endc
.seh_endprologue

seh_endc currently doesn't exist. It shouldn't be hard to implement, although the interaction with epilogue splitting is maybe a little tricky.

(See also https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170).


I won't block this patch on implementing that, but if you need the functionality for Swift, you might want to add it to your roadmap.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one minor comment.

@compnerd
Copy link
Member

My one concern here is whether we need to emit the stack adjustment in the prologue somehow. I mentioned this before, but thinking about it a bit more, the caller might not have a frame pointer in some cases. In particular, if a C calling-convention function tail-calls a swifttailcc function, the caller doesn't strictly need a frame pointer. This could impact Swift code if you're not enabling frame pointers by default.

The effect would be that stack unwinding doesn't work: the unwinder will compute the wrong offset for the stack pointer, which would corrupt the stack.

The way to encode the stack adjustment, in that case, should be something like the following:

.seh_stackalloc #16
.seh_endc
.seh_endprologue

seh_endc currently doesn't exist. It shouldn't be hard to implement, although the interaction with epilogue splitting is maybe a little tricky.

(See also https://learn.microsoft.com/en-us/cpp/build/arm64-exception-handling?view=msvc-170).

I won't block this patch on implementing that, but if you need the functionality for Swift, you might want to add it to your roadmap.

Thanks for that! Definitely something to consider. However, I believe that at least x64 requires that the frame pointer is present as per the ABI? If so, we should be including the frame pointer always in Swift.

@Steelskin
Copy link
Contributor Author

Thanks for the review, I addressed the comments.

Copy link
Member

@compnerd compnerd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that we can be more specific in the comment, but that isn't as important. It specifically is when the tail call has an airty mismatch and the resumption needs additional stack space for additional arguments.

@compnerd compnerd merged commit 58b4951 into llvm:main Sep 30, 2025
9 checks passed
@Steelskin Steelskin deleted the fabrice/win-arm64-always-emit-seh-end-prologue branch September 30, 2025 15:19
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
In some cases, with very simple thunks, it is possible that the
`.seh_endprologue` is not emitted. This causes issues in the assembler
because the epilogue ends up starting before the prologue has ended.

Bug: swiftlang#11377
Steelskin added a commit to Steelskin/llvm-project that referenced this pull request Oct 3, 2025
In some cases, with very simple thunks, it is possible that the
`.seh_endprologue` is not emitted. This causes issues in the assembler
because the epilogue ends up starting before the prologue has ended.

Bug: swiftlang#11377
compnerd added a commit to swiftlang/llvm-project that referenced this pull request Oct 7, 2025
…ys-emit-seh-end-prologue

🍒 Aarch64: Emit a minimal SEH prologue when needed (llvm#158173)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants