-
Notifications
You must be signed in to change notification settings - Fork 13.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[coroutines][CoroSplit] Store allocas on the frame if there is no explicit lifetime.end #88806
Conversation
…licit lifetime.end Certain passes such as SimplifyCFG may remove existing lifetime.end intrinsics on presplit coroutines. CoroSplitPass then incorrectly assumes that these allocas may not persist across the coroutine suspension point. To fix this, computeShouldLiveOnFrame() now saves allocas on the coroutine frame object if their addresses are escaped and those allocas don't have a corresponding lifetime.end. Fixes llvm#86580
@llvm/pr-subscribers-coroutines Author: Alan Zhao (alanzhao1) ChangesCertain passes such as SimplifyCFG may remove existing lifetime.end Fixes #86580 Full diff: https://github.com/llvm/llvm-project/pull/88806.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index 08a4522e3fac6d..8bd7d0dc36d532 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -1550,6 +1550,8 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
}
void visitIntrinsicInst(IntrinsicInst &II) {
+ if (II.getIntrinsicID() == Intrinsic::lifetime_end)
+ HasExplicitLifetimeEnd = true;
// When we found the lifetime markers refers to a
// subrange of the original alloca, ignore the lifetime
// markers to avoid misleading the analysis.
@@ -1596,6 +1598,7 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
SmallPtrSet<IntrinsicInst *, 2> LifetimeStarts{};
bool MayWriteBeforeCoroBegin{false};
bool ShouldUseLifetimeStartInfo{true};
+ bool HasExplicitLifetimeEnd{false};
mutable std::optional<bool> ShouldLiveOnFrame{};
@@ -1614,6 +1617,10 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
// suspend point between lifetime markers. This should also cover the
// case of a single lifetime.start intrinsic in a loop with suspend point.
if (PI.isEscaped()) {
+ // If there is no explicit lifetime.end, then assume the address can
+ // cross suspension points.
+ if (!HasExplicitLifetimeEnd)
+ return true;
for (auto *A : LifetimeStarts) {
for (auto *B : LifetimeStarts) {
if (Checker.hasPathOrLoopCrossingSuspendPoint(A->getParent(),
diff --git a/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll b/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll
new file mode 100644
index 00000000000000..c9fdd4fd3d2bab
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll
@@ -0,0 +1,54 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+
+declare ptr @malloc(i64)
+
+%i8.array = type { [100 x i8] }
+declare void @consume.i8.array(ptr)
+
+; testval does not contain an explicit lifetime end. We must assume that it may
+; live across suspension.
+define void @foo() presplitcoroutine {
+; CHECK-LABEL: define void @foo() {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @foo.resumers)
+; CHECK-NEXT: [[ALLOC:%.*]] = call ptr @malloc(i64 16)
+; CHECK-NEXT: [[VFRAME:%.*]] = call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[ALLOC]])
+; CHECK-NEXT: store ptr @foo.resume, ptr [[VFRAME]], align 8
+; CHECK-NEXT: [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[FOO_FRAME:%.*]], ptr [[VFRAME]], i32 0, i32 1
+; CHECK-NEXT: store ptr @foo.destroy, ptr [[DESTROY_ADDR]], align 8
+; CHECK-NEXT: [[INDEX_ADDR1:%.*]] = getelementptr inbounds [[FOO_FRAME]], ptr [[VFRAME]], i32 0, i32 2
+; CHECK-NEXT: call void @consume.i8.array(ptr [[INDEX_ADDR1]])
+; CHECK-NEXT: [[INDEX_ADDR2:%.*]] = getelementptr inbounds [[FOO_FRAME]], ptr [[VFRAME]], i32 0, i32 3
+; CHECK-NEXT: store i1 false, ptr [[INDEX_ADDR2]], align 1
+; CHECK-NEXT: ret void
+;
+entry:
+ %testval = alloca %i8.array
+ %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+ %alloc = call ptr @malloc(i64 16) #3
+ %vFrame = call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %alloc)
+
+ call void @llvm.lifetime.start.p0(i64 100, ptr %testval)
+ call void @consume.i8.array(ptr %testval)
+
+ %save = call token @llvm.coro.save(ptr null)
+ %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+ switch i8 %suspend, label %exit [
+ i8 0, label %await.ready
+ i8 1, label %exit
+ ]
+await.ready:
+ %StrayCoroSave = call token @llvm.coro.save(ptr null)
+ br label %exit
+exit:
+ ret void
+}
+
+declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr)
+declare ptr @llvm.coro.begin(token, ptr writeonly) #3
+declare token @llvm.coro.save(ptr) #3
+declare ptr @llvm.coro.frame() #5
+declare i8 @llvm.coro.suspend(token, i1) #3
+declare i1 @llvm.coro.end(ptr, i1, token) #3
+declare void @llvm.lifetime.start.p0(i64, ptr nocapture) #4
|
@llvm/pr-subscribers-llvm-transforms Author: Alan Zhao (alanzhao1) ChangesCertain passes such as SimplifyCFG may remove existing lifetime.end Fixes #86580 Full diff: https://github.com/llvm/llvm-project/pull/88806.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index 08a4522e3fac6d..8bd7d0dc36d532 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -1550,6 +1550,8 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
}
void visitIntrinsicInst(IntrinsicInst &II) {
+ if (II.getIntrinsicID() == Intrinsic::lifetime_end)
+ HasExplicitLifetimeEnd = true;
// When we found the lifetime markers refers to a
// subrange of the original alloca, ignore the lifetime
// markers to avoid misleading the analysis.
@@ -1596,6 +1598,7 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
SmallPtrSet<IntrinsicInst *, 2> LifetimeStarts{};
bool MayWriteBeforeCoroBegin{false};
bool ShouldUseLifetimeStartInfo{true};
+ bool HasExplicitLifetimeEnd{false};
mutable std::optional<bool> ShouldLiveOnFrame{};
@@ -1614,6 +1617,10 @@ struct AllocaUseVisitor : PtrUseVisitor<AllocaUseVisitor> {
// suspend point between lifetime markers. This should also cover the
// case of a single lifetime.start intrinsic in a loop with suspend point.
if (PI.isEscaped()) {
+ // If there is no explicit lifetime.end, then assume the address can
+ // cross suspension points.
+ if (!HasExplicitLifetimeEnd)
+ return true;
for (auto *A : LifetimeStarts) {
for (auto *B : LifetimeStarts) {
if (Checker.hasPathOrLoopCrossingSuspendPoint(A->getParent(),
diff --git a/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll b/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll
new file mode 100644
index 00000000000000..c9fdd4fd3d2bab
--- /dev/null
+++ b/llvm/test/Transforms/Coroutines/coro-alloca-no-lifetime-end.ll
@@ -0,0 +1,54 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 4
+; RUN: opt < %s -passes='cgscc(coro-split),simplifycfg,early-cse' -S | FileCheck %s
+
+declare ptr @malloc(i64)
+
+%i8.array = type { [100 x i8] }
+declare void @consume.i8.array(ptr)
+
+; testval does not contain an explicit lifetime end. We must assume that it may
+; live across suspension.
+define void @foo() presplitcoroutine {
+; CHECK-LABEL: define void @foo() {
+; CHECK-NEXT: entry:
+; CHECK-NEXT: [[ID:%.*]] = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr @foo.resumers)
+; CHECK-NEXT: [[ALLOC:%.*]] = call ptr @malloc(i64 16)
+; CHECK-NEXT: [[VFRAME:%.*]] = call noalias nonnull ptr @llvm.coro.begin(token [[ID]], ptr [[ALLOC]])
+; CHECK-NEXT: store ptr @foo.resume, ptr [[VFRAME]], align 8
+; CHECK-NEXT: [[DESTROY_ADDR:%.*]] = getelementptr inbounds [[FOO_FRAME:%.*]], ptr [[VFRAME]], i32 0, i32 1
+; CHECK-NEXT: store ptr @foo.destroy, ptr [[DESTROY_ADDR]], align 8
+; CHECK-NEXT: [[INDEX_ADDR1:%.*]] = getelementptr inbounds [[FOO_FRAME]], ptr [[VFRAME]], i32 0, i32 2
+; CHECK-NEXT: call void @consume.i8.array(ptr [[INDEX_ADDR1]])
+; CHECK-NEXT: [[INDEX_ADDR2:%.*]] = getelementptr inbounds [[FOO_FRAME]], ptr [[VFRAME]], i32 0, i32 3
+; CHECK-NEXT: store i1 false, ptr [[INDEX_ADDR2]], align 1
+; CHECK-NEXT: ret void
+;
+entry:
+ %testval = alloca %i8.array
+ %id = call token @llvm.coro.id(i32 0, ptr null, ptr null, ptr null)
+ %alloc = call ptr @malloc(i64 16) #3
+ %vFrame = call noalias nonnull ptr @llvm.coro.begin(token %id, ptr %alloc)
+
+ call void @llvm.lifetime.start.p0(i64 100, ptr %testval)
+ call void @consume.i8.array(ptr %testval)
+
+ %save = call token @llvm.coro.save(ptr null)
+ %suspend = call i8 @llvm.coro.suspend(token %save, i1 false)
+ switch i8 %suspend, label %exit [
+ i8 0, label %await.ready
+ i8 1, label %exit
+ ]
+await.ready:
+ %StrayCoroSave = call token @llvm.coro.save(ptr null)
+ br label %exit
+exit:
+ ret void
+}
+
+declare token @llvm.coro.id(i32, ptr readnone, ptr nocapture readonly, ptr)
+declare ptr @llvm.coro.begin(token, ptr writeonly) #3
+declare token @llvm.coro.save(ptr) #3
+declare ptr @llvm.coro.frame() #5
+declare i8 @llvm.coro.suspend(token, i1) #3
+declare i1 @llvm.coro.end(ptr, i1, token) #3
+declare void @llvm.lifetime.start.p0(i64, ptr nocapture) #4
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the analysis, it is pretty helpful.
From the analysis, I feel there is deep problems in the current model since it shows the lifetime intrinsics is not so stable as we thought. e.g., if there are multiple llvm.lifetime.start and llvm.lifetime.end intrinsics but some passes only remove some of the llvm.lifetime.end
intrinsics (not all), I fear it may still be problematic.
The root problem here is that, when we use lifetime info to decide if an alloca should live on the frame, we don't consider lifetime end at all. This is not correct. To fix this fundamentally, I think we need to introduce some data flow analysis here. The logic may be:
- If there is any path from any llvm.lifetime.start to the llvm.coro.end without touching llvm.lifetime.end, we need to put the variable on the frame.
- If there is any path from any llvm.lifetime.start to the llvm.lifetime.end touched llvm.coro.suspend, we need to put the variable on the frame.
- Then it may be safe to put the variable on the stack.
I am not sure if there is existing logic for this. If not, probably we need to do it ourselves. We already did some of such local data flow analysis in coroutines.
Also:
llvm/lib/CodeGen/StackColoring.cpp implements this analysis, but not in a way you can easily reuse. Still probably a useful reference. (Lifetime intrinstics are inherently pretty messy; see also #45725.) |
Is someone looking at doing the deeper fix to the lifetime analysis as suggested above, or should we do a stop gap fix in the meantime? I'm not sure if the current patch is sufficient: it checks whether there is a lifetime.end intrinsic, but what if there is one on one execution path, but not another? Maybe we should always set ShouldUseLifetimeStartInfo to avoid this miscompile for now? |
I have less time developing into coroutines in detail recently. So if there is no one had the chance to fix this properly in the near future, I think we can accept this as a partial workaround.
Yeah, this is the reason why I think the current patch is a partial fix.
I am not sure what you mean. |
Agreed that this is a partial fix - it doesn't address the example that @zmodem mentioned, plus there could be other scenarios that we didn't think of. A complete fix would probably require an RFC on Discourse due to its complexity (llvm lifetimes intrinsics are notoriously fickle per #88806 (comment), and we'd likely have to rewrite most of the existing logic here.)
I think the proposal is that we put all coroutine variables on the coroutine frame object rather than try to figure out what variables can go on the stack. This would generate suboptimal code, but it will be guaranteed to be correct. |
Oh, it means to set Maybe we should try to look at how to fix it properly... |
Yes, sorry, I forgot to write the "to false" part.
That makes it all the more important to ensure we don't miscompile the code ;)
Agreed :) |
I think it would mostly just increase memory pressure, but not so much on runtime performance unless such case happens frequently enough to bloat up the heap size to certain point that allocation performance starts to dip. |
I started a discourse thread at https://discourse.llvm.org/t/coro-using-lifetime-markers-to-allocate-values-on-the-stack-vs-the-frame/78451 |
Closing in favor of #90265 |
Certain passes such as SimplifyCFG may remove existing lifetime.end intrinsics on presplit coroutines. CoroSplitPass then incorrectly assumes that these allocas may not persist across the coroutine suspension point. To fix this, computeShouldLiveOnFrame() now saves allocas on the coroutine frame object if their addresses are escaped and those allocas don't have a corresponding lifetime.end.
Fixes #86580