-
Notifications
You must be signed in to change notification settings - Fork 258
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
r23~b6 miscompiles some coroutines (fixed upstream, mere days after point of last merge) #1540
Comments
Thanks for all the details! Coroutines support is still incomplete in Clang, so we won't spin another RC for this but it's unavoidable that we'll need an r23b release and we'll pick this up then :) |
FWIW, it is "incomplete", but definitely usable? Like, I realize it says "partial", but I feel like they are being pedantic there (as they should, being compiler people): AFAIK Facebook has been using C++ coroutines in production as part of their Folly library for a long while now (though probably only on servers). My project makes rather heavy use of coroutines, on Android even in specific, since 2019; so, I can attest to them seemingly working great at least as far back as r19. Regardless: I mostly have everyone compile our code using r21 anyway--as it is an LTS, which I'm super glad you provide!! :D--and have only recently been dipping my feet in r22, so I'm not like, super sad at your response or anything ;P. Thanks for paying attention! |
https://android-review.googlesource.com/c/toolchain/llvm_android/+/1817582 cherry-picks the fix to r23b toolchain. |
https://ci.android.com/builds/branches/aosp-master-ndk/grid?head=7732824&tail=7732824 has the fix. I wasn't able to repro the crash (guessing at build flags didn't get me there) so I can't verify that it fixed the issue. |
OK, so before I had just been using my project's giant build system to do the compile, and I sadly deleted the folder that I had been doing the bisection work in a month later (and so a month ago), so I experienced first-hand the complexity of my "unhelpful.zip" ;P. I'm sorry I didn't spend the time before to whittle down the build system (as it didn't even take me long to do just now... I was just feeling burned out after spending days bisecting before). @DanAlbert The secret is (apparently) using -Os. So, here, using lewissbaker/cppcoro@a87e97f (the current version) and llvm-mirror/libcxx@2076f53 (the tip of the release_90 branch, which I appreciate is an old version), here is a single call of clang that replicates the issue.
I've re-verified that this works on r21/r22--as well as Ubuntu's clang++-1[012]--and fails on r23~b6 (as before); I then verified it failed on the official release of r23 (as expected)... but, sadly, it is still failing on this r24 canary build :(. |
Can you double check? For me (on Linux), |
@pirama-arumuga-nainar The canary build @DanAlbert linked to (which is the first canary I've ever downloaded: I didn't know this was a thing I could do before, so I couldn't have anything mixed up locally) is reporting as "r416183c", which is missing the "1" at the end?
FWIW, it just doesn't seem to have that patch (so at least that's consistent? ;P).
# git blame c935d99d7cf2016289302412d708641d52d2f7ee -- llvm/lib/Transforms/Coroutines/CoroFrame.cpp | grep -nC3 CouldMerge
Blaming lines: 100% (2304/2304), done.
596-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 596) // NonOverlappedAllocaSet.
597-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 597) for (auto &AllocaSet : NonOverlapedAllocas) {
598-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 598) assert(!AllocaSet.empty() && "Processing Alloca Set is not empty.\n");
599:b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 599) bool CouldMerge = none_of(AllocaSet, [&](auto Iter) {
600-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 600) return IsAllocaInferenre(Alloca, Iter);
601-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 601) });
602:b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 602) if (!CouldMerge)
603-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 603) continue;
604-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 604) AllocaIndex[Alloca] = AllocaIndex[*AllocaSet.begin()];
605-b3a722e66b753 (Chuanqi Xu 2020-09-28 15:40:35 +0800 605) AllocaSet.push_back(Alloca); ^ This is the old code, per the patch. v diff --git a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
index 26d4ea482ffb..e2642234ac9b 100644
--- a/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
+++ b/llvm/lib/Transforms/Coroutines/CoroFrame.cpp
@@ -596,9 +596,21 @@ void FrameTypeBuilder::addFieldForAllocas(const Function &F,
// NonOverlappedAllocaSet.
for (auto &AllocaSet : NonOverlapedAllocas) {
assert(!AllocaSet.empty() && "Processing Alloca Set is not empty.\n");
- bool CouldMerge = none_of(AllocaSet, [&](auto Iter) {
+ bool NoInference = none_of(AllocaSet, [&](auto Iter) {
return IsAllocaInferenre(Alloca, Iter);
});
+ // If the alignment of A is multiple of the alignment of B, the address
+ // of A should satisfy the requirement for aligning for B.
+ //
+ // There may be other more fine-grained strategies to handle the alignment
+ // infomation during the merging process. But it seems hard to handle
+ // these strategies and benefit little.
+ bool Alignable = [&]() -> bool {
+ auto *LargestAlloca = *AllocaSet.begin();
+ return LargestAlloca->getAlign().value() % Alloca->getAlign().value() ==
+ 0;
+ }();
+ bool CouldMerge = NoInference && Alignable;
if (!CouldMerge)
continue;
AllocaIndex[Alloca] = AllocaIndex[*AllocaSet.begin()]; |
Huh. That canary build is actually showing as the same commit as r23.
|
OK, so I am not sure I understand what that commit is referencing anymore? :( Shouldn't I be able to rebuild the toolchain using that commit? I was just digging around in your CI system and pulled an artifact for the most recent finished build (7740932), which 1) has the "1" after the "c"; 2) works! \o/; but... 3) also claims to be the same llvm-project commit?...
@pirama-arumuga-nainar Regardless: it does seem like "r416183c1" (with the "1") has the fix for the issue, and I can confirm it works for me ;P. |
@DanAlbert I think you may have to fix the link to the canary build here :)
That's expected - it's the base revision where we branched llvm-project. We apply cherry-picks, including the one for this issue, on top of that revision during build. |
D'oh! Yeah, my mistake. That build updates the sysroot. https://ci.android.com/builds/branches/aosp-master-ndk/grid?head=7740141&tail=7740141 updates the compiler. Sorry about that. |
Test: ./checkbuild.py && ./run_tests.py Bug: android/ndk#1540 Bug: android/ndk#1551 Bug: android/ndk#1555 Change-Id: I2d345546457d8c461b603be4ea9e725230f11af9 (cherry picked from commit 2af6bc5)
Bug: android/ndk#1540 c1bc7981bab [Coroutine] Remain alignment information when merging frame variables Test: N/A Change-Id: I1dc66055502a4eaace36c2ed6aab05a1f0f68c22
Fixed in r23b. |
r23~b6 is https://android.googlesource.com/toolchain/llvm-project/+/c935d99d7cf2016289302412d708641d52d2f7ee, aka upstream llvm/llvm-project@8456c3a, made on Fri Jan 15 22:53:25 2021 -0800.
My project is crashing with this build of the NDK (and r23b5; earlier betas untested), but not with any other copy of clang I've tried (including other clang-12s). I'm hoping this r23 doesn't become the final build ;P.
I whittled down a still-very-complex test case and bisected the compiler to figure out that this was fixed by upstream mere days later, in llvm/llvm-project@c1bc798, made on Wed Jan 20 18:59:00 2021 +0800.
The upstream bug tracking this issue was https://bugs.llvm.org/show_bug.cgi?id=48712 and the commit which caused the issue was llvm/llvm-project@b3a722e, made on Mon Sep 28 15:48:00 2020 +0800.
(I've uploaded my probably-unhelpful reproduction--which I failed to whittle cppcoro entirely out of--mostly for my own organizational purposes; but, given the LLVM issue, I hope it doesn't matter: unhelpful.zip.)
The text was updated successfully, but these errors were encountered: