-
Notifications
You must be signed in to change notification settings - Fork 13.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WebAssembly] Crash in WebAssemblyCFGStackify
with optimization (bad machine code)
#126916
Comments
WebAssemblyCFGStackify
optimization (bad machine code)WebAssemblyCFGStackify
with optimization (bad machine code)
@llvm/issue-subscribers-backend-webassembly Author: None (stevenwdv)
Clang crashes while optimizing some WebAssembly machine code (both 64-bit and 32-bit). This didn't happen when compiling the same code without optimizations.
When using
</details> Here are the files referenced in the error: RxTimeout-38846c.zip For reference, here's what you get without
</details> This might be related to #63182, not sure. |
Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g. `invoke`s) can have multiple possible unwind destinations. For example: ```ll entry: invoke void @foo() to label %cont unwind label %catch.dispatch catch.dispatch: ; preds = %entry %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ; preds = %catch.dispatch %1 = catchpad within %0 [ptr null] ... terminate: ; preds = %catch.dispatch %2 = catchpad within none [] ... ... ``` In this case, if an exception is not caught by `catch.dispatch` (and thus `catch.start`), it should next unwind to `terminate`. `findUnwindDestination` in ISel gathers the list of this unwind destinations traversing the unwind edges: https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2089-L2150 But we don't use that, and instead use our custom `findWasmUnwindDestinations` that only adds the first unwind destination, `catch.start`, to the successor list of `entry`, and not `terminate`: https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2037-L2087 The reason behind it was, as described in the comment block in the code, it was assumed that there always would be an `invoke` that connects `catch.start` and `terminate`. In case of `catch (type)`, there will be `call void @llvm.wasm.rethrow()` in `catch.start`'s predecessor that unwinds to the next destination. For example: https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L429-L430 In case of `catch (...)`, `__cxa_end_catch` can throw, so it becomes an `invoke` that unwinds to the next destination. For example: https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L537-L538 So the unwind ordering relationship between `catch.start` and `terminate` here would be preserved. But turns out this assumption does not always hold. For example: ```ll entry: invoke void @foo() to label %cont unwind label %catch.dispatch catch.dispatch: ; preds = %entry %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ; preds = %catch.dispatch %1 = catchpad within %0 [ptr null] ... call void @_ZSt9terminatev() unreachable terminate: ; preds = %catch.dispatch %2 = catchpad within none [] call void @_ZSt9terminatev() unreachable ... ``` In this case there is no `invoke` that connects `catch.start` to `terminate`. So after `catch.dispatch` BB is removed in ISel, `terminate` is considered unreachable and incorrectly removed in DCE. This makes Wasm just use the general `findUnwindDestination`. In that case `entry`'s successor is going to be [`catch.start`, `terminate`]. We can get the first unwind destination by just traversing the list from the front. --- This required another change in WinEHPrepare. WinEHPrepare demotes all PHIs in EH pads because they are funclets in Windows and funclets can't have PHIs. When used in Wasm they are not funclets so we don't need to do that wholesale but we still need to demote PHIs in `catchswitch` BBs because they are deleted during ISel. But turns out we need to remove PHIs that have a `catchswitch` BB as an incoming block too: ```ll ... catch.dispatch: %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ... somebb: ... ehcleanup ; preds = %catch.dispatch, %somebb %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ] ... ``` In this case the `phi` in `ehcleanup` BB should be demoted too because `catch.dispatch` BB will be removed in ISel so one if its incoming block will be gone. This pattern didn't manifest before presumably due to how `findWasmUnwindDestinations` worked. (In this example, in our `findWasmUnwindDestinations`, `catch.dispatch` would have had only one successor, `catch.start`. But now `catch.dispatch` has both `catch.start` and `ehcleanup` as successors, revealing this bug. This case is represented by `rethrow_terminator` function in `exception.ll` (or `exception-legacy.ll`) and without this fix it will crash. --- Discovered by the reproducer provided in llvm#126916, even though the bug reported there was not this one.
Unlike in Itanium EH IR, WinEH IR's unwinding instructions (e.g. `invoke`s) can have multiple possible unwind destinations. For example: ```ll entry: invoke void @foo() to label %cont unwind label %catch.dispatch catch.dispatch: ; preds = %entry %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ; preds = %catch.dispatch %1 = catchpad within %0 [ptr null] ... terminate: ; preds = %catch.dispatch %2 = catchpad within none [] ... ... ``` In this case, if an exception is not caught by `catch.dispatch` (and thus `catch.start`), it should next unwind to `terminate`. `findUnwindDestination` in ISel gathers the list of this unwind destinations traversing the unwind edges: https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2089-L2150 But we don't use that, and instead use our custom `findWasmUnwindDestinations` that only adds the first unwind destination, `catch.start`, to the successor list of `entry`, and not `terminate`: https://github.com/llvm/llvm-project/blob/ae42f071032b29821beef6a33771258086bbbb1c/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp#L2037-L2087 The reason behind it was, as described in the comment block in the code, it was assumed that there always would be an `invoke` that connects `catch.start` and `terminate`. In case of `catch (type)`, there will be `call void @llvm.wasm.rethrow()` in `catch.start`'s predecessor that unwinds to the next destination. For example: https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L429-L430 In case of `catch (...)`, `__cxa_end_catch` can throw, so it becomes an `invoke` that unwinds to the next destination. For example: https://github.com/llvm/llvm-project/blob/0db702ac8e06911478615ac537f75ac778817c04/llvm/test/CodeGen/WebAssembly/exception.ll#L537-L538 So the unwind ordering relationship between `catch.start` and `terminate` here would be preserved. But turns out this assumption does not always hold. For example: ```ll entry: invoke void @foo() to label %cont unwind label %catch.dispatch catch.dispatch: ; preds = %entry %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ; preds = %catch.dispatch %1 = catchpad within %0 [ptr null] ... call void @_ZSt9terminatev() unreachable terminate: ; preds = %catch.dispatch %2 = catchpad within none [] call void @_ZSt9terminatev() unreachable ... ``` In this case there is no `invoke` that connects `catch.start` to `terminate`. So after `catch.dispatch` BB is removed in ISel, `terminate` is considered unreachable and incorrectly removed in DCE. This makes Wasm just use the general `findUnwindDestination`. In that case `entry`'s successor is going to be [`catch.start`, `terminate`]. We can get the first unwind destination by just traversing the list from the front. --- This required another change in WinEHPrepare. WinEHPrepare demotes all PHIs in EH pads because they are funclets in Windows and funclets can't have PHIs. When used in Wasm they are not funclets so we don't need to do that wholesale but we still need to demote PHIs in `catchswitch` BBs because they are deleted during ISel. (So we created [`-demote-catchswitch-only`](https://github.com/llvm/llvm-project/blob/a5588b6d20590a10db0f1a2046fba4d9f205ed68/llvm/lib/CodeGen/WinEHPrepare.cpp#L57-L59) option for that) But turns out we need to remove PHIs that have a `catchswitch` BB as an incoming block too: ```ll ... catch.dispatch: %0 = catchswitch within none [label %catch.start] unwind label %terminate catch.start: ... somebb: ... ehcleanup ; preds = %catch.dispatch, %somebb %1 = phi i32 [ 10, %catch.dispatch ], [ 20, %somebb ] ... ``` In this case the `phi` in `ehcleanup` BB should be demoted too because `catch.dispatch` BB will be removed in ISel so one if its incoming block will be gone. This pattern didn't manifest before presumably due to how `findWasmUnwindDestinations` worked. (In this example, in our `findWasmUnwindDestinations`, `catch.dispatch` would have had only one successor, `catch.start`. But now `catch.dispatch` has both `catch.start` and `ehcleanup` as successors, revealing this bug. This case is [represented](https://github.com/llvm/llvm-project/blob/ab87206c4b95aa0b5047facffb5f78f7fe6ac269/llvm/test/CodeGen/WebAssembly/exception.ll#L445) by `rethrow_terminator` function in `exception.ll` (or `exception-legacy.ll`) and without the WinEHPrepare fix it will crash. --- Discovered by the reproducer provided in #126916, even though the bug reported there was not this one.
`MachineVerifier` requires that a virtual register's `LiveInterval` has only one connected component: https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/CodeGen/MachineVerifier.cpp#L3923-L3936 (This is different from SSA; there can be multiple defs for a register, but those live intervals should somehow be _connected_. I am not very sure why this rule exists, but this rule apparently has been there forever since llvm@260fa28.) --- And it turns out our `TEE` generation in RegStackify can create virtual registers with `LiveInterval` with multiple disconnected segments. This is how `TEE` generation works: https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L613-L632 But it is possible that `Reg = INST` can also use `Reg`: ``` 0 Reg = ... ... 1 Reg = INST ..., Reg, ... // It both defs and uses 'Reg' 2 INST ..., Reg, ... 3 INST ..., Reg, ... 4 INST ..., Reg, ... ``` In this pseudocode, `Reg`'s live interval is `[0r,1r),[1r:4r)`, which has two segments that are connected. But after the `TEE` transformation, ``` 0 Reg = ... ... 1 DefReg = INST ..., Reg, ... 2 TeeReg, Reg = TEE_... DefReg 3 INST ..., TeeReg, ... 4 INST ..., Reg, ... 5 INST ..., Reg, ... ``` now `%0`'s live interval is `[0r,1r),[2r,4r)`, which are not connected anymore. In many cases, these split segments are connected by another segment generated by a `PHI` (or a block start boundary that used to be a `PHI`). For example, if there is a loop, ``` bb.0: successors: %bb.1 0 Reg = INST ... ... bb.1: ; predecessors: %bb.1 successors: %bb.1, %bb.2 1 DefReg = INST ..., Reg, ... 2 TeeReg, Reg = TEE_... DefReg 3 INST ..., TeeReg, ... 4 INST ..., Reg, ... 5 INST ..., Reg, ... 6 BR_IF bb.1 bb.2: ; predecessors: %bb.1 7 INST ... ``` The live interval would be `[0r,1B),[1B,1r),[2r,7B)` and these three segments are classified as a single equivalence class because of the segment `[1B,1r)` starting at the merging point (which used to be a `PHI`) at the start of bb.1. But this kind of connection is not always guaranteed. The method that determines whether all components are connected, i.e., there is a single equivalence class in a `LiveRange`, is `ConnectedVNInfoEqClasses::Classify`. --- In RegStackify, there is a routine that splits live intervals into multiple registers in some cases: https://github.com/llvm/llvm-project/blob/f4043f451d0e8c30c8a9826ce87a6e76f3ace468/llvm/lib/Target/WebAssembly/WebAssemblyRegStackify.cpp#L511-L517 It looks this was copied from https://github.com/llvm/llvm-project/blob/dc9a183ac6aa2d087ceac56970255b06c4772ca3/llvm/lib/CodeGen/RegisterCoalescer.cpp#L341-L353. But `LiveIntervals::shrinkToUses` does not return true for all unconnected live intervals. I don't understand all the details of that function or why `RegisterCoalescer::shrinkToUses` was written that way, but it looks it returns true when there are some dead components: https://github.com/llvm/llvm-project/blob/926d980017d82dedb9eb50147a82fdfb01659f16/llvm/lib/CodeGen/LiveIntervals.cpp#L537-L540 And the case in the attached test case does not return true here, but still has multiple unconnected components and does not pass `MachineVerifier` unless they are split. --- So this PR runs `LiveIntervals::splitSeparateComponents` regardless of the return value of `LiveIntervals::shrinkToUses`. `splitSeparateComponents` won't do anything unless there are multiple unconnected components to split. --- This is one of the bugs reported in llvm#126916.
This effectively reverts llvm@61d22f2. llvm@61d22f2 claims that when the predecessor unwinds to an EH pad, we should take the live-out values of not the predecessor's terminator but the last call. But if the call has a possibility to unwind to an EH pad, it means it was an `invoke` before and it should either be the terminating instruction of a BB or be right before a branch, which is the terminator. If a call is in a middle of a BB, that means the call was not an `invoke` before and it unwinds to the caller if it throws. This can fail in the case when there are unwinding instructions that are not calls, such as `CLEANUPRET` or other target-specific throwing instructions (e.g. Wasm's `throw`/`rethrow`/`throw_ref`). For example, as in the test case attached, ```mir bb.2 (landing-pad): ... CALL @foo %0 = CONST_I32 0 CLEANUPRET %bb.3 bb.3 (landing-pad): call @bar, %0 ... ``` When examining the live range of `bb.3`'s `%0`, we should check the live-outs at `CLEANUPRET`, and not `CALL @foo`. But because of this checking routine this PR deletes, this ends up passing `CLEANUPRET` and `%0 = CONST_I32 0` and search for `%0`'s liveout at `CALL @foo`, which is incorrect and results in a validation error. ```cpp // Predecessor of landing pad live-out on last call. if (MFI->isEHPad()) { for (const MachineInstr &MI : llvm::reverse(*Pred)) { if (MI.isCall()) { PEnd = Indexes->getInstructionIndex(MI).getBoundaryIndex(); break; } } } ``` Also I'm not sure if I understand the description about `llvm.experimental.gc.statepoint` intrinsic given in llvm@61d22f2. There doesn't seem to be any special thing about that intrinsic; if it can unwind to an EH pad, it would be at the end of a BB. If it unwinds to the caller, it shouldn't be where we take the live-outs from. Both the test case `statepoint-invoke-ra1.ll` given in that commit, and also https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/X86/statepoint-invoke-ra.mir that replaced it in llvm@2e1150d, seem to pass even with this PR (because it was an `invoke` and at the end of a BB anyway). --- This also deletes `windows-seh-EHa-RegisterLiveness.ll `, a test case added in llvm#76921. It looks that test case was added to track a MachineVerifier bug, which I guess may be the same thing this PR is trying to fix. After this PR, that test unexpectedly succeeds, which I think means we can delete it. (There is also another PR (llvm#76933) that tried to fix this bug in another way, which still has not landed.) --- This bug is discovered from the reproducer in llvm#126916, even though this is not the bug reported there.
Clang crashes while optimizing some WebAssembly machine code (both 64-bit and 32-bit). This didn't happen when compiling the same code without optimizations, also not when enabling
--verify-machineinstrs
.When using
--verify-machineinstrs
, I getBad machine code: Multiple connected components in live interval
.Here's the log (I'm using a Debug build of Clang to get better stack traces etc.):
Log
Here are the files referenced in the error: RxTimeout-38846c.zip
For reference, here's what you get without
--verify-machineinstrs
(a SIGSEGV):Log
This might be related to #63182, not sure.
The text was updated successfully, but these errors were encountered: