Ninja build #7
…, undef -> X transforms" and subsequent patches This reverts most of the following patches due to reports of miscompiles. I've left the added test cases with comments updated to be FIXMEs. 1cf6f210a2e [IR] Disable select ? C : undef -> C fold in ConstantFoldSelectInstruction unless we know C isn't poison. 469da663f2d [InstSimplify] Re-enable select ?, undef, X -> X transform when X is provably not poison 122b0640fc9 [InstSimplify] Don't fold vectors of partial undef in SimplifySelectInst if the non-undef element value might produce poison ac0af12ed2f [InstSimplify] Add test cases for opportunities to fold select ?, X, undef -> X when we can prove X isn't poison 9b1e95329af [InstSimplify] Remove select ?, undef, X -> X and select ?, X, undef -> X transforms (cherry picked from commit 00f3579aea6e3d4a4b7464c3db47294f71cef9e4)
We need to specify legal integer widths to trigger PR46712, so add those here. This doesn't appear to affect any existing tests, and it's not clear why a datalayout would not include any legal integer widths. While here, change some variable names that include 'tmp' to avoid warnings from the auto-generating script for CHECK lines. (cherry picked from commit efc30e591bb5a6e869fd8e084bd310ae516b0fae)
I'm not sure if the test is truly minimal, but we need to induce a situation where a value becomes a constant but is not immediately folded before getting to the 'or' transform. (cherry picked from commit d8b268680d0858aaf30cb1a278b64b11361bc780)
…gnment assumptions" due to the performance bugs filed in https://bugs.llvm.org/show_bug.cgi?id=46753. An SROA change soon may obviate some of these problems. This reverts commit 8d09f20798ac180b1749276bff364682ce0196ab. (cherry picked from commit 7bfaa40086359ed7e41c862ab0a65e0bb1be0aeb)
(cherry picked from commit 9adf7461f721170419058684a8d3f9228d641d59)
…by combineAdd and combineSub. There was a lot of duplicate code here for checking the VT and subtarget. Moving it into a helper avoids that. It also fixes a bug that combineAdd reused Op0/Op1 after a call to isHorizontalBinOp may have changed it. The new helper function has its own local version of Op0/Op1 that aren't shared by other code. Fixes PR46455. Reviewed By: spatel, bkramer Differential Revision: https://reviews.llvm.org/D83971 (cherry picked from commit 5408024fa87e0b23b169fec07913bd4357acdbc4)
The flag is off by default. (cherry picked from commit 033ef8420cec57187fffac1f06322f73aa945c4c)
This is brought up in https://reviews.llvm.org/D83915. We would like to remove some feature in PowerPC. We did send an RFC before, but we think it might be a better idea to indicate the planned removal in the Release Notes for version 11 and the actual removal in those for version 12. Reviewed By: hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D83968
This function has a bug which will incorrectly reschedule instructions after an INLINEASM_BR (which can branch). (The bug may also allow scheduling past a throwing-CALL, I'm not certain.) I could fix that bug, but, as the removed FIXME notes, it's better to attempt rescheduling before converting to 3-addr form, as that may remove the need to convert in the first place. In fact, the code to do such reordering was added to this pass only a few months later, in 2011, via the addition of the function rescheduleMIBelowKill. That code does not contain the same bug. The removal of the sink3AddrInstruction function is not a no-op: in some cases it would move an instruction post-conversion, when rescheduleMIBelowKill would not move the instruction pre-conversion. However, this does not appear to be important: the machine instruction scheduler can reorder the after-conversion instructions, in any case. This patch fixes a kernel panic in 4.4 LTS x86_64 Linux kernels, when built with clang after 4b0aa5724feaa89a9538dcab97e018110b0e4bc3. Link: ClangBuiltLinux/linux#1085 Differential Revision: https://reviews.llvm.org/D83708 (cherry picked from commit 60433c63acb71935111304d71e41b7ee982398f8)
This suppresses `failed to compute relocation: R_PPC_REL32, Invalid data was encountered while parsing the file` and its 64-bit variants when running llvm-dwarfdump on a PowerPC object file with .eh_frame Unfortunately it is difficult to test the computation: DWARFDataExtractor::getEncodedPointer does not use the relocated value and even if it does, we need to teach llvm-dwarfdump --eh-frame to do some linker job to report a reasonable address. (cherry picked from commit b922004ea29d54534c4f09b9cfa655bf5f3360f0)
Code from D83800 by Yichao Yu (cherry picked from commit 3073a3aa1ef1ce8c9cac9b97a8e5905dd8779e16)
…abels

```
define i32 @test(i1 %cond) {
entry:
  br i1 %cond, label %exit, label %exit
exit:
  %result = select i1 %cond, i32 123, i32 456
  ret i32 %result
}
```

In this test, after applying the transformation that replaces the select with a phi, the result is:

```
define i32 @test(i1 %cond) {
entry:
  br i1 %cond, label %exit, label %exit
exit:
  %result = i32 phi [123, %exit], [123, %exit]
  ret i32 %result
}
```

That is, the select is transformed into an invalid phi, which is then reduced to 123, and the second value is lost. Note that this problem arises only if the select comes before the branch in the InstCombine worklist. Otherwise, InstCombine replaces the branch condition with false and the transformation is not applied. The fix is to check the branch's target labels for equality. Patch By: Kirill Polushin Differential Revision: https://reviews.llvm.org/D84003 Reviewed By: mkazantsev (cherry picked from commit c98988107868db41c12b9d782fae25dea2a81c87)
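The same-label hazard described in this commit message can be illustrated with a toy model. This is only a sketch in Python, not LLVM code: `CondBranch`, `naive_edge_values`, and `checked_edge_values` are hypothetical names invented for the illustration, and the real InstCombine logic works on dominating edges rather than a dictionary.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class CondBranch:
    """Toy stand-in for `br i1 %cond, label %T, label %F` (hypothetical, not an LLVM class)."""
    true_target: str
    false_target: str


def naive_edge_values(branch: CondBranch, true_value: int, false_value: int) -> Dict[str, int]:
    # One entry per successor label. If both labels are identical, the two
    # entries collapse into one and one of the select operands is lost.
    return {branch.true_target: true_value, branch.false_target: false_value}


def checked_edge_values(branch: CondBranch, true_value: int, false_value: int) -> Optional[Dict[str, int]]:
    # Mirror of the fix described above: if both targets are the same label,
    # the edges cannot be told apart, so give up and keep the select.
    if branch.true_target == branch.false_target:
        return None
    return naive_edge_values(branch, true_value, false_value)
```

With `CondBranch("exit", "exit")` the naive mapping has a single entry, while the checked variant returns `None` and leaves the select in place.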
…ranch has the same labels An additional test that allows to check the correctness of handling the case of the same branch labels in the dominator when trying to replace select with phi-node. Patch By: Kirill Polushin Differential Revision: https://reviews.llvm.org/D84006 Reviewed By: mkazantsev (cherry picked from commit df6e185e8f895686510117301e568e5043909b66)
Summary: 1. gcc uses the `-march` and `-mtune` flags to choose the arch and pipeline model, but clang does not have an `-mtune` flag, so we use `-mcpu` to choose both. 2. Add the SiFive e31 and u54 CPUs, which have a default march and pipeline model. 3. Specifying `-mcpu` with rocket-rv[32|64] selects the pipeline model only, and uses the driver's arch-choosing logic to get the default arch. Reviewers: lenary, asb, evandro, HsiangKai Reviewed By: lenary, asb, evandro Tags: #llvm, #clang Differential Revision: https://reviews.llvm.org/D71124 (cherry picked from commit 294d1eae75bf8867821a4491f0d67445227f8470)
…urce register when the destination is a 64-bit register. Previously we only accepted a 32-bit source with a 64-bit dest. Accepting 64-bit as well is more consistent with gas behavior. I think maybe we should accept 16-bit registers as well, but I'm not sure. (cherry picked from commit 3c2a56a857227b6bc39285747269f02cd7a9dbe5)
… register. This matches GNU assembler behavior. Operand size is determined only from the destination register. (cherry picked from commit 71b49aa438b22b02230fff30e8874ff756336e6d)
Summary: Remove unused function Reviewed By: lbenes Differential Revision: https://reviews.llvm.org/D83898 (cherry picked from commit 47a3b85a97136fca4a388646cbaec10b71414b60)
The getAllOnesValue can only handle things that are bitcast from a ConstantInt, while here we bitcast through a pointer, so we may see more complex objects (like Array or Struct). Differential Revision: https://reviews.llvm.org/D83870 (cherry picked from commit 8b354cc8db413f596c95b4f3240fabaa3e2c931e)
…instead of .o

This matches LLD and fixes https://sourceware.org/bugzilla/show_bug.cgi?id=26262#c1

.o is a bad choice for save-temps output because it is easy to overwrite the bitcode file (*.o):

```
# Use bfd for the example, -fuse-ld=gold is similar.
clang -flto -c a.c  # generate bitcode file a.o
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps  # overwrite a.o
# The user repeats the command but gets surprised, because a.o is now a combined module.
clang -fuse-ld=bfd -flto a.o -o a -Wl,-plugin-opt=save-temps
```

Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D84132 (cherry picked from commit 55fa315b0352b63454206600d6803fafacb42d5e)
(cherry picked from commit aa830e9768303ff8d27c015759294c4ce704d50c)
(cherry picked from commit 817767abeec8343b20de83f8b1b2c8c20bbbe00a)
…ogue The current PowerPC backend generates a wrong code sequence when the stack pointer has to be realigned and -fstack-clash-protection is enabled. When probing in the prologue, the backend should generate a subtraction instruction rather than a `stux` instruction to realign the stack pointer. This patch is part of the fix for https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84218 (cherry picked from commit 8912252252c87d8ef6623ecf9fdde444560ee4b9)
…ing dynalloc The current PowerPC backend generates a wrong code sequence when the stack pointer has to be realigned and `-fstack-clash-protection` is enabled. When probing a dynamic stack allocation, the current `PREPARE_PROBED_ALLOCA` takes `NegSizeReg` as input and returns `FinalStackPtr`. `FinalStackPtr=StackPtr+ActualNegSize` is calculated correctly; however, the code following `PREPARE_PROBED_ALLOCA` still uses the value of `NegSizeReg`, which does not contain `ActualNegSize` if `MaxAlign > TargetAlign`, to calculate the loop trip count and the residual number of bytes. This patch is part of the fix for https://bugs.llvm.org/show_bug.cgi?id=46759. Differential Revision: https://reviews.llvm.org/D84152 (cherry picked from commit c3f9697f1f227296818fbaf1a770a29842ea454c)
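The arithmetic behind this bug can be sketched with plain integers. The following Python snippet is an illustration only, not the backend's code: `probed_alloca_plan` and `probe_size` are hypothetical names, and rounding the final stack pointer down to `MaxAlign` is an assumption about how the realignment works; the commit message itself only states that `FinalStackPtr=StackPtr+ActualNegSize`.

```python
def probed_alloca_plan(stack_ptr, neg_size, max_align, target_align, probe_size):
    """Return (final_sp, actual_neg_size, trip_count, residual) for a toy probed alloca.

    neg_size is the requested (non-positive) size; max_align is assumed to be
    a power of two. All names here are illustrative, not backend identifiers.
    """
    assert neg_size <= 0
    final_sp = stack_ptr + neg_size
    if max_align > target_align:
        # Assumption: realignment rounds the new stack pointer down to max_align.
        final_sp &= ~(max_align - 1)
    # The size that is actually allocated, which may be larger in magnitude
    # than the requested size once realignment is taken into account.
    actual_neg_size = final_sp - stack_ptr
    total = -actual_neg_size
    # Trip count and residual must be derived from the actual size, not neg_size.
    return final_sp, actual_neg_size, total // probe_size, total % probe_size
```

For example, with `stack_ptr=0x10008`, `neg_size=-0x2000`, `max_align=0x1000`, `target_align=16`, and `probe_size=0x1000`, the actual size is `-0x2008`, so the residual is 8 bytes. Computing the residual from the stale `neg_size` would give 0 and leave those bytes out of the calculation.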
This assert was added to verify the assumption that a GEP's SCEV will be of pointer type, based on the fact that it should be a SCEVAddExpr with (at least) the last operand being a pointer. Two notes: - A GEP's SCEV does not have to be a SCEVAddExpr after all simplifications; - In the current state, a GEP's SCEV does not have to have at least one pointer operand (all of them can become int during the transforms). However, we might want to get to a point where it is true. We are removing this assert for now and will try to enumerate the cases where the "is pointer" notion might be lost during the transforms. When all of them are fixed, we can restore it. Differential Revision: https://reviews.llvm.org/D84294 Reviewed By: lebedev.ri (cherry picked from commit b96114c1e1fc4448ea966bce013706359aee3fa9)
since it's failing.
(cherry picked from commit 13ae440de4a408cf9d1a448def09769ecbecfdf7)
Fixes https://bugs.llvm.org/show_bug.cgi?id=46680. Just like insertions through IRBuilder, InsertNewInstBefore() should be using the deferred worklist mechanism, so that processing of newly added instructions is prioritized. There's one side-effect of the worklist order change which could be classified as a regression. An add op gets pushed through a select that at the time is not a umax. We could add a reverse transform that tries to push adds in the reverse direction to restore a min/max, but that seems like a sure way of getting infinite loops... Seems like something that should best wait on min/max intrinsics. Differential Revision: https://reviews.llvm.org/D84109 (cherry picked from commit d12ec0f752e7f2c7f7252539da2d124264ec33f7)
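The idea of a deferred worklist, where newly added instructions are processed before older pending work, can be sketched as follows. This is a toy Python model, not LLVM's worklist class: `DeferredWorklist` and its method names are hypothetical, and the choice to flush deferred items in creation order is an assumption made for the illustration.

```python
class DeferredWorklist:
    """Toy worklist: newly created items are deferred, then processed first."""

    def __init__(self):
        self._stack = []      # main worklist, popped from the end
        self._deferred = []   # items created while processing, in creation order

    def push(self, inst):
        self._stack.append(inst)

    def add_deferred(self, inst):
        # New instructions are not pushed directly; they wait in the deferred list.
        self._deferred.append(inst)

    def _flush_deferred(self):
        # Push in reverse so that the earliest-created item is popped first.
        for inst in reversed(self._deferred):
            self._stack.append(inst)
        self._deferred.clear()

    def pop(self):
        # Deferred items move to the top of the stack before anything else is popped.
        self._flush_deferred()
        return self._stack.pop() if self._stack else None
```

If `a` and `b` are pending, and processing `b` creates `n1` and then `n2`, the subsequent pops yield `n1`, `n2`, and only then `a`.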
…VECTOR(X,0)) patterns. getTargetShuffleMask is used by the various "SimplifyDemanded" folds so we can't assume that the bypassed extract_subvector can be safely simplified - getFauxShuffleMask performs a more general decode that allows us to more safely catch many of these cases so the impact is minimal. (cherry picked from commit 5b5dc2442ac7a574a3b7d17c15ebeeb9eb3bec26)
…b asm instructions This patch provides optimization of bit manipulation operations by enabling the +experimental-b target feature. It adds matching of single-block patterns of instructions to specific bit-manip instructions from the base subset (zbb subextension) of the experimental B extension of RISC-V. It also adds the corresponding codegen tests. This patch is based on Claire Wolf's proposal for the bit manipulation extension of RISC-V: https://github.com/riscv/riscv-bitmanip/blob/master/bitmanip-0.92.pdf Differential Revision: https://reviews.llvm.org/D79870 (cherry picked from commit e2692f0ee7f338fea4fc918669643315cefc7678)
The test directory severely inflates the size of the AUR clone, and we're not even using the tests
Treat Zen3 as Zen2 until upstream adds Zen3 support
d84b3a9 to d5fadee
🥷
Removed some copying of files; rpcs3 seems to build just fine without them. Ninja already copies some include and cmake files to the output as well.
I forgot to ask, does it make any difference? I hesitated to merge due to the risk of uploading a non-functional release.
Well, it's supposed to build more quickly, but I'm getting mixed results since Azure is way too inconsistent.
1f23ff6 to 5836324
3f0e3cb to 05332b7
@xddxd Needs to be rebased.
d172639 to 9b52b6c
Building with ninja on Azure. Thanks @JohnHolmesII.