forked from chapuni/llvm
-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
simplifycfg: Fix integer overflow converting switch into icmp. #7
Merged
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
If a switch instruction has a case for every possible value of its type, with the same successor, SimplifyCFG would replace it with an icmp ult, but the computation of the bound overflows in that case, which inverts the test. Patch by Jed Davis! git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179587 91177308-0d34-0410-b5e6-96231b3b80d8
The attribution is a bit odd due to the git→svn→git round trip, but this is the patch I sent to llvm-commits here: http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20130408/171455.html |
Also: blocks rust-lang/rust#1271. |
brson
pushed a commit
that referenced
this pull request
May 25, 2013
Rather than just splitting the input type and hoping for the best, apply a bit more cleverness. Just splitting the types until the source is legal often leads to an illegal result time, which is then widened and a scalarization step is introduced which leads to truly horrible code generation. With the loop vectorizer, these sorts of operations are much more common, and so it's worth extra effort to do them well. Add a legalization hook for the operands of a TRUNCATE node, which will be encountered after the result type has been legalized, but if the operand type is still illegal. If simple splitting of both types ends up with the result type of each half still being legal, just do that (v16i16 -> v16i8 on ARM, for example). If, however, that would result in an illegal result type (v8i32 -> v8i8 on ARM, for example), we can get more clever with power-two vectors. Specifically, split the input type, but also widen the result element size, then concatenate the halves and truncate again. For example on ARM, To perform a "%res = v8i8 trunc v8i32 %in" we transform to: %inlo = v4i32 extract_subvector %in, 0 %inhi = v4i32 extract_subvector %in, 4 %lo16 = v4i16 trunc v4i32 %inlo %hi16 = v4i16 trunc v4i32 %inhi %in16 = v8i16 concat_vectors v4i16 %lo16, v4i16 %hi16 %res = v8i8 trunc v8i16 %in16 This allows instruction selection to generate three VMOVN instructions instead of a sequences of moves, stores and loads. Update the ARMTargetTransformInfo to take this improved legalization into account. Consider the simplified IR: define <16 x i8> @test1(<16 x i32>* %ap) { %a = load <16 x i32>* %ap %tmp = trunc <16 x i32> %a to <16 x i8> ret <16 x i8> %tmp } define <8 x i8> @test2(<8 x i32>* %ap) { %a = load <8 x i32>* %ap %tmp = trunc <8 x i32> %a to <8 x i8> ret <8 x i8> %tmp } Previously, we would generate the truly hideous: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: push {r7} mov r7, sp sub sp, sp, #20 bic sp, sp, #7 add r1, r0, #48 add r2, r0, #32 vld1.64 {d24, d25}, [r0:128] vld1.64 {d16, d17}, [r1:128] vld1.64 {d18, d19}, [r2:128] add r1, r0, #16 vmovn.i32 d22, q8 vld1.64 {d16, d17}, [r1:128] vmovn.i32 d20, q9 vmovn.i32 d18, q12 vmov.u16 r0, d22[3] strb r0, [sp, #15] vmov.u16 r0, d22[2] strb r0, [sp, #14] vmov.u16 r0, d22[1] strb r0, [sp, #13] vmov.u16 r0, d22[0] vmovn.i32 d16, q8 strb r0, [sp, #12] vmov.u16 r0, d20[3] strb r0, [sp, #11] vmov.u16 r0, d20[2] strb r0, [sp, #10] vmov.u16 r0, d20[1] strb r0, [sp, #9] vmov.u16 r0, d20[0] strb r0, [sp, #8] vmov.u16 r0, d18[3] strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] vldmia sp, {d16, d17} vmov r0, r1, d16 vmov r2, r3, d17 mov sp, r7 pop {r7} bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: push {r7} mov r7, sp sub sp, sp, #12 bic sp, sp, #7 vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d20, d21}, [r0:128] vmovn.i32 d18, q8 vmov.u16 r0, d18[3] vmovn.i32 d16, q10 strb r0, [sp, #3] vmov.u16 r0, d18[2] strb r0, [sp, #2] vmov.u16 r0, d18[1] strb r0, [sp, #1] vmov.u16 r0, d18[0] strb r0, [sp] vmov.u16 r0, d16[3] strb r0, [sp, #7] vmov.u16 r0, d16[2] strb r0, [sp, #6] vmov.u16 r0, d16[1] strb r0, [sp, #5] vmov.u16 r0, d16[0] strb r0, [sp, #4] ldm sp, {r0, r1} mov sp, r7 pop {r7} bx lr Now, however, we generate the much more straightforward: .syntax unified .section __TEXT,__text,regular,pure_instructions .globl _test1 .align 2 _test1: @ @test1 @ BB#0: add r1, r0, #48 add r2, r0, #32 vld1.64 {d20, d21}, [r0:128] vld1.64 {d16, d17}, [r1:128] add r1, r0, #16 vld1.64 {d18, d19}, [r2:128] vld1.64 {d22, d23}, [r1:128] vmovn.i32 d17, q8 vmovn.i32 d16, q9 vmovn.i32 d18, q10 vmovn.i32 d19, q11 vmovn.i16 d17, q8 vmovn.i16 d16, q9 vmov r0, r1, d16 vmov r2, r3, d17 bx lr .globl _test2 .align 2 _test2: @ @test2 @ BB#0: vld1.64 {d16, d17}, [r0:128] add r0, r0, #16 vld1.64 {d18, d19}, [r0:128] vmovn.i32 d16, q8 vmovn.i32 d17, q9 vmovn.i16 d16, q8 vmov r0, r1, d16 bx lr git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179989 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton
pushed a commit
to alexcrichton/llvm
that referenced
this pull request
Aug 15, 2018
…d VPlan for tests." Memory leaks in tests. http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/6289/steps/check-llvm%20asan/logs/stdio Direct leak of 192 byte(s) in 1 object(s) allocated from: #0 0x554ea8 in operator new(unsigned long) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/projects/compiler-rt/lib/asan/asan_new_delete.cc:106 brson#1 0x56cef1 in llvm::VPlanTestBase::doAnalysis(llvm::Function&) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:53:14 brson#2 0x56bec4 in llvm::VPlanTestBase::buildHCFG(llvm::BasicBlock*) /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanTestBase.h:57:3 brson#3 0x571f1e in llvm::(anonymous namespace)::VPlanHCFGTest_testVPInstructionToVPRecipesInner_Test::TestBody() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/unittests/Transforms/Vectorize/VPlanHCFGTest.cpp:119:15 brson#4 0xed2291 in testing::Test::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc brson#5 0xed44c8 in testing::TestInfo::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2656:11 brson#6 0xed5890 in testing::TestCase::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:2774:28 brson#7 0xef3634 in testing::internal::UnitTestImpl::RunAllTests() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc:4649:43 brson#8 0xef27e0 in testing::UnitTest::Run() /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/src/gtest.cc #9 0xebbc23 in RUN_ALL_TESTS /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/googletest/include/gtest/gtest.h:2233:46 #10 0xebbc23 in main /b/sanitizer-x86_64-linux-bootstrap/build/llvm/utils/unittest/UnitTestMain/TestMain.cpp:51 #11 0x7f65569592e0 in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x202e0) and more. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336718 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton
pushed a commit
to alexcrichton/llvm
that referenced
this pull request
Aug 15, 2018
…ering" This reverts commit r337021. WARNING: MemorySanitizer: use-of-uninitialized-value #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7 brson#1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3 brson#2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3 brson#3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18 brson#4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23 brson#5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5 brson#6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5 brson#7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3 brson#8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26 [...] Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv' #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337079 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton
pushed a commit
to alexcrichton/llvm
that referenced
this pull request
Aug 15, 2018
Floating-point trigonometric multiply-add coefficient, e.g. ftmad z0.h, z0.h, z1.h, brson#7 with variants for 16, 32 and 64-bit elements. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337533 91177308-0d34-0410-b5e6-96231b3b80d8
alexcrichton
pushed a commit
to alexcrichton/llvm
that referenced
this pull request
Aug 15, 2018
Summary: Currently, in line with GCC, when specifying reserved registers like sp or pc on an inline asm() clobber list, we don't always preserve the original value across the statement. And in general, overwriting reserved registers can have surprising results. For example: ``` extern int bar(int[]); int foo(int i) { int a[i]; // VLA asm volatile( "mov r7, brson#1" : : : "r7" ); return 1 + bar(a); } ``` Compiled for thumb, this gives: ``` $ clang --target=arm-arm-none-eabi -march=armv7a -c test.c -o - -S -O1 -mthumb ... foo: .fnstart @ %bb.0: @ %entry .save {r4, r5, r6, r7, lr} push {r4, r5, r6, r7, lr} .setfp r7, sp, #12 add r7, sp, #12 .pad brson#4 sub sp, brson#4 movs r1, brson#7 add.w r0, r1, r0, lsl brson#2 bic r0, r0, brson#7 sub.w r0, sp, r0 mov sp, r0 @app mov.w r7, brson#1 @NO_APP bl bar adds r0, brson#1 sub.w r4, r7, #12 mov sp, r4 pop {r4, r5, r6, r7, pc} ... ``` r7 is used as the frame pointer for thumb targets, and this function needs to restore the SP from the FP because of the variable-length stack allocation a. r7 is clobbered by the inline assembly (and r7 is included in the clobber list), but LLVM does not preserve the value of the frame pointer across the assembly block. This type of behavior is similar to GCC's and has been discussed on the bugtracker: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=11807 . No consensus seemed to have been reached on the way forward. Clang behavior has briefly been discussed on the CFE mailing (starting here: http://lists.llvm.org/pipermail/cfe-dev/2018-July/058392.html). I've opted for following Eli Friedman's advice to print warnings when there are reserved registers on the clobber list so as not to diverge from GCC behavior for now. The patch uses MachineRegisterInfo's target-specific knowledge of reserved registers, just before we convert the inline asm string in the AsmPrinter. If we find a reserved register, we print a warning: ``` repro.c:6:7: warning: inline asm clobber list contains reserved registers: R7 [-Winline-asm] "mov r7, brson#1" ^ ``` Reviewers: eli.friedman, olista01, javed.absar, efriedma Reviewed By: efriedma Subscribers: efriedma, eraman, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D49727 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339257 91177308-0d34-0410-b5e6-96231b3b80d8
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
If a switch instruction has a case for every possible value of its type,
with the same successor, SimplifyCFG would replace it with an icmp ult,
but the computation of the bound overflows in that case, which inverts
the test.
Patch by Jed Davis!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@179587 91177308-0d34-0410-b5e6-96231b3b80d8