Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

A miscompilation with -C lto -O -C target-cpu=haswell #45145

Closed
leonardo-m opened this issue Oct 9, 2017 · 7 comments
Closed

A miscompilation with -C lto -O -C target-cpu=haswell #45145

leonardo-m opened this issue Oct 9, 2017 · 7 comments
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.

Comments

@leonardo-m
Copy link

This small program solves the Euler problem n.97:

#[inline(never)]
fn e97() -> u64 {
    // Input.
    const BASE: f64 = 28_433.0;
    const EXP: u64 = 7_830_457;
    const BIG: f64 = 1e10;

    let mut b = 1;
    while b < EXP { b <<= 1; }

    let mut n = 1.0;
    while b > 0 {
        if EXP & b != 0 {
            n = (n * n * 2.0) % BIG;
        } else {
            n = (n * n) % BIG;
        }
        b >>= 1;
    }

    ((n * BASE + 1.0) % BIG) as u64
}

fn main() {
    assert_eq!(e97(), 8_739_992_577);
}

If I compile it with normal compilation arguments like this it works correctly:
rustc -C lto -O -C target-cpu=haswell e97.rs

If I compile it with -C target-cpu=native or like this, it asserts at run time:
rustc -C lto -O -C target-cpu=haswell e97.rs

thread 'main' panicked at 'assertion failed: `(left == right)`
  left: `9246255105`,
 right: `8739992577`', e97.rs:27:4

The asm compiling with:
rustc -C lto -O --emit asm e97.rs

_ZN3e973e9717h3d13175579ebef2eE:
    pushq   %rsi
    subq    $48, %rsp
    movaps  %xmm6, 32(%rsp)
    movsd   .LCPI1_0(%rip), %xmm0
    movl    $8388608, %esi
    movsd   .LCPI1_1(%rip), %xmm6
    .p2align    4, 0x90
.LBB1_1:
    testl   $7830457, %esi
    mulsd   %xmm0, %xmm0
    je  .LBB1_3
    addsd   %xmm0, %xmm0
.LBB1_3:
    movaps  %xmm6, %xmm1
    callq   fmod
    testl   $15660914, %esi
    mulsd   %xmm0, %xmm0
    je  .LBB1_5
    addsd   %xmm0, %xmm0
.LBB1_5:
    movaps  %xmm6, %xmm1
    callq   fmod
    shrq    $2, %rsi
    jne .LBB1_1
    mulsd   .LCPI1_2(%rip), %xmm0
    addsd   .LCPI1_0(%rip), %xmm0
    movsd   .LCPI1_1(%rip), %xmm1
    callq   fmod
    movsd   .LCPI1_3(%rip), %xmm1
    movapd  %xmm0, %xmm2
    subsd   %xmm1, %xmm2
    cvttsd2si   %xmm2, %rax
    movabsq $-9223372036854775808, %rcx
    xorq    %rax, %rcx
    cvttsd2si   %xmm0, %rax
    ucomisd %xmm1, %xmm0
    cmovaeq %rcx, %rax
    movaps  32(%rsp), %xmm6
    addq    $48, %rsp
    popq    %rsi
    retq

The asm compiling with:
rustc -C lto -O -C target-cpu=haswell --emit asm e97.rs

_ZN3e973e9717h3d13175579ebef2eE:
    .loc    24 0 0 is_stmt 0
    movabsq $9246255105, %rax
    retq
@hanna-kruppe
Copy link
Contributor

Since the haswell version is completely constant folded and the program contains a float->int cast, I suspect #10184 is at fault. No idea why it depends on the target CPU though.

@cuviper
Copy link
Member

cuviper commented Oct 10, 2017

It also works fine without -C lto, still with targeting haswell.

@TimNN TimNN added C-bug Category: This is a bug. and removed C-bug Category: This is a bug. labels Oct 10, 2017
@nagisa
Copy link
Member

nagisa commented Oct 10, 2017

log₂(1E10) is 33 and a little bit, so whatever value it ends up being, it should fit into u64 just fine and therefore no cast-related UB should be happening.

@nagisa nagisa added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Oct 10, 2017
@nagisa
Copy link
Member

nagisa commented Oct 10, 2017

These are results with various target-cpu supported by LLVM:

target-cpu result
amdfam10
athlon
athlon-4
athlon-fx
athlon-mp
athlon-tbird
athlon-xp
athlon64
athlon64-sse3
atom
barcelona
bdver1
bdver2
bdver3
bdver4
bonnell
broadwell
btver1
btver2
c3
c3-2
cannonlake
core-avx-i
core-avx2
core2
corei7
corei7-avx
generic
geode
haswell
i386
i486
i586
i686
ivybridge
k6
k6-2
k6-3
k8
k8-sse3
knl
lakemont
nehalem
nocona
opteron
opteron-sse3
penryn
pentium
pentium-m
pentium-mmx
pentium2
pentium3
pentium3m
pentium4
pentium4m
pentiumpro
prescott
sandybridge
silvermont
skx
skylake
skylake-avx512
slm
westmere
winchip-c6
winchip2
x86-64
yonah
znver1

@cuviper
Copy link
Member

cuviper commented Oct 10, 2017

It's fine with 1.18, but not 1.19, which corresponds to the LLVM-4 update. I also have a RHEL7 machine, with EPEL7's rust-1.20 + llvm3.9, and it works fine. So it seems clearly to be an LLVM regression.

@cuviper
Copy link
Member

cuviper commented Oct 11, 2017

let mut b = 1;
while b < EXP { b <<= 1; }

FWIW, this can be just EXP.next_power_of_two(), but it doesn't affect the end result.

However, adding a println!("{}", b) after initializing b either way does "fix" the issue. The format reference probably inhibits some constant folding.

@alexcrichton
Copy link
Member

Fixed in #47828

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. C-bug Category: This is a bug. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue.
Projects
None yet
Development

No branches or pull requests

6 participants