next_back for a Vec is optimized incorrectly on i686 #52026

neivv · 2018-07-03T18:25:04Z

> rustc --version -v
rustc 1.28.0-nightly (e3bf634e0 2018-06-28)
binary: rustc
commit-hash: e3bf634e060bc2f8665878288bcea02008ca346e
commit-date: 2018-06-28
host: i686-pc-windows-msvc
release: 1.28.0-nightly
LLVM version: 6.0

Optimized code of the following program panics if -C codegen-units is not 1.

struct S([u32; 2]);
fn main() {
    let x = vec![S([1, 2])];
    let mut it = x.iter();
    if let Some(_) = it.next_back() {
        assert!(it.next_back().is_none());
    }
}

The text was updated successfully, but these errors were encountered:

neivv · 2018-07-04T11:02:12Z

I bisected the change to be between nightlies rustc 1.24.0-nightly (0077d12 2017-12-14) (good) and rustc 1.24.0-nightly (77efd68 2017-12-15) (bad)

I saw two differences in LLVM IR between good and bad versions:
(Disclaimer: I'm certainly not sure if I missed something else and this is just a red herring)
https://gist.github.com/neivv/c7e536269885d3b3fac8ae1a93086920
Bad one has struct definitions as

%S = type { [0 x i32], [2 x i32], [0 x i32] }
%"alloc::vec::Vec<S>" = type { [0 x i32], { i32*, i32 }, [0 x i32], i32, [0 x i32] }

and good one has

%S = type { [0 x i8], [2 x i32], [0 x i8] }
%"alloc::vec::Vec<S>" = type { [0 x i8], { i32*, i32 }, [0 x i8], i32, [0 x i8] }

bad IR also has an additional load instruction in the assertion panic branch
https://gist.github.com/neivv/c7e536269885d3b3fac8ae1a93086920#file-bad-ll-L496

Mark-Simulacrum · 2018-07-07T20:58:05Z

cc @eddyb

neivv · 2018-07-08T00:23:02Z

So, I think this is an LLVM issue:

I tried to examine how the LLVM IR is optimized, and one transformation is quite.. suspicious
https://gist.github.com/neivv/daf93ad90c0d475c3abb119b1d01c584#file-neat-ll-L136
(The gist has main before-after the suspicious pass)

It's comparing vec's length to 0xe000_0001 - comparing to 0x1 would be correct, but somehow the three highest bits of the constant are set??
Interestingly multiplying 0xe000_0001 by sizeof(S) -- 8, is same as 0x1 * sizeof(S).

I don't know if there's anything that would allow LLVM do that transformation intentionally, but I don't really have any experience with LLVM either.

So I tried to reduce this to something that would reproduce, but modifying that IR to be something that opt accepts and running it through opt -instcombine, just does the expected transform, comparing slice length to 0x1. And since this bug doesn't occur with -C codegen-units=1, rustc --emit llvm-ir doesn't seem to produce anything that would reproduce the bug either.

hanna-kruppe · 2018-07-08T15:03:13Z

I've not examine the IR in detail but that does sound quite suspiciously like something is deeply broken in LLVM. It might be a miscompilation in our build, or a misuse of the APIs when running passes in-process, or a wild write somewhere messing up data structures.

But since that sort of bug is very difficult to diagnose, it would be good to make sure whether we really can't reproduce it outside of rustc. For starters, you could add -print-module-pass to the LLVM arguments when running under -print-after-all. That will make it print the entire module even after function passes such as SimplifyCFG and InstCombine, and should give you something that you can feed into vanilla LLVM 6.0's opt unmodified.

neivv · 2018-07-09T17:46:44Z

Okay, I've got LLVM-only reproduction =)

https://gist.github.com/neivv/ae449690d80e0aad853251caa5a3863b#file-thing-ll
Running this through opt -S -instcombine will cause the odd transformation to length == 0xe000_0001 (-536870911). At least with the opt that is part of Rust's current LLVM.

Haven't tried to reduce from there / find if anything in the input module is actually suspicious yet.

hanna-kruppe · 2018-07-09T18:38:02Z

Godbolt's opt (trunc) reproduces the odd transformation.

neivv · 2018-07-09T21:35:51Z

Figured it out.

I haven't had a chance to test this hypothesis but noticed this while looking around how InstCombine works, and I'm confident that this is it:
https://github.com/llvm-mirror/llvm/blob/ed10960040a110b16e0415cc5ec320247384f721/lib/Transforms/InstCombine/InstCombineCompares.cpp#L528
In this issue, the code is comparing roughly (S *)slice_start == (S *)slice_start + slice_len - 1.
This function is extracting the constant byte offset, -8, and dividing it by 8, sizeof(S). However, before that, compiling to a 32-bit target, the values are masked by PtrSizeMask 0xFFFFFFFF, which discards the sign bit of int64 and gives +4294967288, which divided by 8 is the same 536870911 which ends up being in the result.

Here's also the best I was able to reduce IR while still reproducing the issue.

; ModuleID = 'tmp1-317d481089b8c8fe83113de504472633.rs'
source_filename = "tmp1-317d481089b8c8fe83113de504472633.rs"
target datalayout = "e-m:x-p:32:32-i64:64-f80:32-n8:16:32-a:0:32-S32"
target triple = "i686-pc-windows-msvc"

%S = type { [0 x i32], [2 x i32], [0 x i32] }

; Function Attrs: uwtable
define internal void @_ZN3tmp4main17hcd108164c8e55929E() unnamed_addr {
  %1 = call { [0 x %S]*, i32 } @"_ZN68_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..deref..Deref$GT$5deref17h968a5bfabac70869E"()
  %2 = extractvalue { [0 x %S]*, i32 } %1, 0
  %3 = extractvalue { [0 x %S]*, i32 } %1, 1
  %4 = getelementptr inbounds [0 x %S], [0 x %S]* %2, i32 0, i32 0, i32 0, i32 0
  %5 = getelementptr inbounds [0 x %S], [0 x %S]* %2, i32 0, i32 %3, i32 0, i32 0
  %6 = bitcast i32* %4 to %S*
  %7 = bitcast i32* %5 to %S*
  %8 = getelementptr inbounds %S, %S* %7, i32 -1
  %9 = bitcast %S* %8 to i32*
  %10 = bitcast i32* %9 to %S*
  %11 = icmp eq %S* %10, %6
  br i1 %11, label %12, label %13

; <label>:12:                                     ; preds = %24
  ret void

; <label>:13:                                     ; preds = %12, %50
  ret void
}

declare hidden { [0 x %S]*, i32 } @"_ZN68_$LT$alloc..vec..Vec$LT$T$GT$$u20$as$u20$core..ops..deref..Deref$GT$5deref17h968a5bfabac70869E"() unnamed_addr

hanna-kruppe · 2018-07-09T22:34:58Z

Incredibly work! 👏

I've managed to reduce your test case a bit further. As should be expected from the root cause you identified, changing the pointer size to 64 bit makes the issue disappear. Trying to simplify the types and GEPs further runs into the problem that other (simpler and correct) GEP optimizations fire first.

target datalayout = "p:32:32"

%S = type { [2 x i32] }

define i1 @foo([0 x %S]* %p, i32 %n) {
  %start.cast = bitcast [0 x %S]* %p to %S*
  %end = getelementptr inbounds [0 x %S], [0 x %S]* %p, i32 0, i32 %n, i32 0, i32 0
  %end.cast = bitcast i32* %end to %S*
  %last = getelementptr inbounds %S, %S* %end.cast, i32 -1
  %cmp = icmp eq %S* %last, %start.cast
  ret i1 %cmp
}

Someone with a http://bugs.llvm.org/ account should report this issue upstream (I should really request one soon, but, well, I'm not doing that now).

mati865 · 2018-12-07T10:44:32Z

Opened https://bugs.llvm.org/show_bug.cgi?id=39908

@alexcrichton

Update LLVM submodule Fixes rust-lang#52026. Fixes rust-lang#56618. r? @alexcrichton

kennytm added O-x86 C-bug Category: This is a bug. labels Jul 3, 2018

mati865 mentioned this issue Dec 7, 2018

cargo clippy with clippy-preview crash on x86 Linux rust-lang/rust-clippy#3345

Closed

nikic added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. I-unsound Issue: A soundness hole (worst kind of bug), see: https://en.wikipedia.org/wiki/Soundness labels Dec 7, 2018

This was referenced Dec 13, 2018

Backports rust-lang/llvm#132

Merged

Update LLVM submodule #56781

Merged

Centril added a commit to Centril/rust that referenced this issue Dec 16, 2018

Rollup merge of rust-lang#56781 - nikic:llvm-update, r=alexcrichton

c269468

Update LLVM submodule Fixes rust-lang#52026. Fixes rust-lang#56618. r? @alexcrichton

bors closed this as completed in #56781 Dec 16, 2018

Noratrieb added O-x86_32 Target: x86 processors, 32 bit (like i686-*) (IA-32) and removed O-x86-all labels Oct 25, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

next_back for a Vec is optimized incorrectly on i686 #52026

next_back for a Vec is optimized incorrectly on i686 #52026

neivv commented Jul 3, 2018 •

edited

Loading

neivv commented Jul 4, 2018

Mark-Simulacrum commented Jul 7, 2018

neivv commented Jul 8, 2018 •

edited

Loading

hanna-kruppe commented Jul 8, 2018

neivv commented Jul 9, 2018

hanna-kruppe commented Jul 9, 2018

neivv commented Jul 9, 2018

hanna-kruppe commented Jul 9, 2018 •

edited

Loading

mati865 commented Dec 7, 2018

next_back for a Vec is optimized incorrectly on i686 #52026

next_back for a Vec is optimized incorrectly on i686 #52026

Comments

neivv commented Jul 3, 2018 • edited Loading

neivv commented Jul 4, 2018

Mark-Simulacrum commented Jul 7, 2018

neivv commented Jul 8, 2018 • edited Loading

hanna-kruppe commented Jul 8, 2018

neivv commented Jul 9, 2018

hanna-kruppe commented Jul 9, 2018

neivv commented Jul 9, 2018

hanna-kruppe commented Jul 9, 2018 • edited Loading

mati865 commented Dec 7, 2018

neivv commented Jul 3, 2018 •

edited

Loading

neivv commented Jul 8, 2018 •

edited

Loading

hanna-kruppe commented Jul 9, 2018 •

edited

Loading