Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LLVM 3.9 miss-compiles inlined broadcast for Array & SubArray #19792

Closed
Sacha0 opened this issue Dec 31, 2016 · 21 comments · Fixed by #19810
Closed

LLVM 3.9 miss-compiles inlined broadcast for Array & SubArray #19792

Sacha0 opened this issue Dec 31, 2016 · 21 comments · Fixed by #19810
Assignees
Labels
bug Indicates an unexpected problem or unintended behavior priority This should be addressed urgently
Milestone

Comments

@Sacha0
Copy link
Member

Sacha0 commented Dec 31, 2016

On

julia> versioninfo()
Julia Version 0.6.0-dev.1807
Commit 26c8d85* (2016-12-31 04:12 UTC)
Platform Info:
  OS: macOS (x86_64-apple-darwin15.6.0)
  CPU: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas64_
  LIBM: libopenlibm
  LLVM: libLLVM-3.9.1 (ORCJIT, ivybridge)

The linalg/qr tests fail. The failure reduces to

n = 10;
srand(1234321);
a = randn(n,n)/2;

av = view(a, 1:n - 1, 1:n - 1);
qrav = qrfact(av);
q, r  = qrav[:Q], qrav[:R];
q*r; # fine
av; # fine
(q*r, av); # fine
q*r  av # segfaults
# q*r = av # also segfaults
# q*r - av # also segfaults

e.g.

julia> n = 10;

julia> srand(1234321);

julia> a = randn(n,n)/2;

julia> av = view(a, 1:n - 1, 1:n - 1);

julia> qrav = qrfact(av);

julia> q, r  = qrav[:Q], qrav[:R];

julia> q*r; # fine

julia> av; # fine

julia> (q*r, av); # fine

julia> q*r  av # segfaults

signal (11): Segmentation fault: 11
while loading no file, in expression starting on line 0
_ZN12_GLOBAL__N_113InlineSpiller16postOptimizationEv at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN4llvm12RegAllocBase16postOptimizationEv at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN12_GLOBAL__N_18RAGreedy20runOnMachineFunctionERN4llvm15MachineFunctionE at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN4llvm19MachineFunctionPass13runOnFunctionERNS_8FunctionE at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN4llvm13FPPassManager13runOnFunctionERNS_8FunctionE at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN4llvm13FPPassManager11runOnModuleERNS_6ModuleE at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
_ZN4llvm6legacy15PassManagerImpl3runERNS_6ModuleE at /Users/sacha/pkg/julia/usr/lib//libLLVM.dylib (unknown line)
operator() at /Users/sacha/pkg/julia/src/jitlayers.cpp:436 [inlined]
__invoke<(lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:434:13) &, llvm::Module &> at /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/__functional_base:416 [inlined]
__call<(lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:434:13) &, llvm::Module &> at /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/__functional_base:437 [inlined]
operator() at /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/functional:1437
operator() at /Library/Developer/CommandLineTools/usr/bin/../include/c++/v1/functional:1817 [inlined]
addModuleSet<llvm::SmallVector<std::__1::unique_ptr<llvm::Module, std::__1::default_delete<llvm::Module> >, 1>, llvm::RTDyldMemoryManager *, std::__1::unique_ptr<llvm::orc::LambdaResolver<(lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:530:23), (lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:554:23)>, std::__1::default_delete<llvm::orc::LambdaResolver<(lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:530:23), (lambda at /Users/sacha/pkg/julia/src/jitlayers.cpp:554:23)> > > > at /Users/sacha/pkg/julia/usr/include/llvm/ExecutionEngine/Orc/IRCompileLayer.h:73 [inlined]
addModule at /Users/sacha/pkg/julia/src/jitlayers.cpp:558
jl_add_to_ee at /Users/sacha/pkg/julia/src/jitlayers.cpp:782 [inlined]
jl_finalize_function at /Users/sacha/pkg/julia/src/jitlayers.cpp:793
getAddressForFunction at /Users/sacha/pkg/julia/src/codegen.cpp:1070 [inlined]
jl_generate_fptr at /Users/sacha/pkg/julia/src/codegen.cpp:1193
jl_call_method_internal at /Users/sacha/pkg/julia/src/./julia_internal.h:240
jl_apply_generic at /Users/sacha/pkg/julia/src/gf.c:2196
isapprox at ./linalg/generic.jl:1190
unknown function (ip: 0x3198d4456)
jl_apply_generic at /Users/sacha/pkg/julia/src/gf.c:2196
do_call at /Users/sacha/pkg/julia/src/interpreter.c:75
eval at /Users/sacha/pkg/julia/src/interpreter.c:214
jl_interpret_toplevel_expr at /Users/sacha/pkg/julia/src/interpreter.c:34
jl_toplevel_eval_flex at /Users/sacha/pkg/julia/src/toplevel.c:636
jl_toplevel_eval_in at /Users/sacha/pkg/julia/src/builtins.c:606
eval at ./boot.jl:236
jlcall_eval_18559 at /Users/sacha/pkg/julia/usr/lib/julia/sys.dylib (unknown line)
jl_apply_generic at /Users/sacha/pkg/julia/src/gf.c:2196
eval_user_input at ./REPL.jl:66
unknown function (ip: 0x3198c9a76)
jl_apply_generic at /Users/sacha/pkg/julia/src/gf.c:2196
macro expansion at ./REPL.jl:97 [inlined]
#1 at ./event.jl:66
unknown function (ip: 0x3198c285f)
jl_apply_generic at /Users/sacha/pkg/julia/src/gf.c:2196
jl_apply at /Users/sacha/pkg/julia/src/./julia.h:1388 [inlined]
start_task at /Users/sacha/pkg/julia/src/task.c:261
Allocations: 7033163 (Pool: 7028643; Big: 4520); GC: 15
Segmentation fault: 11

Best!

@tkelman tkelman added the linear algebra Linear algebra label Dec 31, 2016
@tkelman
Copy link
Contributor

tkelman commented Dec 31, 2016

bisected to #19746 - @andreasnoack that assertion failure was real and should not have been merged

@tkelman tkelman added bug Indicates an unexpected problem or unintended behavior and removed linear algebra Linear algebra labels Dec 31, 2016
@tkelman
Copy link
Contributor

tkelman commented Dec 31, 2016

Interestingly this doesn't fail with LLVM 3.7.1 (on Ubuntu 14.04 anyway). Julia codegen bug or upstream llvm bug? If you build with LLVM_ASSERTIONS = 1 you get more info

.../llvm-3.9.1/lib/CodeGen/LiveVariables.cpp:114

MBB != &MF->front() && "Can't find reaching def for virtreg"

cc @vchuravy

@vchuravy vchuravy self-assigned this Dec 31, 2016
@tkelman
Copy link
Contributor

tkelman commented Dec 31, 2016

There's also what seems to be a test-system bug here that JULIA_CPU_CORES=2 usr/bin/julia test/runtests.jl linalg/hessenberg linalg/qr says it passes but JULIA_CPU_CORES=1 usr/bin/julia test/runtests.jl linalg/hessenberg linalg/qr does not. Similar to #19376, and was seen with the broadcast test failure that #16961 introduced and #19745 fixed

@vchuravy
Copy link
Member

I can reproduce this with -O1 on a build with LLVM_ASSERTIONS, but no with -O0. Maybe this is related to #15453?

@vchuravy
Copy link
Member

vchuravy commented Dec 31, 2016

The MachineInstuction the assertion is triggered from is:

(rr) p MI.dump()
  CMP64rr %vreg230, %vreg428, %EFLAGS<imp-def>; GR64:%vreg230,%vreg428 dbg:simdloop.jl:72 @[ broadcast.jl:150 @[ broadcast.jl:145 @[ broadcast.jl:263 @[ broadcast.jl:314 @[ broadcast.jl:454 @[ arraymath.jl:34 ] ] ] ] ] ]

and if I remove @simd from

julia/base/broadcast.jl

Lines 151 to 159 in 26c8d85

@simd for I in iter
# reverse-broadcast the indices
@nexprs $nargs i->(I_i = newindex(I, keep_i, Idefault_i))
# extract array values
@nexprs $nargs i->(@inbounds val_i = _broadcast_getindex(A_i, I_i))
# call the function and store the result
result = @ncall $nargs f val
@inbounds B[I] = result
end
the example works.

I further reduced this to:

Y = view(rand(10, 10), :, :);
X = rand(Float64, 10, 10);
broadcast(-, X, Y); # works
-(X, Y) # fails

@vchuravy
Copy link
Member

Y = view(rand(10, 10), :, :);
X = rand(Float64, 10, 10);
myminus(A, B) = broadcast(-, A, B);
myminus(X, Y)
julia39-dbg: /home/wallnuss/src/julia/deps/srccache/llvm-3.9.1/lib/CodeGen/LiveVariables.cpp:114: void llvm::LiveVariables::MarkVirtRegAliveInBlock(llvm::LiveVariables::VarInfo&, llvm::MachineBasicBlock*, llvm::MachineBasicBlock*, std::vector<llvm::MachineBasicBlock*>&): Assertion `MBB != &MF->front() && "Can't find reaching def for virtreg"' failed.

@vchuravy vchuravy changed the title linalg/qr test failure LLVM 3.9 miss-compiles inlined broadcast. Dec 31, 2016
@vchuravy vchuravy changed the title LLVM 3.9 miss-compiles inlined broadcast. LLVM 3.9 miss-compiles inlined broadcast for Array & SubArray Dec 31, 2016
@vchuravy vchuravy added this to the 0.6.0 milestone Dec 31, 2016
@vchuravy
Copy link
Member

I will be travelling for the first week of January, so I won't have too much time to look into this (sorry about that)

@vtjnash, @JeffBezanson If one of you has time to see if we can work around this from our codegen side, that would be great. I was more looking at this from the LLVM side, but I didn't find anything obvious.

@tkelman
Copy link
Contributor

tkelman commented Dec 31, 2016

This seems buggy enough that we should at least consider dropping back to 3.7 if it can't be solved soon.

@andreasnoack
Copy link
Member

This doesn't segfault on LLVM-svn c828f5b04a33480c4d8c86863a1439df265f2b2b.

@tkelman
Copy link
Contributor

tkelman commented Jan 1, 2017

(reverse-)bisect on llvm to figure out what patch to carry?

@andreasnoack
Copy link
Member

I'll try that

@andreasnoack
Copy link
Member

A familiar name showed up (and bad means good)

99ca52276f9ee1386866d6dff6179cfa64824621 is the first bad commit
commit 99ca52276f9ee1386866d6dff6179cfa64824621
Author: Keno Fischer <kfischer@college.harvard.edu>
Date:   Mon Dec 5 21:25:03 2016 +0000

    [LAA] Prevent invalid IR for loop-invariant bound in loop body

    Summary:
    If LAA expands a bound that is loop invariant, but not hoisted out
    of the loop body, it used to use that value anyway, causing a
    non-domination error, because the memcheck block is of course not
    dominated by the scalar loop body. Detect this situation and expand
    the SCEV expression instead.

    Fixes PR31251

    Reviewers: anemet
    Subscribers: mzolotukhin, llvm-commits

    Differential Revision: https://reviews.llvm.org/D27397

    git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288705 91177308-0d34-0410-b5e6-96231b3b80d8

:040000 040000 74687040ae081d9648605e476296951bbba888b5 89b8d818ed0748ef7474a01ce73d3e847df797ea M  lib
:040000 040000 884e86a25a25239e045977569ecbd328f0fc6119 c64349b6f7c34fd1afae99e5981f0297cc63770e M  test

it can be cherry-picked cleanly from release_39 and fixes the example in this issue.

@tkelman
Copy link
Contributor

tkelman commented Jan 2, 2017

Very nice! Let's add that to our list then.

@Keno
Copy link
Member

Keno commented Jan 2, 2017

We should also pick up 83dc06334ff95ad18a951d0bb540290510f2f81a and whatever the git sha corresponding to https://reviews.llvm.org/rL290260 is.

@tkelman tkelman assigned tkelman and unassigned vchuravy Jan 2, 2017
@tkelman
Copy link
Contributor

tkelman commented Jan 2, 2017

working on a branch (will also need to rebuild mac and windows binaries, if you've got any other changes in mind now's a good time)

tkelman added a commit that referenced this issue Jan 2, 2017
miscompilation of broadcast
@Keno
Copy link
Member

Keno commented Jan 2, 2017

I'm about to reapply a fixed version of https://reviews.llvm.org/D21731. We'll probably want to replace our existing patch by that once that's done.

@tkelman
Copy link
Contributor

tkelman commented Jan 2, 2017

Could you also take a look at #19803 ? That only seems to be happening on 32 bit x86 and with LLVM assertions enabled. If it helps make reproducing faster I can put up a docker image that has deps prebuilt for it.

@Keno
Copy link
Member

Keno commented Jan 2, 2017

Yes, that would be helpful.

@tkelman
Copy link
Contributor

tkelman commented Jan 2, 2017

We'll probably want to replace our existing patch by that once that's done.

bf92182 if it sticks without getting reverted upstream

edit: needs some conflicts resolved to apply against 3.9.1, I'm inclined to leave the patch as it is for now if I don't hear otherwise?

tkelman added a commit that referenced this issue Jan 2, 2017
miscompilation of broadcast
@JeffBezanson JeffBezanson added the priority This should be addressed urgently label Jan 3, 2017
tkelman added a commit that referenced this issue Jan 4, 2017
andyferris pushed a commit to andyferris/julia that referenced this issue Jan 4, 2017
@peirongf
Copy link

I am using llvm 4.0, to build Halide project. Halide is with tag release_2017_05_03.
It failed with following error information:

bin/gaussian5x5_generator -o ./bin/arm-64-android -e o,h -f gaussian5x5_hvx128 target=arm-64-android-hvx_128
gaussian5x5_generator: /llvm-4.0.0.src/lib/CodeGen/LiveVariables.cpp:114: void llvm::LiveVariables::MarkVirtRegAliveInBlock(llvm::LiveVariables::VarInfo&, llvm::MachineBasicBlock*, llvm::MachineBasicBlock*, std::vectorllvm::MachineBasicBlock*&): Assertion `MBB != &MF->front() && "Can't find reaching def for virtreg"' failed.
Makefile:91: recipe for target 'bin/arm-64-android/gaussian5x5_hvx128.o' failed
make: *** [bin/arm-64-android/gaussian5x5_hvx128.o] Aborted (core dumped)

@tkelman
Copy link
Contributor

tkelman commented Jun 20, 2017

Report that to Halide please, it's not a Julia issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Indicates an unexpected problem or unintended behavior priority This should be addressed urgently
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants