RFH (request for help): LLVM assertion #10341

Merged: 1 commit, Mar 6, 2015

JeffBezanson
Member

This change is intended to improve type inference in the cases mentioned in #10331. However, it fails at the start of the tests when trying to load a system image without a sys.so:

cp usr/lib/julia/sys.ji local.ji
./julia -J local.ji

...

define %jl_value_t* @jlcall_typeinf_222(%jl_value_t*, %jl_value_t**, i32) {
top:
  %3 = getelementptr %jl_value_t** %1, i64 0
  %4 = load %jl_value_t** %3
  %5 = getelementptr %jl_value_t** %1, i64 1
  %6 = load %jl_value_t** %5
  %7 = getelementptr %jl_value_t** %1, i64 2
  %8 = load %jl_value_t** %7
  %9 = getelementptr %jl_value_t** %1, i64 3
  %10 = load %jl_value_t** %9
  %11 = getelementptr %jl_value_t** %1, i64 4
  %12 = load %jl_value_t** %11
  %13 = bitcast %jl_value_t* %12 to %jl_value_t**
  %14 = getelementptr %jl_value_t** %13, i64 1
  %15 = bitcast %jl_value_t** %14 to i8*
  %16 = load i8* %15
  %17 = trunc i8 %16 to i1
  %18 = call %jl_value_t* @julia_typeinf_222(%jl_value_t* %4, %jl_value_t* %6, %jl_value_t* %8, %jl_value_t* %10, i1 %17)
  ret %jl_value_t* %18
}

julia: /home/jeff/src/julia/deps/llvm-3.3/include/llvm/Support/Casting.h:97: static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::BranchInst, From = llvm::TerminatorInst]: Assertion `Val && "isa<> used on a null pointer"' failed.

With sys.so, this doesn't happen. Stack trace:

#2  0x00007ffff621ba76 in __assert_fail_base (
    fmt=0x7ffff636d2b0 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=assertion@entry=0x7ffff74d8160 "Val && \"isa<> used on a null pointer\"", 
    file=file@entry=0x7ffff74d8118 "/home/jeff/src/julia/deps/llvm-3.3/include/llvm/Support/Casting.h", line=line@entry=97, 
    function=function@entry=0x7ffff76e46c0 <_ZZN4llvm11isa_impl_clINS_10BranchInstEPKNS_14TerminatorInstEE4doitES4_E19__PRETTY_FUNCTION__> "static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::BranchInst, From = llvm::TerminatorInst]") at assert.c:92
#3  0x00007ffff621bb22 in __GI___assert_fail (
    assertion=0x7ffff74d8160 "Val && \"isa<> used on a null pointer\"", 
    file=0x7ffff74d8118 "/home/jeff/src/julia/deps/llvm-3.3/include/llvm/Support/Casting.h", line=97, 
    function=0x7ffff76e46c0 <_ZZN4llvm11isa_impl_clINS_10BranchInstEPKNS_14TerminatorInstEE4doitES4_E19__PRETTY_FUNCTION__> "static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::BranchInst, From = llvm::TerminatorInst]") at assert.c:101
#4  0x00007ffff68a0251 in llvm::enable_if<llvm::is_same<llvm::TerminatorInst, llvm::simplify_type<llvm::TerminatorInst>::SimpleType>, llvm::cast_retty<llvm::BranchInst, llvm::TerminatorInst*>::ret_type>::type llvm::dyn_cast<llvm::BranchInst, llvm::TerminatorInst>(llvm::TerminatorInst*) [clone .part.635] ()
   from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#5  0x00007ffff6f97c5b in (anonymous namespace)::CodeGenPrepare::runOnFunction(llvm::Function&) () from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#6  0x00007ffff73986ef in llvm::FPPassManager::runOnFunction(llvm::Function&)
    () from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#7  0x00007ffff7398829 in llvm::FunctionPassManagerImpl::run(llvm::Function&)
    () from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#8  0x00007ffff7398a2d in llvm::FunctionPassManager::run(llvm::Function&) ()
   from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#9  0x00007ffff6d9742e in llvm::JIT::jitTheFunction(llvm::Function*, llvm::MutexGuard const&) () from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#10 0x00007ffff6d97b4d in llvm::JIT::runJITOnFunctionUnlocked(llvm::Function*, llvm::MutexGuard const&) ()
   from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#11 0x00007ffff6d97df7 in llvm::JIT::getPointerToFunction(llvm::Function*) ()
   from /home/jeff/src/julia/usr/bin/../lib/libjulia-debug.so
#12 0x00007ffff68f6031 in jl_generate_fptr (f=0x7ffdf49595e0)
    at codegen.cpp:705
#13 0x00007ffff68e6dcf in jl_trampoline_compile_function (f=0x7ffdf49595e0, 
    always_infer=0, sig=0x7ffdf39e4000) at builtins.c:920
#14 0x00007ffff68e6ee8 in jl_trampoline (F=0x7ffdf49595e0, 
    args=0x7fffffff43e8, nargs=5) at builtins.c:931

I'm not sure what's going on. I could use some help debugging this (cc @vtjnash, @Keno).

@Keno
Member

Keno commented Feb 26, 2015

For some reason there's a dead basic block:

define %jl_value_t* @"julia_setindex!_96"(%jl_value_t*, %jl_value_t*, %UnitRange) {
top:
  %I = alloca %UnitRange, !dbg !788, !julia_type !791
  %3 = alloca i64, !dbg !788
  %4 = alloca %jl_value_t*, i32 2, !dbg !788
  %5 = getelementptr %jl_value_t** %4, i32 2, !dbg !788
  %6 = getelementptr %jl_value_t** %4, i32 0, !dbg !788
  %7 = bitcast %jl_value_t** %6 to i64*, !dbg !788
  store i64 0, i64* %7, !dbg !788
  %8 = getelementptr %jl_value_t** %4, i32 1, !dbg !788
  %9 = bitcast %jl_value_t** %8 to %jl_value_t***, !dbg !788
  %10 = load %jl_value_t*** @jl_pgcstack, !dbg !788
  store %jl_value_t** %10, %jl_value_t*** %9, !dbg !788
  store %jl_value_t** %4, %jl_value_t*** @jl_pgcstack, !dbg !788
  %11 = select i1 true, %UnitRange %2, %UnitRange %2, !dbg !788, !julia_type !791
  store %UnitRange %11, %UnitRange* %I, !dbg !788
  %12 = bitcast %jl_value_t* %1 to %jl_array_t*, !dbg !792
  %13 = getelementptr inbounds %jl_array_t* %12, i32 0, i32 2, !dbg !792
  %14 = load i64* %13, !dbg !792, !tbaa %jtbaa_arraylen
  store i64 %14, i64* %3, !dbg !792
  %15 = load %UnitRange* %I, !dbg !792, !julia_type !791
  %16 = extractvalue %UnitRange %15, 1, !dbg !792
  %17 = load %UnitRange* %I, !dbg !792, !julia_type !791
  %18 = extractvalue %UnitRange %17, 0, !dbg !792
  %19 = call { i64, i1 } @llvm.ssub.with.overflow.i64(i64 %16, i64 %18), !dbg !792
  %20 = extractvalue { i64, i1 } %19, 1, !dbg !792
  %21 = load %jl_value_t** @jl_overflow_exception, !dbg !792
  %22 = xor i1 %20, true, !dbg !792
  br i1 %22, label %pass, label %fail, !dbg !792

if:                                               ; preds = %pass2

fail:                                             ; preds = %top
  call void @jl_throw_with_superfluous_argument(%jl_value_t* %21, i32 370), !dbg !792
  unreachable, !dbg !792

pass:                                             ; preds = %top
  %23 = extractvalue { i64, i1 } %19, 0, !dbg !792
  %24 = call { i64, i1 } @llvm.sadd.with.overflow.i64(i64 %23, i64 1), !dbg !792
  %25 = extractvalue { i64, i1 } %24, 1, !dbg !792
  %26 = load %jl_value_t** @jl_overflow_exception, !dbg !792
  %27 = xor i1 %25, true, !dbg !792
  br i1 %27, label %pass2, label %fail1, !dbg !792

fail1:                                            ; preds = %pass
  call void @jl_throw_with_superfluous_argument(%jl_value_t* %26, i32 370), !dbg !792
  unreachable, !dbg !792

pass2:                                            ; preds = %pass
  %28 = extractvalue { i64, i1 } %24, 0, !dbg !792
  %29 = icmp eq i64 %14, %28, !dbg !792
  %30 = xor i1 %29, true, !dbg !792
  %31 = xor i1 %30, true, !dbg !792
  br i1 %31, label %L, label %if, !dbg !792
}
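
The `if:` block in the dump above has a label but no instructions, so it has no terminator. That would be consistent with the assertion in the first comment, where `dyn_cast<BranchInst>` receives a null `TerminatorInst*` inside `CodeGenPrepare`. Below is a minimal sketch of a check for that condition. It is a toy model in plain C, not the LLVM API: `toy_block_t`, `first_unterminated`, and `setindex_blocks` are hypothetical names, and the table only records the labels and last instructions visible in the dump.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a basic block: a label plus the opcode of its last
   instruction. Hypothetical type, not part of LLVM. */
typedef struct {
    const char *label;
    const char *terminator; /* "br", "ret", "unreachable"; NULL if missing */
} toy_block_t;

/* Return the index of the first block without a terminator,
   or -1 if every block is terminated. */
int first_unterminated(const toy_block_t *blocks, int n)
{
    assert(blocks != NULL);
    for (int i = 0; i < n; i++) {
        if (blocks[i].terminator == NULL)
            return i;
    }
    return -1;
}

/* Blocks of julia_setindex!_96, in the order they appear in the dump. */
const toy_block_t setindex_blocks[] = {
    { "top",   "br" },
    { "if",    NULL },          /* empty block: nothing after the label */
    { "fail",  "unreachable" },
    { "pass",  "br" },
    { "fail1", "unreachable" },
    { "pass2", "br" },
};
```

Calling `first_unterminated(setindex_blocks, 6)` returns the index of the `if` entry. A pass that asks such a block for its terminator and then casts the result without a null check would hit the same kind of assertion.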

@Keno
Member

Keno commented Feb 26, 2015

This is actually quite mysterious, because I didn't think there was ever a case where we were relying on that code path.

@vtjnash
Member

vtjnash commented Feb 26, 2015

Without having looked at any runs, perhaps this is related to the calling convention errors associated with invoke? (Uncomment the invoke test in core.jl to observe the failure.)

@JeffBezanson
Member Author

This problem seems to have gone away. My guess is this is due to the refactoring for #10392, since I believe the assertion failure was from the extremely-long typeinf function.

@JeffBezanson
Member Author

Hmm, some random failures. @andreasnoack, is this just bad luck? https://travis-ci.org/JuliaLang/julia/jobs/53271264

@tkelman
Contributor

tkelman commented Mar 6, 2015

I've definitely seen that before elsewhere.

ERROR: LoadError: LoadError: LoadError: test failed: cholfact((A1 + A1') - I) did not throw Base.LinAlg.PosDefException
 in expression: cholfact((A1 + A1') - I)
 in anonymous at task.jl:1375
while loading /tmp/julia/share/julia/test/sparsedir/cholmod.jl, in expression starting on line 313
while loading sparse.jl, in expression starting on line 3
while loading /tmp/julia/share/julia/test/runtests.jl, in expression starting on line 3

JeffBezanson added a commit that referenced this pull request Mar 6, 2015
RFH (request for help): LLVM assertion
JeffBezanson merged commit 05226ca into master Mar 6, 2015
@Keno
Member

Keno commented Mar 7, 2015

I'm still getting the corresponding assertion failure here when running with LLVM 3.7. Will debug.

@Keno
Member

Keno commented Mar 8, 2015

Ok, with my debugging setup finally working again, we have the following chain:

(codegen for typeinf_uncached) -> (codegen for setindex!) -> cache_method -> typeinf -> (call typeinf_uncached)

which causes the problem.
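
One way to read the chain above: by the time `typeinf_uncached` is called, two functions further up the stack are still only partially emitted. The sketch below models that nesting with a counter. It is a toy in plain C under that assumption. All `toy_*` names are hypothetical and none of this is Julia's actual codegen.

```c
#include <assert.h>

/* Number of functions whose code is only partially emitted right now. */
int toy_partially_emitted = 0;
/* Value of toy_partially_emitted observed when the chain loops back. */
int toy_seen_at_reentry = -1;

void toy_call_typeinf_uncached(void)
{
    /* The call needs compiled code while the callers further up the
       stack are still in the middle of being emitted. */
    toy_seen_at_reentry = toy_partially_emitted;
}

void toy_typeinf(void)      { toy_call_typeinf_uncached(); }
void toy_cache_method(void) { toy_typeinf(); }

void toy_codegen_setindex(void)
{
    toy_partially_emitted++;   /* setindex! body is half built */
    toy_cache_method();        /* caching a specialization runs inference */
    toy_partially_emitted--;   /* only now is setindex! complete */
}

void toy_codegen_typeinf_uncached(void)
{
    toy_partially_emitted++;   /* typeinf_uncached body is half built */
    toy_codegen_setindex();
    toy_partially_emitted--;
}

/* Walk the whole chain once and report what the innermost call saw. */
int toy_run_chain(void)
{
    assert(toy_partially_emitted == 0);
    toy_codegen_typeinf_uncached();
    return toy_seen_at_reentry;
}
```

`toy_run_chain()` reports that both functions were incomplete at the point where the chain loops back. Anything that inspects or optimizes those functions at that moment can see a block that has not yet received its terminator.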

@Keno
Member

Keno commented Mar 8, 2015

Proposed fix:

diff --git a/src/gf.c b/src/gf.c
index 8705a24..72eb11b 100644
--- a/src/gf.c
+++ b/src/gf.c
@@ -918,7 +918,9 @@ static jl_function_t *cache_method(jl_methtable_t *mt, jl_tu
         }
         method->linfo->specializations = spe;
         gc_wb(method->linfo, method->linfo->specializations);
-        jl_type_infer(newmeth->linfo, type, method->linfo);
+        if (!jl_in_inference) {
+            jl_type_infer(newmeth->linfo, type, method->linfo);
+        }
     }
     JL_GC_POP();
     return newmeth;

@JeffBezanson does that look reasonable or is there a reason it wasn't checked there before?
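
The guard in the patch can be sketched in isolation as follows. This is a toy model in plain C, assuming the flag stays set for the whole duration of inference. `toy_in_inference` stands in for `jl_in_inference`, and the other `toy_*` names are hypothetical rather than taken from `gf.c`.

```c
#include <assert.h>
#include <stddef.h>

int toy_in_inference = 0;   /* stand-in for jl_in_inference */

typedef struct {
    int inferred;           /* 1 once the toy inference pass has run */
} toy_specialization_t;

/* Toy inference pass: marks the specialization as inferred and keeps
   the flag set while it runs. */
void toy_type_infer(toy_specialization_t *s)
{
    int was_in_inference = toy_in_inference;
    assert(s != NULL);
    toy_in_inference = 1;
    s->inferred = 1;
    toy_in_inference = was_in_inference;
}

/* Mirrors the shape of the patched branch: create the specialization,
   but skip inference if inference is already running. */
toy_specialization_t toy_guarded_cache_method(void)
{
    toy_specialization_t s = { 0 };
    if (!toy_in_inference)
        toy_type_infer(&s);
    return s;
}
```

Outside inference, `toy_guarded_cache_method()` returns a specialization with `inferred == 1`. With `toy_in_inference` set, it returns one with `inferred == 0`, so in this model the guard trades the re-entrant call for a cached entry that has not been through inference.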

@vtjnash
Member

vtjnash commented Mar 8, 2015

Isn't that supposed to be caught further up the chain, so we don't permanently cache non-type-inferred methods?

@vtjnash
Member

vtjnash commented Mar 9, 2015

@Keno, I think you've earned your salary for the week: that patch also seems to fix the LLVM assertion failure that has made me unable to use julia-debug.exe since early January!
