Crashes in calls of functions generated with Symbolics.build_function #46043

Open
lassepe opened this issue Jul 14, 2022 · 2 comments
lassepe commented Jul 14, 2022

Note: Reposting here since I suspect that this is a Julia bug rather than an issue of Symbolics.jl

In some of my research code, I am seeing sporadic crashes when calling functions generated by Symbolics.build_function. The crash is an assertion failure in codegen.cpp:

julia: /buildworker/worker/package_linux64/build/src/codegen.cpp:3635: jl_cgval_t emit_invoke(jl_codectx_t&, const jl_cgval_t&, const jl_cgval_t*, size_t, jl_value_t*): Assertion `(((jl_value_t*)(((jl_taggedvalue_t*)((char*)(mi) - sizeof(jl_taggedvalue_t)))->header & ~(uintptr_t)15))==(jl_value_t*)(jl_method_instance_type))' failed.

This is the expanded form of the following check:

assert(jl_is_method_instance(mi));
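
As I read the expanded expression, it loads the header word stored just before the object, clears the low four bits with `& ~(uintptr_t)15`, and compares the result against the address of `jl_method_instance_type`. A minimal sketch of that masking arithmetic in Julia, assuming the low four bits hold tag bits; the addresses and the helper name `has_type_tag` are hypothetical and chosen only for illustration:

```julia
# Hypothetical values, chosen only to illustrate the masking arithmetic.
type_ptr = UInt64(0x00007f0000001230)  # 16-byte aligned, so the low four bits are zero
tag_bits = UInt64(0x3)                 # hypothetical tag bits stored in the low nibble
header = type_ptr | tag_bits

# Clearing the low four bits recovers the type pointer from the header word.
recovered = header & ~UInt64(15)
@assert recovered == type_ptr

# Hypothetical helper mirroring the comparison made by the assertion.
has_type_tag(header, expected) = (header & ~UInt64(15)) == expected
```

The assertion fails when the masked header of `mi` points at some type other than the method instance type, that is, when `emit_invoke` is handed an object that is not a method instance.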

Full backtrace captured with gdb:
julia: /buildworker/worker/package_linux64/build/src/codegen.cpp:3635: jl_cgval_t emit_invoke(jl_codectx_t&, const jl_cgval_t&, const jl_cgval_t*, size_t, jl_value_t*): Assertion `(((jl_value_t*)(((jl_taggedvalue_t*)((char*)(mi) - sizeof(jl_taggedvalue_t)))->header & ~(uintptr_t)15))==(jl_value_t*)(jl_method_instance_type))' failed.

Thread 1 "julia" received signal SIGABRT, Aborted.
__GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
50	../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
(gdb) backtrace 
#0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00007ffff7d8d859 in __GI_abort () at abort.c:79
#2  0x00007ffff7d8d729 in __assert_fail_base (fmt=0x7ffff7f23588 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", 
    assertion=0x7ffff70f5200 "(((jl_value_t*)(((jl_taggedvalue_t*)((char*)(mi) - sizeof(jl_taggedvalue_t)))->header & ~(uintptr_t)15))==(jl_value_t*)(jl_method_instance_type))", 
    file=0x7ffff70f3c48 "/buildworker/worker/package_linux64/build/src/codegen.cpp", line=3635, 
    function=<optimized out>) at assert.c:92
#3  0x00007ffff7d9efd6 in __GI___assert_fail (
    assertion=assertion@entry=0x7ffff70f5200 "(((jl_value_t*)(((jl_taggedvalue_t*)((char*)(mi) - sizeof(jl_taggedvalue_t)))->header & ~(uintptr_t)15))==(jl_value_t*)(jl_method_instance_type))", 
    file=file@entry=0x7ffff70f3c48 "/buildworker/worker/package_linux64/build/src/codegen.cpp", 
    line=line@entry=3635, 
    function=function@entry=0x7ffff70febe0 <emit_invoke(jl_codectx_t&, jl_cgval_t const&, jl_cgval_t const*, unsigned long, _jl_value_t*)::__PRETTY_FUNCTION__> "jl_cgval_t emit_invoke(jl_codectx_t&, const jl_cgval_t&, const jl_cgval_t*, size_t, jl_value_t*)") at assert.c:101
#4  0x00007ffff6f4bf90 in emit_invoke (ctx=..., lival=..., argv=argv@entry=0x7fffffff6f40, nargs=nargs@entry=4, 
    rt=rt@entry=0x7fffe33a8160 <jl_system_image_data+1306272>)
    at /buildworker/worker/package_linux64/build/src/codegen.cpp:3635
#5  0x00007ffff6f751a7 in emit_invoke (ctx=..., rt=0x7fffe33a8160 <jl_system_image_data+1306272>, 
    ex=<optimized out>) at /buildworker/worker/package_linux64/build/src/codegen.cpp:3626
#6  0x00007ffff6f6ec85 in emit_expr (ctx=..., expr=expr@entry=0x7ffed3cb72b0, ssaval=ssaval@entry=2)
    at /buildworker/worker/package_linux64/build/src/codegen.cpp:4585
#7  0x00007ffff6f77e55 in emit_ssaval_assign (ctx=..., idx=idx@entry=2, r=r@entry=0x7ffed3cb72b0)
    at /buildworker/worker/package_linux64/build/src/codegen.cpp:4245
#8  0x00007ffff6f67cca in emit_stmtpos (ssaval_result=2, expr=0x7ffed3cb72b0, ctx=...)
    at /buildworker/worker/package_linux64/build/src/codegen.cpp:4487
#9  emit_function (lam=lam@entry=0x7ffeeb6d3210, src=src@entry=0x7fff54adfc90, 
    jlrettype=jlrettype@entry=0x7fffe33a8160 <jl_system_image_data+1306272>, params=..., 
    vaOverride=vaOverride@entry=false) at /buildworker/worker/package_linux64/build/src/codegen.cpp:7326
#10 0x00007ffff6f7f2b9 in jl_emit_code (li=0x7ffeeb6d3210, src=0x7fff54adfc90, 
    jlrettype=0x7fffe33a8160 <jl_system_image_data+1306272>, params=...)
    at /buildworker/worker/package_linux64/build/src/codegen.cpp:7688
#11 0x00007ffff6f7f721 in jl_emit_codeinst (codeinst=codeinst@entry=0x7ffee5832b30, src=<optimized out>, 
    src@entry=0x7fff54adfc90, params=...) at /buildworker/worker/package_linux64/build/src/codegen.cpp:7733
#12 0x00007ffff7033950 in _jl_compile_codeinst (codeinst=codeinst@entry=0x7ffee5832b30, src=0x7fff54adfc90, 
    world=world@entry=31858) at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:124
#13 0x00007ffff7035082 in jl_generate_fptr (mi=mi@entry=0x7ffeeb6d3210, world=world@entry=31858)
    at /buildworker/worker/package_linux64/build/src/jitlayers.cpp:350
#14 0x00007ffff6fa62dd in jl_compile_method_internal (mi=mi@entry=0x7ffeeb6d3210, world=world@entry=31858)
    at /buildworker/worker/package_linux64/build/src/gf.c:1980
#15 0x00007ffff6fa6c33 in jl_compile_method_internal (world=31858, mi=0x7ffeeb6d3210)
    at /buildworker/worker/package_linux64/build/src/gf.c:2246
#16 _jl_invoke (world=31858, mfunc=0x7ffeeb6d3210, nargs=3, args=0x7fffffff9a60, F=0x7ffee9cbee48)
    at /buildworker/worker/package_linux64/build/src/gf.c:2239
#17 jl_invoke (F=0x7ffee9cbee48, args=0x7fffffff9a60, nargs=3, mfunc=0x7ffeeb6d3210)
    at /buildworker/worker/package_linux64/build/src/gf.c:2254

It does not happen every time but the frequency seems to correlate with:

  • the "size" of the vector-valued function (happens more often for large outputs)
  • re-generating the function via Revise.jl (though I have also seen this error on the first try without any Revise action)

Unfortunately, I have thus far been unable to create a compact reproducer that does not involve a ton of my research code. However, I have a setup to reproduce this issue locally with gdb and am happy to provide more information if necessary.

The following code snippet reliably causes this issue for me (compilation takes a while):

using Symbolics
using SparseArrays

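function main(; n=10000, m=10000, nsp=10)
    # n scalar symbolic variables x[1], ..., x[n]
    x = begin
        @variables(x[1:n],) |> only |> Symbolics.scalarize
    end

    # m outputs, each a sum of squares of nsp randomly chosen variables
    f = map(1:m) do _
        ind = rand(eachindex(x), nsp)
        sum(x -> x^2, x[ind])
    end

    # symbolic sparse Jacobian and its nonzero pattern
    J = Symbolics.sparsejacobian(f, x)
    (J_rows, J_cols, J_vals) = findnz(J)

    # [2] selects the in-place variant; expression=Val{false} returns a callable function
    J_vals_fn! = Symbolics.build_function(J_vals, x; expression=Val{false})[2]
    sparse_J = (; rows=J_rows, cols=J_cols, (vals_fn!)=J_vals_fn!)

    # calling the generated function is where the crash occurs
    result = zeros(length(sparse_J.rows))
    input = rand(length(x))
    sparse_J.vals_fn!(result, input)
    result
end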
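
Because the crash aborts the whole process and does not occur on every run, it can help to measure how often it happens by launching the reproducer in fresh Julia processes. The following is a minimal sketch, not part of the original report; the function name `count_crashes` and the file name `reproducer.jl` are hypothetical, and the sketch assumes the snippet above is saved to that file with a trailing `main()` call.

```julia
# Run `cmd` in a fresh process `trials` times and count abnormal exits.
function count_crashes(cmd::Cmd; trials::Integer=5)
    crashes = 0
    for _ in 1:trials
        # ignorestatus keeps `run` from throwing on a nonzero exit or a signal
        proc = run(ignorestatus(cmd))
        # success is false for a nonzero exit code or termination by a signal
        success(proc) || (crashes += 1)
    end
    return crashes
end

# Example call (hypothetical file name):
# count_crashes(`$(Base.julia_cmd()) --project reproducer.jl`; trials=20)
```

Running each trial in its own process keeps an abort in one trial from taking down the driver, and every trial starts from a clean compilation state.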

Version Info

  • Julia 1.7.3 and 1.8.0-rc1
  • Symbolics 4.9

Unfortunately, I cannot instantiate the project that triggers this with Julia nightly since some of my dependencies don't play nicely with nightly yet. If I find a reproducer without those dependencies, I will post here again.

Interestingly, I have so far been unable to reproduce the problem on nightly (see the git bisect comment below).

lassepe commented Jul 15, 2022

I just did a git bisect and the problem seems to have been (indirectly?) fixed in 3d787a7. I'm not sure whether the problem is actually fixed or only appears with a very low probability now. Any thoughts on this, @Keno?

lassepe commented Jul 17, 2022

Here is an rr trace of that segfault (recorded with Julia 1.8.0-rc3):
https://s3.amazonaws.com/julialang-dumps/reports/2022-07-17T17-59-42-lassepe.tar.zst

lassepe changed the title from "Sporatic crashes in calls of functions generated with Symbolics.build_function" to "Crashes in calls of functions generated with Symbolics.build_function" on Jul 22, 2022