cygwin build produces SIGSEGV at startup #16921

Closed
josefsachsconning opened this issue Jun 14, 2016 · 15 comments
Labels
system:windows Affects only Windows

@josefsachsconning
Contributor

Running the master branch at dab5c8d.
My last successful build was 0.5.0-dev+4707 at 2119ea6.

HAW7L0605$ usr/bin/julia

signal (11): SIGSEGV
while loading no file, in expression starting on line 0
crt_sig_handler at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\signals-win.c:87
_gnu_exception_handler at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crt_handler.c:223
unknown function (ip: 00000000773678C7)
unknown function (ip: 0000000077377E8C)
unknown function (ip: 00000000773684CE)
unknown function (ip: 000000007739BAC7)
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3994
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
jl_init_restored_modules at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\dump.c:1884
_julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\init.c:710
julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\task.c:277
wmain at /home/s2sajs/julia-master/ui\repl.c:677
__tmainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:329
mainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:212
unknown function (ip: 00000000772459BC)
unknown function (ip: 000000007737A2E0)
Allocations: 833057 (Pool: 832340; Big: 717); GC: 0
@pfitzseb
Member

Same error for today's Windows nightly binaries.

@tkelman added the system:windows (Affects only Windows) label Jun 14, 2016
@tkelman
Contributor

tkelman commented Jun 14, 2016

Probably caused by #16907.

@tkelman
Contributor

tkelman commented Jun 14, 2016

Confirmed by bisect: the first bad commit is e24fec2. cc @JeffBezanson

$ git bisect bad
e24fec28f4d431417c1a70c9535b0279058d6971 is the first bad commit
commit e24fec28f4d431417c1a70c9535b0279058d6971
Author: Jeff Bezanson <jeff.bezanson@gmail.com>
Date:   Mon Jun 13 11:18:00 2016 -0400

    delete IR for non-inlineable functions after codegen to save memory

:040000 040000 ec361ee16d12300ac7c13a17c9910e3a1597d58c 99a06c4a9790dec2b6ad3542e8f13766f9867083 M        base
:040000 040000 18eaab07e34fcfe25fca727299e6a512a4232be4 d62e9a2e2e3baf68504069c47759b515c84fc6de M        src

Tony@LAPTOP-O230JCFF ~/julia32
$ git bisect log
git bisect start
# good: [466da651c058b2a0d55cc42efe6c7d4c5ee8270a] Merge pull request #16854 from JuliaLang/jn/dep-cmdlineargs
git bisect good 466da651c058b2a0d55cc42efe6c7d4c5ee8270a
# bad: [b848fd8e3f142e592602c26a703dca144141d74a] Merge pull request #16858 from JuliaLang/tk/urlbackslash
git bisect bad b848fd8e3f142e592602c26a703dca144141d74a
# bad: [4c1af23951c4184f36a8fef964191261f9fe42a1] Merge pull request #16907 from JuliaLang/jb/delete_non_inlineable
git bisect bad 4c1af23951c4184f36a8fef964191261f9fe42a1
# good: [cd4f694aaf40367d71d1a136b2f1b40e7957cf93] Add GC_OLD and GC_OLD_MARKED
git bisect good cd4f694aaf40367d71d1a136b2f1b40e7957cf93
# good: [03adf4ba4f2db8b833d6b5deacf5dfd5513554f9] Merge pull request #16903 from zhmz90/RemoveNewline
git bisect good 03adf4ba4f2db8b833d6b5deacf5dfd5513554f9
# good: [8aa1d5ca7a2c05e3b28423b5c811443781286503] Merge pull request #16900 from JuliaLang/jn/grabbag4
git bisect good 8aa1d5ca7a2c05e3b28423b5c811443781286503
# good: [a84f702a30c88292d1a881f849c3e1306bfe3b71] Merge pull request #16898 from JuliaLang/jn/incremental-datatype-precompile
git bisect good a84f702a30c88292d1a881f849c3e1306bfe3b71
# bad: [e24fec28f4d431417c1a70c9535b0279058d6971] delete IR for non-inlineable functions after codegen to save memory
git bisect bad e24fec28f4d431417c1a70c9535b0279058d6971
# first bad commit: [e24fec28f4d431417c1a70c9535b0279058d6971] delete IR for non-inlineable functions after codegen to save memory

I don't know why AppVeyor didn't hit this.

@tkelman
Contributor

tkelman commented Jun 14, 2016

Backtrace (it fails an assertion when run with julia-debug):

(gdb) b _wassert
Breakpoint 1 at 0x41edb0: file /usr/src/debug/mingw64-i686-runtime-4.0.6-1/misc/wassert.c, line 37.
(gdb) r
Starting program: /cygdrive/c/cygwin64/home/tony/julia32/usr/bin/julia-debug.exe --check-bounds=yes --startup-file=no test/runtests.jl core
[New Thread 5328.0x3148]
[New Thread 5328.0x25d8]
[New Thread 5328.0x1590]
[New Thread 5328.0x578]

Breakpoint 1, _wassert (
    _Message=_Message@entry=0x284bee40 L"jl_is_array(data)",
    _File=_File@entry=0x284bee80 L"/home/Tony/julia32/src/dump.c",
    _Line=_Line@entry=2128)
    at /usr/src/debug/mingw64-i686-runtime-4.0.6-1/misc/wassert.c:37
37      /usr/src/debug/mingw64-i686-runtime-4.0.6-1/misc/wassert.c: No such file or directory.
(gdb) bt
#0  _wassert (_Message=_Message@entry=0x284bee40 L"jl_is_array(data)",
    _File=_File@entry=0x284bee80 L"/home/Tony/julia32/src/dump.c",
    _Line=_Line@entry=2128)
    at /usr/src/debug/mingw64-i686-runtime-4.0.6-1/misc/wassert.c:37
#1  0x66eca9e9 in _assert (_Message=0x66fa058e <BOM+1216> "jl_is_array(data)",
    _File=0x66f9ff9c <BackRef_tag+508> "/home/Tony/julia32/src/dump.c",
    _Line=2128)
    at /usr/src/debug/mingw64-i686-runtime-4.0.6-1/misc/wassert.c:30
#2  0x66e03323 in jl_uncompress_ast (li=0x866e900, data=0x7720010)
    at /home/Tony/julia32/src/dump.c:2128
#3  0x66e62231 in emit_function (lam=0x866e900, declarations=0x866e93c)
    at /home/Tony/julia32/src/codegen.cpp:3971
#4  0x66e3ae0a in to_function (li=0x866e900)
    at /home/Tony/julia32/src/codegen.cpp:832
#5  0x66e3b7b8 in jl_compile_linfo (li=0x866e900)
    at /home/Tony/julia32/src/codegen.cpp:1032
#6  0x66dd0f19 in jl_call_method_internal (meth=0x866e900, args=0xc5fb88,
    nargs=1) at /home/Tony/julia32/src/julia_internal.h:88
#7  0x66dd6252 in jl_apply_generic (args=0xc5fb88, nargs=1)
    at /home/Tony/julia32/src/gf.c:1774
#8  0x66de6a62 in jl_apply (args=0xc5fb88, nargs=1)
    at /home/Tony/julia32/src/julia.h:1384
#9  0x66de8f25 in jl_module_run_initializer (m=0x7bf31f0)
    at /home/Tony/julia32/src/module.c:594
#10 0x66e026ff in jl_init_restored_modules (init_order=0xa8b2650)
    at /home/Tony/julia32/src/dump.c:1884
#11 0x66df5948 in _julia_init (rel=JL_IMAGE_JULIA_HOME)
    at /home/Tony/julia32/src/init.c:710
#12 0x66df6f89 in julia_init (rel=JL_IMAGE_JULIA_HOME)
    at /home/Tony/julia32/src/task.c:277
#13 0x00402d76 in wmain (argc=2, argv=0xd2d0a4, envp=0x3d71c48)
    at /home/Tony/julia32/ui/repl.c:677
#14 0x00401400 in __tmainCRTStartup ()
    at /usr/src/debug/mingw64-i686-runtime-4.0.6-1/crt/crtexe.c:329
#15 0x73ea62c4 in KERNEL32!BaseThreadInitThunk ()
   from /cygdrive/c/WINDOWS/System32/KERNEL32.DLL
#16 0x77322700 in ntdll!RtlSetCriticalSectionSpinCount ()
   from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#17 0x773226cb in ntdll!RtlSetCriticalSectionSpinCount ()
   from /cygdrive/c/WINDOWS/SYSTEM32/ntdll.dll
#18 0x00000000 in ?? ()

@JeffBezanson
Member

I have 2 candidate patches to try.

First patch:

--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -3967,6 +3967,11 @@ static std::unique_ptr<Module> emit_function(jl_lambda_info_t *lam, jl_llvm_func
     // step 1. unpack AST and allocate codegen context for this function
     jl_array_t *code = (jl_array_t*)lam->code;
     JL_GC_PUSH1(&code);
+    if ((jl_value_t*)code == jl_nothing) {
+        jl_type_infer(lam, 0);
+        code = (jl_array_t*)lam->code;
+        assert((jl_value_t*)code != jl_nothing);
+    }
     if (!jl_typeis(code,jl_array_any_type))
         code = jl_uncompress_ast(lam, code);
     //jl_static_show(JL_STDOUT, (jl_value_t*)ast);

Second patch:

--- a/src/codegen.cpp
+++ b/src/codegen.cpp
@@ -874,16 +874,6 @@ static void to_function(jl_lambda_info_t *li)
     // mark the pointer calling convention
     li->jlcall_api = (f->getFunctionType() == jl_func_sig ? 0 : 1);

-    // if not inlineable, code won't be needed again
-    if (JL_DELETE_NON_INLINEABLE &&
-        li->def && li->inferred && !li->inlineable && !jl_options.outputji) {
-        li->code = jl_nothing;
-        li->slottypes = jl_nothing;
-        li->ssavaluetypes = jl_box_long(jl_array_len(li->ssavaluetypes)); jl_gc_wb(li, li->ssavaluetypes);
-        li->slotflags = NULL;
-        li->slotnames = NULL;
-    }
-
     // done compiling: restore global state
     if (old != NULL) {
         builder.SetInsertPoint(old);
@@ -1018,6 +1008,15 @@ extern "C" void jl_generate_fptr(jl_lambda_info_t *li)
     if (li->fptr == NULL) {
         li->fptr = (jl_fptr_t)getAddressForFunction((Function*)li->functionObjectsDecls.functionObject);
         assert(li->fptr != NULL);
+        // if not inlineable, code won't be needed again
+        if (JL_DELETE_NON_INLINEABLE &&
+            li->def && li->inferred && !li->inlineable && !jl_options.outputji) {
+            li->code = jl_nothing;
+            li->slottypes = jl_nothing;
+            li->ssavaluetypes = jl_box_long(jl_array_len(li->ssavaluetypes)); jl_gc_wb(li, li->ssavaluetypes);
+            li->slotflags = NULL;
+            li->slotnames = NULL;
+        }
     }
     JL_UNLOCK(&codegen_lock); // Might GC
 }

Could somebody try these?

@vtjnash I don't fully understand what's going on here. I can see why the second patch might be necessary, but it seems to significantly negate the memory savings. That says to me that I'm deleting IR that we need, but in that case why isn't the failure more widespread?
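
To make the difference between the two patches concrete, here is a minimal toy model in C. It is a sketch only: `toy_linfo`, `toy_policy`, and every function below are hypothetical stand-ins rather than Julia runtime APIs, and the "codegen requested twice before the native pointer exists" scenario is an assumption chosen for illustration, not a diagnosis of this crash. It shows only that deleting the IR right after codegen (the current placement in `to_function`) leaves nothing for a second codegen request, while deferring deletion until the native pointer is generated (the placement in the second patch) keeps the IR available for longer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for jl_lambda_info_t; not the real struct. */
typedef struct {
    const char *code;   /* IR; NULL models li->code = jl_nothing */
    bool has_fptr;      /* native function pointer has been generated */
    bool inlineable;    /* inlineable functions always keep their IR */
} toy_linfo;

typedef enum { DELETE_AFTER_CODEGEN, DELETE_AFTER_FPTR } toy_policy;

/* Models to_function: needs IR; returns false if the IR is already gone. */
bool toy_to_function(toy_linfo *li, toy_policy policy) {
    if (li->code == NULL)
        return false;
    if (policy == DELETE_AFTER_CODEGEN && !li->inlineable)
        li->code = NULL;   /* current placement of the deletion */
    return true;
}

/* Models jl_generate_fptr: produces the native pointer. */
void toy_generate_fptr(toy_linfo *li, toy_policy policy) {
    li->has_fptr = true;
    if (policy == DELETE_AFTER_FPTR && !li->inlineable)
        li->code = NULL;   /* placement used by the second patch */
}

/* Requests codegen twice before any native pointer exists and reports
 * whether the second request still found IR to work with. */
bool toy_second_codegen_ok(toy_policy policy) {
    toy_linfo li = { "ir", false, false };
    bool first = toy_to_function(&li, policy);
    assert(first);
    (void)first;
    return toy_to_function(&li, policy);
}

/* Runs codegen and then pointer generation; reports whether IR remains. */
bool toy_ir_present_after_fptr(toy_policy policy) {
    toy_linfo li = { "ir", false, false };
    toy_to_function(&li, policy);
    toy_generate_fptr(&li, policy);
    return li.code != NULL;
}
```

In this model `toy_second_codegen_ok(DELETE_AFTER_CODEGEN)` is false and `toy_second_codegen_ok(DELETE_AFTER_FPTR)` is true, while both policies end up with the IR deleted once `toy_generate_fptr` has run.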

@tkelman
Contributor

tkelman commented Jun 14, 2016

The first patch is making the linker take an inordinate amount of time creating sys.dll... will try the second instead.

Edit: the first patch fails the added assertion:

Assertion failed!

Program: C:\cygwin64\home\Tony\julia\usr\bin\julia.exe
File: /home/Tony/julia/src/codegen.cpp, Line 3973

Expression: (jl_value_t*)code != jl_nothing

@josefsachsconning
Contributor Author

I haven't previously mentioned it, but it is my impression that ld.exe has been taking a lot longer for a while, even before this patch.

@tkelman
Contributor

tkelman commented Jun 14, 2016

The change to linking LLVM as a DLL is possibly to blame there; it looks like it has nothing to do with the patch.

@tkelman
Contributor

tkelman commented Jun 14, 2016

I get an access violation with the second patch:

    JULIA test/core

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x620db3f1 -- emit_function at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3993
while loading no file, in expression starting on line 0
emit_function at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3993
to_function at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:832
jl_compile_linfo at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:1031
jl_get_specialization1 at /home/Tony/julia32/src/home/Tony/julia32/src\gf.c:1097
emit_call at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:2752
emit_expr at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3203
emit_call_function_object at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:2672
emit_call at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:2773
emit_expr at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3203
emit_condition at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3063
emit_expr at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3191
emit_stmtpos at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:3098
emit_function at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:4790
to_function at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:832
jl_compile_linfo at /home/Tony/julia32/src/home/Tony/julia32/src\codegen.cpp:1031
jl_call_method_internal at /home/Tony/julia32/src/home/Tony/julia32/src\julia_internal.h:88
jl_apply at /home/Tony/julia32/src/home/Tony/julia32/src\julia.h:1384
jl_init_restored_modules at /home/Tony/julia32/src/home/Tony/julia32/src\dump.c:1884
_julia_init at /home/Tony/julia32/src/home/Tony/julia32/src\init.c:710
julia_init at /home/Tony/julia32/src/home/Tony/julia32/src\task.c:277
wmain at /home/Tony/julia32/ui\repl.c:677
__tmainCRTStartup at /usr/src/debug/mingw64-i686-runtime-4.0.6-1/crt\crtexe.c:329
unknown function (ip: 73EA62C3)
unknown function (ip: 773226FF)
unknown function (ip: 773226CA)
Allocations: 870203 (Pool: 869422; Big: 781); GC: 1

@josefsachsconning
Contributor Author

The first patch gives me an access violation:

HAW7L0605$ usr/bin/julia.exe

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6909ceea -- emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3999
while loading no file, in expression starting on line 0
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3999
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
typeinf_edge at .\inference.jl:1510
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
typeinf_ext at .\inference.jl:1554
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3971
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
jl_init_restored_modules at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\dump.c:1884
_julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\init.c:710
julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\task.c:277
wmain at /home/s2sajs/julia-master/ui\repl.c:677
__tmainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:329
mainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:212
unknown function (ip: 00000000772459BC)
unknown function (ip: 000000007737A2E0)
Allocations: 833858 (Pool: 833130; Big: 728); GC: 1

Please submit a bug report with steps to reproduce this fault, and any error messages that follow (in their entirety). Thanks.
Exception: EXCEPTION_ACCESS_VIOLATION at 0x6909ceea -- emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3999
while loading no file, in expression starting on line 0
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3999
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
unshare_linfo! at .\inference.jl:1400
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
typeinf_edge at .\inference.jl:1510
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
typeinf_ext at .\inference.jl:1554
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3971
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
jl_exit at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\jl_uv.c:534
_exception_handler at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\signals-win.c:270
unknown function (ip: 0000000077377E8C)
unknown function (ip: 00000000773684CE)
unknown function (ip: 000000007739BAC7)
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3999
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
typeinf_edge at .\inference.jl:1510
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
typeinf_ext at .\inference.jl:1554
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:93
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
emit_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:3971
to_function at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\codegen.cpp:832
jl_call_method_internal at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia_internal.h:88
jl_apply at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\julia.h:1384
jl_init_restored_modules at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\dump.c:1884
_julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\init.c:710
julia_init at /home/s2sajs/julia-master/src/home/s2sajs/julia-master/src\task.c:277
wmain at /home/s2sajs/julia-master/ui\repl.c:677
__tmainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:329
mainCRTStartup at /usr/src/debug/mingw64-x86_64-runtime-4.0.5-1/crt\crtexe.c:212
unknown function (ip: 00000000772459BC)
unknown function (ip: 000000007737A2E0)
Allocations: 834078 (Pool: 833350; Big: 728); GC: 1

@tkelman
Contributor

tkelman commented Jun 14, 2016

I resolved the mystery of why AppVeyor didn't catch this: it builds with JULIA_SYSIMG_BUILD_FLAGS="--output-ji ../usr/lib/julia/sys.ji" to test the ability to load from a .ji file without a corresponding .dll present, so maybe it follows a different code path when building sys.dll in the first place. If you run on Linux with --precompiled=no, you can also reproduce what is probably the same segfault.
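
One way to see how `--output-ji` could select a different path is the guard around the IR deletion in the diff above, which includes `!jl_options.outputji`. The sketch below restates that condition as a standalone C predicate. The function and parameter names are hypothetical, and it assumes (not confirmed in this thread) that `jl_options.outputji` is what the `--output-ji` flag sets.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical restatement of the guard from the diff:
 *   JL_DELETE_NON_INLINEABLE && li->def && li->inferred &&
 *   !li->inlineable && !jl_options.outputji
 * Each parameter stands in for one term of that expression. */
bool toy_should_delete_ir(bool delete_enabled, bool has_def, bool inferred,
                          bool inlineable, bool outputji)
{
    return delete_enabled && has_def && inferred && !inlineable && !outputji;
}

/* Compares a plain run with a run that has outputji set. */
void toy_demo(void)
{
    /* Plain run: inferred, non-inlineable code is eligible for deletion. */
    assert(toy_should_delete_ir(true, true, true, false, false));
    /* With outputji set (assumed to model --output-ji), nothing is deleted. */
    assert(!toy_should_delete_ir(true, true, true, false, true));
}
```

Under that assumption, a build that always passes `--output-ji` never takes the deletion branch, so it would not exercise the code that fails here.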

@JeffBezanson
Member

Thanks, this gives me some good stuff to work with.

@JeffBezanson
Member

Branch to try: #16936

@tkelman
Contributor

tkelman commented Jun 15, 2016

Looks like that'll work.

@josefsachsconning
Contributor Author

#16936 appears to work.

tkelman added a commit that referenced this issue Jun 16, 2016
this would have at least caught #16921 in local testing on
platforms other than Windows