tests segfault with Julia 0.4 #95

Closed
IainNZ opened this issue Oct 23, 2014 · 18 comments

@IainNZ
Contributor

IainNZ commented Oct 23, 2014

PackageEvaluator.jl is a script that runs nightly. It attempts to load all Julia packages and run their tests (if available) on both the stable version of Julia (0.3) and the nightly build of the unstable version (0.4). The results of this script are used to generate a package listing enhanced with testing results.

On Julia 0.4

  • On 2014-10-22 the testing status was Tests pass.
  • On 2014-10-23 the testing status changed to Tests fail, but package loads.

"Tests pass." means that PackageEvaluator found the tests for your package, executed them, and they all passed.

"Tests fail, but package loads." means that PackageEvaluator found the tests for your package, executed them, and they didn't pass. However, trying to load your package with "using" worked.

Special message from @IainNZ: This change may be due to JuliaLang/julia#8712.

This issue was filed because your testing status became worse. No additional issues will be filed if your package remains in this state, and no issue will be filed if it improves. If you'd like to opt out of these status-change messages, reply to this message saying so and @IainNZ will add an exception. If you'd like to discuss PackageEvaluator.jl, please file an issue at the repository. For example, your package may be untestable on the test machine due to a dependency, in which case an exception can be added.

Test log:

>>> 'Pkg.add("PyCall")' log
INFO: Installing PyCall v0.4.10
INFO: Package database updated

>>> 'using PyCall' log

WARNING: deprecated syntax "(Symbol=>Timer)[]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/gui.jl:127.
Use "Dict{Symbol,Timer}()" instead.

WARNING: deprecated syntax "[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/numpy.jl:169.
Use "Dict(a=>b, ...)" instead.

WARNING: deprecated syntax "(String=>Type)[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/numpy.jl:176.
Use "Dict{String,Type}(a=>b, ...)" instead.
Julia Version 0.4.0-dev+1249
Commit 23a9373 (2014-10-23 04:46 UTC)
Platform Info:
  System: Linux (x86_64-unknown-linux-gnu)
  CPU: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz
  WORD_SIZE: 64
  BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Sandybridge)
  LAPACK: libopenblas
  LIBM: libopenlibm
  LLVM: libLLVM-3.3

>>> test log

WARNING: deprecated syntax "(Symbol=>Timer)[]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/gui.jl:127.
Use "Dict{Symbol,Timer}()" instead.

WARNING: deprecated syntax "[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/numpy.jl:169.
Use "Dict(a=>b, ...)" instead.

WARNING: deprecated syntax "(String=>Type)[a=>b, ...]" at /home/idunning/pkgtest/.julia/v0.4/PyCall/src/numpy.jl:176.
Use "Dict{String,Type}(a=>b, ...)" instead.

WARNING: deprecated syntax "{a b; c d}" at /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl:47.
Use "Any[a b; c d]" instead.

WARNING: deprecated syntax "{a b; c d}" at /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl:49.
Use "Any[a b; c d]" instead.

WARNING: deprecated syntax "{a,b, ...}" at /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl:49.
Use "Any[a,b, ...]" instead.

WARNING: deprecated syntax "{a,b, ...}" at /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl:49.
... truncated ...
typeinf_3B_2430 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so (unknown line)
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
typeinf_ext at ./inference.jl:1230
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
unknown function (ip: -1731327797)
unknown function (ip: -1731325768)
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
array2py2arrayeq at /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl:55
jlcall_array2py2arrayeq;41062 at  (unknown line)
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
anonymous at test.jl:85
do_test at test.jl:47
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
unknown function (ip: -1731020464)
unknown function (ip: -1731023450)
unknown function (ip: -1730953426)
unknown function (ip: -1730950752)
jl_load at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
include at ./boot.jl:242
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
include_from_node1 at loading.jl:128
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
process_options at ./client.jl:293
_start at ./client.jl:362
_start_3B_3776 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so (unknown line)
jl_apply_generic at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
unknown function (ip: 4200466)
julia_trampoline at /home/idunning/julia04/usr/bin/../lib/libjulia.so (unknown line)
unknown function (ip: 4199453)
__libc_start_main at /usr/lib/libc.so.6 (unknown line)
unknown function (ip: 4199507)
unknown function (ip: 0)
INFO: Testing PyCall
===============================[ ERROR: PyCall ]================================

failed process: Process(`/home/idunning/julia04/usr/bin/julia /home/idunning/pkgtest/.julia/v0.4/PyCall/test/runtests.jl`, ProcessSignaled(11)) [0]

================================================================================
INFO: No packages to install, update or remove
ERROR: PyCall had test errors
 in error at error.jl:21
 in test at pkg/entry.jl:719
 in anonymous at pkg/dir.jl:28
 in cd at ./file.jl:20
 in cd at pkg/dir.jl:28
 in test at pkg.jl:68
 in process_options at ./client.jl:221
 in _start at ./client.jl:362
 in _start_3B_3776 at /home/idunning/julia04/usr/bin/../lib/julia/sys.so


>>> end of log
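
The deprecation warnings in the log all stem from the same literal-syntax change in Julia 0.4. A small sketch of the replacements the warnings suggest; the variable names and values are made up for illustration and are not taken from PyCall's source:

```julia
# Replacements for the deprecated literal syntax flagged in the log.
# All names and values below are hypothetical examples.
timers = Dict{Symbol,Timer}()                    # was: (Symbol=>Timer)[]
codes  = Dict("a" => 1, "b" => 2)                # was: ["a"=>1, "b"=>2]
types  = Dict{String,Type}("f8" => Float64)      # was: (String=>Type)["f8"=>Float64]
mat    = Any[1 2; 3 4]                           # was: {1 2; 3 4}
items  = Any[1, 3.2, "hello", true]              # was: {1, 3.2, "hello", true}
```

These warnings are noise as far as the crash is concerned; the deprecated forms still parsed on this nightly.
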
@IainNZ
Contributor Author

IainNZ commented Oct 23, 2014

I don't normally file "segfault"-y issues, assuming they are random noise, but PyCall isn't one to normally get them, so filing this just as a precaution.
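
For context, the `ProcessSignaled(11)` in the log means the test process was terminated by signal 11, which is SIGSEGV on Linux. A minimal sketch of how a harness might tell a signal-terminated child from an ordinary test failure, assuming a POSIX `sh` and a recent Julia. The `sh` command is only a stand-in for a crashing test run, and `termsignal` is an internal field of `Base.Process`, so treat this as illustrative rather than as what PackageEvaluator does:

```julia
# Stand-in for a crashing test: a shell that sends SIGSEGV to itself.
# (Single quotes inside a Julia command literal keep `$$` from being interpolated.)
cmd = `sh -c 'kill -s SEGV $$'`

# ignorestatus keeps run() from throwing when the child fails or is signaled.
p = run(ignorestatus(cmd))

if p.termsignal != 0
    println("killed by signal ", p.termsignal)   # a crash, not a failed assertion
else
    println("exited with code ", p.exitcode)
end
```
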

@stevengj
Member

Looks like this is a Julia issue.

@stevengj
Member

Hmm, #8798 fixed one problem, and sometimes the tests pass now, but they occasionally segfault, so there may be some other problem.

@IainNZ
Contributor Author

IainNZ commented Oct 24, 2014

Just glad to know it wasn't another spurious PkgEval segfault!

@fundamental

It looks like a possible memory management issue, as a fair number of the segfaults observed here trace back to array_dealloc in numpy/core/src/multiarray/arrayobject.c.

@stevengj
Member

@fundamental, I wonder what has changed. Do you observe the segfaults with Julia 0.3?

(Eventually, all of the numpy array stuff should be replaced with something like #70. cc @jakebolewski )

@fundamental

I get the feeling that it's a relatively recent commit within julia 0.4, though this is just based upon past usage and not based on running tests.
I don't even remember any PyCall related issues until my most recent v0.4 git update, but who knows what commit I was running prior to that (I think a very early v0.4 but I might be wrong here).
If you look at the referenced "Segfault caused by deserialize" bug within julia it seems to produce something moderately repeatable as of commit 93a33af.

These sorts of issues might have happened in the past, as most of my use was experimenting via the IPython notebook, where the occasional kernel restart was pretty forgettable.

If the test case passes on your machine, then I can test with an older commit. However, the magic numbers seem to need to be pretty exact to trigger the bad behavior (gc threshold, memory alignment, buffer overrun, etc.).

@stevengj
Member

Can you try a git bisect?

The test does not crash reliably for me.

@fundamental

I could, but that depends on how often the LLVM version changes. The normal build already takes long enough to run, and any LLVM build (or a rebuild of a few other deps) will take a considerable amount of time. The issue also appears to be some sort of memory corruption, so a passing test is no guarantee that the bug is absent.

Don't get me wrong, I want this fixed, but the non-determinism has me concerned.
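
One way to put a number on that concern: if a buggy commit crashes on any single run with probability p, then n independent runs all pass with probability (1 - p)^n. A rough sketch of how many repeats a bisect step would need before a clean result is believable. The values of p and alpha below are assumptions chosen for illustration, not measurements from this thread, and real runs may not be independent:

```julia
# Smallest n such that (1 - p)^n <= alpha, i.e. the chance that a buggy
# commit passes n runs in a row is at most alpha. Requires 0 < p < 1.
runs_needed(p, alpha) = ceil(Int, log(alpha) / log1p(-p))

# Hypothetical numbers: a 30% crash rate and a 1% tolerance for a false "good".
n = runs_needed(0.3, 0.01)
println(n)   # 13
```
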

@stevengj
Member

There is also the difficulty that a separate crash was caused by #8551. But the test could be modified to avoid that bug when bisecting older versions.

@fundamental

If it helps, I'm seeing the occasional printout of:

Exception SystemError: 'Objects/methodobject.c:120: bad argument to internal function' in ignored

while running code which seems to occasionally have this segfault.

@mweastwood
Contributor

@stevengj Tests are consistently segfaulting for me. If you want to revise the tests to avoid the first issue, I'm willing to bisect.

@mweastwood
Contributor

I tried bisecting without changing the tests, and I'm having a pretty rough time (there have been plenty of breakages in the past couple of weeks). I skipped whenever I got a compilation error or ERROR: convert has no method matching convert(::Type{VersionInterval}, ::VersionNumber).

I am not done and cannot narrow it down to a single commit, but it appears that the segfaults started with the changes to call overloading.
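
The skipping described here maps onto the exit-code convention of `git bisect run`: 0 marks a commit good, 1 to 124 marks it bad, and 125 tells git the commit cannot be tested and should be skipped. A hedged sketch of a driver along those lines; the helper names are hypothetical, the crash check relies on the internal `termsignal` field of `Base.Process`, and this is not the procedure anyone in the thread actually used:

```julia
# Exit codes understood by `git bisect run`.
const GOOD = 0
const BAD  = 1
const SKIP = 125

# Hypothetical helper: classify a commit from whether it built and how many
# of the repeated test runs were killed by a signal.
function bisect_exit_code(build_ok::Bool, crashes::Integer)
    build_ok || return SKIP            # untestable commit (e.g. compilation error)
    return crashes > 0 ? BAD : GOOD
end

# Hypothetical helper: run `cmd` n times and count signal-terminated runs.
# Repeating matters because the crash is intermittent.
function count_crashes(cmd::Cmd, n::Integer)
    crashes = 0
    for _ in 1:n
        p = run(ignorestatus(cmd))     # do not throw on failure
        crashes += (p.termsignal != 0)
    end
    return crashes
end
```

A wrapper script would then build Julia, call `count_crashes` on the PyCall test command, and `exit(bisect_exit_code(...))` with the result.
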

@stevengj
Member

@mweastwood, thanks for making the attempt. cc: @JeffBezanson

@stevengj
Member

stevengj commented Nov 5, 2014

This is about the minimal code snippet that reliably crashes for me:

using Base.Test, PyCall
roundtripeq(x) = convert(PyAny, PyObject(x)) == x
@test roundtripeq(C_NULL)
@test roundtripeq([1,3,4,5]) && roundtripeq([1,3.2,"hello",true])
@test roundtripeq([1 2 3;4 5 6]) && roundtripeq([1. 2 3;4 5 6])
@test roundtripeq((1,(3.2,"hello"),true))
@test roundtripeq(Int32)

If I delete just about anything it stops crashing; the unpredictable nature of this suggests some weird memory thing (an interaction with garbage collection in Julia and/or Python).

@mweastwood
Copy link
Contributor

@stevengj I can confirm that this snippet generates the segfault for me as well.

@stevengj stevengj changed the title from "[PkgEval] PyCall may have a testing issue on Julia 0.4 (2014-10-23)" to "tests segfault with Julia 0.4" on Nov 8, 2014
@stevengj
Member

stevengj commented Nov 8, 2014

Closing this in favor of JuliaLang/julia#8551, where I seem to have narrowed down the problem to a missing deserialize method.

@stevengj stevengj closed this as completed Nov 8, 2014
@stevengj
Member

stevengj commented Nov 8, 2014

Seems to be a PyCall bug, actually. Reopening.

@stevengj stevengj reopened this Nov 8, 2014

4 participants