Segmentation fault when calling C code using GMP initialization together with OpenMP #33223

Closed
ederc opened this issue Sep 11, 2019 · 4 comments · Fixed by #33284
Labels
domain:bignums BigInt and BigFloat domain:multithreading Base.Threads and related functionality

Comments

@ederc

ederc commented Sep 11, 2019

I found a segmentation fault that occurs when Julia calls C code that uses GMP together with OpenMP and runs in parallel. If I run the same code directly from the C library, without going through Julia, the error does not appear. Note that the error occurs only if num_threads is set to a value greater than 1 in the OpenMP pragma.

Here is a minimal example:

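#include <gmp.h>
#include <omp.h>

/* Each of the 4 OpenMP threads initializes and clears its own mpz_t. */
void min_example(void)
{
    int i;
#pragma omp parallel for num_threads(4) private(i)
    for (i = 0; i < 4; ++i) {
        mpz_t x;         /* declared inside the loop, so private to the thread */
        mpz_init(x);     /* allocates through GMP's memory functions */
        mpz_clear(x);
    }
}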

Running the C code directly produces no error and no memory issue; I also checked it with valgrind.
Now doing a ccall like

using Libdl

# `clib` is the path to the shared library built from the C code above.
function test()
    lib = Libdl.dlopen(clib)
    sym = Libdl.dlsym(lib, :min_example)
    ccall(sym, Cvoid, ())
end

calling test() then crashes with a segmentation fault:

#0  0x00007ffff7b36871 in jl_gc_counted_malloc () from /usr/bin/../lib/libjulia.so.1
#1  0x00007fffe64fb008 in __gmpz_init () from /usr/bin/../lib/julia/libgmp.so

The error always appears when calling mpz_init(), including in the larger real-world code where I first encountered the problem.
Again, with num_threads(1) no error appears, but as soon as num_threads is greater than 1 and the computation runs in parallel, I hit this error. Note that x in the above example is private to each thread.

Here is the versioninfo:

Julia Version 1.2.0
Commit c6da87ff4b (2019-08-20 00:03 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-6.0.1 (ORCJIT, broadwell)

@stevengj
Member

At first I thought that maybe Julia's bundled MPFR library isn't built with the --enable-thread-safe option, but this doesn't seem to be the case:

julia> ccall((:mpfr_buildopt_tls_p,:libmpfr), Cint, ())
1

However, I should note that we also replace the GMP memory allocator with our own:

        ccall((:__gmp_set_memory_functions, :libgmp), Cvoid,
              (Ptr{Cvoid},Ptr{Cvoid},Ptr{Cvoid}),
              cglobal(:jl_gc_counted_malloc),
              cglobal(:jl_gc_counted_realloc_with_old_size),
              cglobal(:jl_gc_counted_free_with_size))

and your segfault is occurring in jl_gc_counted_malloc.

@vtjnash, is jl_gc_counted_malloc thread-safe?
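
For illustration, here is a minimal sketch of what byte-counting hooks that are safe to call from any thread could look like, using the three signatures that mp_set_memory_functions accepts. This is not Julia's jl_gc_counted_malloc; the names counted_alloc, counted_realloc, counted_free, counted_live_bytes and the live_bytes counter are hypothetical. The sketch assumes a C11 compiler and keeps the bookkeeping in a single atomic counter, so it does not depend on any per-thread state set up by a host runtime.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical byte counter (not Julia's GC accounting). It is atomic so
   that threads created outside the host runtime, e.g. by OpenMP, can
   update it concurrently. */
static atomic_size_t live_bytes = 0;

/* Allocation hook: same signature as GMP's alloc function pointer. */
void *counted_alloc(size_t n)
{
    void *p = malloc(n);
    if (p == NULL)
        abort();  /* GMP has no way to recover from a failed allocation */
    atomic_fetch_add(&live_bytes, n);
    return p;
}

/* Reallocation hook: GMP passes both the old and the new size. */
void *counted_realloc(void *p, size_t old_n, size_t new_n)
{
    void *q = realloc(p, new_n);
    if (q == NULL)
        abort();
    atomic_fetch_sub(&live_bytes, old_n);
    atomic_fetch_add(&live_bytes, new_n);
    return q;
}

/* Free hook: GMP passes the size of the block being released. */
void counted_free(void *p, size_t n)
{
    free(p);
    atomic_fetch_sub(&live_bytes, n);
}

/* Helper for inspecting the counter in this sketch. */
size_t counted_live_bytes(void)
{
    return atomic_load(&live_bytes);
}
```

With gmp.h included, hooks like these would be installed via mp_set_memory_functions(counted_alloc, counted_realloc, counted_free); passing NULL for an argument selects GMP's default function for that slot.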

@stevengj stevengj added domain:multithreading Base.Threads and related functionality domain:bignums BigInt and BigFloat labels Sep 12, 2019
@thofma
Contributor

thofma commented Sep 13, 2019

@ederc I think I ran into something similar when enabling threading for flint in Nemo. Can you try the hack at https://github.com/Nemocas/Nemo.jl/blob/bb40f2055d2fdd50e58ac032937ce75f1a1ea5fa/src/Nemo.jl#L226? (As a workaround)

@ederc
Author

ederc commented Sep 14, 2019

@thofma Thanks Tommy, your hack works for me.

Is this really the way one should do this or should we keep this issue open?

@thofma
Contributor

thofma commented Sep 14, 2019

I don't think this is the way one should do it. I hope that someone knowledgeable about the GC internals can help with this.

vtjnash added a commit that referenced this issue Sep 16, 2019
Better align the API of the jl_ wrappers for malloc/realloc/free with the libc namesakes,
including being safe to use on threads.

fix #33223
JeffBezanson pushed a commit that referenced this issue Sep 18, 2019
KristofferC pushed a commit that referenced this issue Sep 22, 2019

(cherry picked from commit 6c2c940)