MPI implementations intercepting Signals is incompatible with Julia GC safepoint #725

alexandrebouchard · 2023-03-10T23:10:02Z

Thanks again for your help with #720. This one is unrelated, except that issue #720 led us to write more comprehensive unit tests, which revealed this new, probably unrelated segfault.

Summary of this problem: a segfault occurs when GC is triggered in a multithreaded+MPI context.

How to reproduce: I have created a draft PR adding a GC.gc() call to one of MPI.jl's existing multithreaded tests: see PR #724

The draft PR is based off the most recent commit where all tests passed (Tag 0.20.8). In the output of "test-intel-linux", the salient output is

signal (11): Segmentation fault
in expression starting at /home/runner/work/MPI.jl/MPI.jl/test/test_threads.jl:18
ijl_gc_enable at /cache/build/default-amdci4-2/julialang/julia-release-1-dot-8/src/gc.c:2955

The change we made is in the file test/test_threads.jl, where we added the following if clause:

    Threads.@threads for i = 1:N
        reqs[N+i] = MPI.Irecv!(@view(recv_arr[i:i]), comm; source=src, tag=i)
        reqs[i] = MPI.Isend(@view(send_arr[i:i]), comm; dest=dst, tag=i)
        if i == 1
            # Force a collection while other threads are issuing MPI calls
            GC.gc()
        end
    end

We experience similar problems with MPICH 4.0 in our package (https://github.com/Julia-Tempering/Pigeons.jl), but not with MPICH 4.1.

Related discussions

https://juliaparallel.org/MPI.jl/stable/knownissues/#Multi-threading-and-signal-handling

This describes a similar issue in the context of UCX. However, from our investigations so far, the problem does not seem limited to UCX.

segmentation faults when combined with @threads with memory allocation #337

This describes a similar issue in the context of OpenMPI. However, it seems that certain versions of MPICH and Intel MPI (which is MPICH-derived) might suffer from a similar issue.

In light of these two sources, perhaps other environment variables in the style of

MPI.jl/src/MPI.jl

Line 133 in 6d513bb

ENV["UCX_ERROR_SIGNALS"] = "SIGILL,SIGBUS,SIGFPE"

could be set to address this issue? I was wondering whether that is a reasonable hypothesis. Having limited MPI experience, I am not sure what these environment variables might be.

Thank you so much for your time.


vchuravy · 2023-03-11T02:30:41Z

In a multi-threaded environment Julia uses segmentation faults on special addresses for its safepoint implementation. If the MPI implementation intercepts signals, this will cause spurious aborts.

UCX is a library that does this and so for a better experience we tell it not to. Generally Julia will handle signals for the user.

alexandrebouchard · 2023-03-11T15:17:18Z

That's right, @vchuravy. The point we are documenting here is that this problem is not limited to UCX: it affects other MPI implementations too, including some that are currently in MPI.jl's set of test cases (see "test-intel-linux" in #724, which shows that MPI.jl with Intel MPI currently crashes when GC happens in a multithreaded context).

vchuravy · 2023-03-11T15:46:16Z

If you can figure out how to tell Intel MPI not to intercept signals we can add that as a vendor specific workaround.

alexandrebouchard · 2023-03-11T15:56:31Z

We will do some research on that, thank you.

However, it seems that a more principled approach would be to tell Julia to use another signal for GC coordination, since it seems that in any situation where Julia is used as a child process, GC+multithreading would trigger a crash. Otherwise this leads to a kind of Whac-A-Mole situation, where the issue has to be addressed for every possible parent process, some of which could potentially be closed source (like the situation here).

alexandrebouchard · 2023-03-11T16:05:54Z

Also it looks like that issue was reported here: https://discourse.julialang.org/t/julia-crashes-inside-threads-with-mpi/52400/5

From a quick look there is no obvious ENV-based workaround for Intel MPI.

Add to the list of MPI systems incompatible with GC+multithreading: MPICH 4.0 (but not MPICH 4.1!).

vchuravy · 2023-03-12T00:17:49Z

> However, it seems that a more principled approach would be to tell Julia to use another signal for GC coordination, since it seems that in any situation where Julia is used as a child process, GC+multithreading would trigger a crash

Let's be precise here. Julia does not crash; the MPI implementation is misreporting a signal as a crash.

The Julia GC safepoint needs to be very low-overhead and is implemented as a load from an address. When GC needs to be triggered, Julia sets the safepoint to hot, i.e. it marks the page from which the load happens as inaccessible. The OS then delivers a signal to the process, and Julia inspects the faulting address to ensure that the signal was caused by the safepoint.

While there are different alternatives one could implement, this method has the lowest overhead during execution of the program (and while I am interested in experimenting with different alternatives, I don't expect these experiments to bear fruit any time soon).
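
To make the distinction above concrete, here is a toy model in Julia of the decision a signal handler faces when a segmentation fault arrives. This is only a sketch: the names `SafepointPage`, `classify_fault`, and `classify_fault_intercepting`, as well as the addresses, are hypothetical, and the real check lives in the Julia runtime's C signal handler rather than in Julia code.

```julia
# Toy model only: the actual check happens in the Julia runtime's C signal handler.
# All names and addresses below are hypothetical.

struct SafepointPage
    start::UInt   # first address of the protected page
    len::UInt     # page length in bytes
end

# True if the faulting address lies inside the safepoint page.
in_page(page::SafepointPage, addr::UInt) =
    page.start <= addr < page.start + page.len

# A handler that inspects the address can tell a safepoint from a real fault.
classify_fault(page::SafepointPage, addr::UInt) =
    in_page(page, addr) ? :safepoint : :real_segfault

# A handler that ignores the address reports every fault as a crash.
classify_fault_intercepting(::SafepointPage, ::UInt) = :real_segfault

page = SafepointPage(UInt(0x1000), UInt(4096))
```

In this model, a library that installs something like `classify_fault_intercepting` in front of Julia's handler would turn an ordinary GC safepoint into a reported segmentation fault, which matches the symptom described in this issue.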

> some of which could potentially be closed source

I would encourage you to file a ticket with the vendor of the software.

vchuravy · 2023-03-12T00:23:24Z

Can you see which libfabric version Intel MPI is using? There was a signal-handler-related bugfix that landed in v1.10.0rc1 (ofiwg/libfabric#5613)

alexandrebouchard · 2023-03-13T19:01:38Z

According to

MPI.jl/.github/workflows/UnitTests.yml

Line 242 in 6d513bb

key: ${{ runner.os }}-intelmpi-2019.9.304

this particular failed test is on intelmpi-2019.9.304

vchuravy · 2023-03-13T19:42:55Z

@simonbyrne the latest is 2021.8.0; maybe worth an update?

simonbyrne · 2023-03-21T04:28:37Z

Is that the same as oneAPI MPI? We already test that (thanks to @giordano)

simonbyrne · 2023-03-21T04:31:10Z

@alexandrebouchard what version of Intel MPI are you using? And what is your libfabric version?

alexandrebouchard · 2023-03-21T13:23:04Z

I am travelling this week, but let me get back to you on this soon!

vtjnash · 2023-06-23T19:35:40Z

Intel PSM has the same issue: it requires the environment variable IPATH_NO_BACKTRACE to exist in order not to crash. This is undocumented; see the source here:

https://github.com/intel/psm/blob/e5b9f1cbf432161639cb5c51d17b196c92eb4278/ipath/ipath_debug.c#L162

Similar to UCX as documented here:
https://juliaparallel.org/MPI.jl/stable/knownissues/#Multi-threading-and-signal-handling

giordano · 2023-06-24T10:35:15Z

Also OpenMPI sets the same environment variable for a similar reason: https://docs.open-mpi.org/en/main/news/news-v2.x.html

Change the behavior for handling certain signals when using PSM and PSM2 libraries. Previously, the PSM and PSM2 libraries would trap certain signals in order to generate tracebacks. The mechanism was found to cause issues with Open MPI’s own error reporting mechanism. If not already set, Open MPI now sets the IPATH_NO_BACKTRACE and HFI_NO_BACKTRACE environment variables to disable PSM/PSM2’s handling these signals.

https://github.com/open-mpi/ompi/blob/4216f3fc13079b80f64c07987935345189206064/opal/runtime/opal_init.c#L98-L115

    /* Very early in the init sequence -- before *ANY* MCA components
       are opened -- we need to disable some behavior from the PSM and
       PSM2 libraries (by default): at least some old versions of
       these libraries hijack signal handlers during their library
       constructors and then do not un-hijack them when the libraries
       are unloaded.

       It is a bit of an abstraction break that we have to put
       vendor/transport-specific code in the OPAL core, but we're
       out of options, unfortunately.

       NOTE: We only disable this behavior if the corresponding
       environment variables are not already set (i.e., if the
       user/environment has indicated a preference for this behavior,
       we won't override it). */
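
The "only if not already set" rule in the comment above could be sketched in Julia roughly as follows. The helper name `set_default!` and the value `"1"` are assumptions made for illustration; the example runs against a plain `Dict` standing in for the environment, and in real use one would pass `ENV`, presumably before the MPI library is loaded. Note that, per this thread, setting these variables has not been confirmed to be sufficient for Intel MPI.

```julia
# Sketch: set a default only when the user has not expressed a preference.
# `set_default!` is a hypothetical helper, not part of MPI.jl.
function set_default!(env, name, value)
    haskey(env, name) || (env[name] = value)
    return env[name]
end

# A Dict stands in for the process environment; here the user has already
# chosen a value for HFI_NO_BACKTRACE, which must be left untouched.
env = Dict{String,String}("HFI_NO_BACKTRACE" => "0")

for name in ("IPATH_NO_BACKTRACE", "HFI_NO_BACKTRACE")
    set_default!(env, name, "1")
end
```

After this runs, `IPATH_NO_BACKTRACE` has been given the default, while the pre-existing `HFI_NO_BACKTRACE` keeps the user's value.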

giordano · 2023-06-24T16:38:17Z

It doesn't look like setting IPATH_NO_BACKTRACE=1 is sufficient: #742 😞

vchuravy changed the title ~~GC in a multithreaded MPI context causing segfaults beyond UCX~~ MPI implementations intercepting Signals is incompatible with Julia GC safepiint Mar 11, 2023

vchuravy changed the title ~~MPI implementations intercepting Signals is incompatible with Julia GC safepiint~~ MPI implementations intercepting Signals is incompatible with Julia GC safepoint Mar 11, 2023

giordano added the Intel MPI label Mar 11, 2023

alexandrebouchard mentioned this issue Mar 12, 2023

Openmpi bugs Julia-Tempering/Pigeons.jl#31

Closed

alexandrebouchard mentioned this issue Mar 13, 2023

Segfault fix Julia-Tempering/Pigeons.jl#30

Merged

vchuravy mentioned this issue Jun 23, 2023

segmentation fault with multi-threading JuliaPy/PythonCall.jl#219

Open

giordano linked a pull request Jun 24, 2023 that will close this issue

Avoid segmentation faults in threaded code when using Intel MPI #742

Draft

giordano mentioned this issue Jul 19, 2023

Multiple CI jobs broken on master #749

Closed

5 tasks

giordano mentioned this issue Nov 3, 2023

Skip threads tests which are known to fail #791

Merged
