MPI_Win_allocate_shared throws MPI_ERR_RMA_SHARED on osc component selection #9974

Closed
albandil opened this issue Feb 4, 2022 · 2 comments

albandil commented Feb 4, 2022

I am using self-compiled Open MPI from the 5.0.x development branch on a standard desktop system with openSUSE Tumbleweed. I have up-to-date Git submodules and have executed autogen.pl before compilation.

In commit 43b5d8c, ompi_osc_rdma_component_query was changed to always return OMPI_ERR_RMA_SHARED when shared memory functionality is queried; previously it returned -1. This change, however, causes component selection in ompi_osc_base_select to fail unnecessarily: that function fails whenever any of the available one-sided communication components returns OMPI_ERR_RMA_SHARED, even though other components would work perfectly fine.

To give an example, I tested compilation and execution of the following program:

#include <mpi.h>
#include <stdio.h>

int main (int argc, char* argv[])
{
    MPI_Win win;
    int *ptr, nproc, rank, size = sizeof(int), disp = 1;

    // The processes allocate a contiguous shared memory segment.
    // Each process controls a chunk the size of one int.
    // Each process writes its rank into its own chunk of the shared memory.
    // The rank-0 process then prints the contents of the whole segment (= all rank IDs).

    MPI_Init(&argc, &argv);
    MPI_Comm_size(MPI_COMM_WORLD, &nproc);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Win_allocate_shared(size, disp, MPI_INFO_NULL, MPI_COMM_WORLD, &ptr, &win);

    *ptr = rank;

    MPI_Win_fence(0, win);

    if (rank == 0)
    {
        for (int i = 0; i < nproc; i++)
        {
            printf("%d ", ptr[i]);
        }
        printf("\n");
    }

    MPI_Win_free(&win);
    MPI_Finalize();

    return 0;
}

When I compile the program with the current 5.0.x version and attempt to run it, I get:

[yunipher:00000] *** An error occurred in MPI_Win_allocate_shared
[yunipher:00000] *** reported by process [3017211905,0]
[yunipher:00000] *** on communicator MPI_COMM_WORLD
[yunipher:00000] *** MPI_ERR_RMA_SHARED: Memory cannot be shared
[yunipher:00000] *** MPI_ERRORS_ARE_FATAL (processes in this communicator will now abort,
[yunipher:00000] ***    and MPI will try to terminate your MPI job as well)

This can be avoided either by using a pre-43b5d8c version of the code or by manually excluding the broken "rdma" osc component (the "sm" component is then the only one considered):

$ mpiexec -n 1 --mca osc ^rdma ./test.x

I believe that the code in ompi_osc_base_select is overreacting. It should not pass through the error status OMPI_ERR_RMA_SHARED from a single component unless all available components are unusable for shared memory.

@bwbarrett bwbarrett self-assigned this Feb 4, 2022
@bwbarrett
Member

Yeah, WIN_ALLOCATE_SHARED is currently broken in both master and 5.0. There is no workaround on those branches. Fixing this is on my todo list.

@awlauria
Contributor

awlauria commented Apr 1, 2022

@albandil I confirmed that the provided test passes in v5.0.0rc4, available here: https://www.open-mpi.org/software/ompi/v5.0/

Please retest, and let us know if you run into any issues.

@awlauria awlauria closed this as completed Apr 1, 2022