Background information
What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)
v4.0.1 and v4.0.2
Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)
Installed from the source tarball (both with Intel Parallel Studio 2020.0.088 as well as with GCC-7.4.0).
Please describe the system on which you are running
- Operating system/version: Ubuntu 18.04.3 LTS
- Computer hardware: 2 x Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
- Network type:
Details of the problem
When I split the comm_world communicator into two groups (comm_shmem) and then try to allocate shmem segments on comm_shmem by means of MPI_win_allocate, I get the following error message:
--------------------------------------------------------------------------
A system call failed during shared memory initialization that should
not have. It is likely that your MPI job will now either abort or
experience performance degradation.
Local host: guppy01
System call: unlink(2) /dev/shm/osc_rdma.guppy01.fd690001.4
Error: No such file or directory (errno 2)
--------------------------------------------------------------------------
I used the following program:
PROGRAM test
  USE mpi_f08
  USE, INTRINSIC :: iso_c_binding, ONLY: c_ptr
  IMPLICIT NONE
  TYPE(MPI_comm) :: comm_world, comm_shmem
  TYPE(MPI_group) :: group_world,group_shmem
  TYPE(MPI_win) :: win
  TYPE(c_ptr) :: baseptr
  INTEGER(KIND=MPI_ADDRESS_KIND) :: winsize
  INTEGER, ALLOCATABLE :: group(:)
  INTEGER :: nrank,irank,nrank_shmem,irank_shmem,nshmem
  INTEGER :: i,n,sizeoftype
  INTEGER :: ierror

  CALL MPI_init( ierror )
  comm_world = MPI_comm_world
  CALL MPI_comm_rank( comm_world, irank, ierror )
  CALL MPI_comm_size( comm_world, nrank, ierror )
  WRITE(*,'(a,i4,2x,a,i4)') 'nrank:',nrank,'irank:',irank

  ! Collect the world ranks that share this rank's block of nshmem ranks
  ALLOCATE(group(0:nrank-1))
  nshmem=4
  n=0
  DO i=0,nrank-1
    IF (i/nshmem == irank/nshmem) THEN
      group(n)=i
      n=n+1
    ENDIF
  ENDDO

  ! Build comm_shmem from that list of ranks
  CALL MPI_comm_group( comm_world, group_world, ierror )
  CALL MPI_group_incl( group_world, n, group, group_shmem, ierror )
  CALL MPI_comm_create( comm_world, group_shmem, comm_shmem, ierror )
  DEALLOCATE(group)
  CALL MPI_comm_rank( comm_shmem, irank_shmem, ierror )
  CALL MPI_comm_size( comm_shmem, nrank_shmem, ierror )
  WRITE(*,'(a,i4,2x,a,i4)') 'irank:',irank,'irank_shmem:',irank_shmem

  ! Allocate a window of 10 default integers on comm_shmem, then free it
  CALL MPI_sizeof( i, sizeoftype, ierror )
  winsize=10*sizeoftype
  CALL MPI_win_allocate( winsize, sizeoftype, MPI_INFO_NULL, comm_shmem, baseptr, win, ierror )
  CALL MPI_win_free( win, ierror )

  CALL MPI_finalize( ierror )
END PROGRAM
and ran it with 8 ranks:
mpirun -mca shmem mmap -np 8 test
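With 8 ranks and nshmem=4, the integer division i/nshmem puts world ranks 0-3 in one group and ranks 4-7 in the other. For anyone trying to narrow this down, the same partition can probably be obtained more compactly with MPI_Comm_split. The following is only a sketch: the program name test_split is made up, STORAGE_SIZE is swapped in for MPI_sizeof, and I have not checked whether this variant triggers the same unlink(2) error.

```fortran
PROGRAM test_split
  USE mpi_f08
  USE, INTRINSIC :: iso_c_binding, ONLY: c_ptr
  IMPLICIT NONE
  TYPE(MPI_Comm) :: comm_shmem
  TYPE(MPI_Win)  :: win
  TYPE(c_ptr)    :: baseptr
  INTEGER(KIND=MPI_ADDRESS_KIND) :: winsize
  INTEGER, PARAMETER :: nshmem = 4   ! same block size as in the report
  INTEGER :: irank, irank_shmem, sizeoftype

  CALL MPI_Init()
  CALL MPI_Comm_rank( MPI_COMM_WORLD, irank )

  ! Ranks with the same color (irank/nshmem) end up in the same communicator;
  ! using irank as the key keeps the world ordering inside each group.
  CALL MPI_Comm_split( MPI_COMM_WORLD, irank/nshmem, irank, comm_shmem )
  CALL MPI_Comm_rank( comm_shmem, irank_shmem )
  WRITE(*,'(a,i4,2x,a,i4)') 'irank:', irank, 'irank_shmem:', irank_shmem

  ! Assumption: STORAGE_SIZE (bits) / 8 replaces MPI_sizeof from the original
  sizeoftype = STORAGE_SIZE(irank) / 8
  winsize = 10 * sizeoftype
  CALL MPI_Win_allocate( winsize, sizeoftype, MPI_INFO_NULL, comm_shmem, baseptr, win )
  CALL MPI_Win_free( win )

  CALL MPI_Comm_free( comm_shmem )
  CALL MPI_Finalize()
END PROGRAM test_split
```

It would be run the same way as the original, e.g. with -np 8 and -mca shmem mmap.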
Switching to "posix" (mpirun -mca shmem posix ...) gets rid of this error, but it has problems of its own, for which I'll submit a separate issue.