OMPI build fails when cuda is enabled #8736

Closed
AboorvaDevarajan opened this issue Mar 30, 2021 · 1 comment

@AboorvaDevarajan
Member

Background information

What version of Open MPI are you using? (e.g., v3.0.5, v4.0.2, git branch name and hash, etc.)

ompi - master

Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

git clone

If you are building/installing from a git clone, please copy-n-paste the output from git submodule status.

ompi]$ git submodule status
 074b437c236852a898009573a2883dd9526207f1 3rd-party/openpmix (v1.1.3-2908-g074b437)
 eba2595e56b559ba3144f6e5e13453eca6642c24 3rd-party/prrte (dev-31073-geba2595)

Please describe the system on which you are running

* Operating system/version: RH8.3
* Computer hardware:  ppc64le

Details of the problem

$ git clone --recursive https://github.com/open-mpi/ompi.git
$ cd ompi
$ ./autogen.pl 
$ ./configure --disable-man-pages --enable-mca-no-build=btl-uct --enable-mpi1-compatibility     --with-cuda=/usr/local/cuda 
$ make -j 20
  CCLD     libopen-pal.la
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_free':
common_cuda.c:(.text+0x300): multiple definition of `mca_common_cuda_free'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x300): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `opal_cuda_memcpy':
common_cuda.c:(.text+0x3a0): multiple definition of `opal_cuda_memcpy'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x3a0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2ec): multiple definition of `opal_cuda_verbose'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2ec): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_malloc':
common_cuda.c:(.text+0x8b0): multiple definition of `mca_common_cuda_malloc'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x8b0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x328): multiple definition of `mca_common_cuda_enabled'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x328): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data+0x40): multiple definition of `cuda_event_max'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.data+0x40): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x320): multiple definition of `cuda_event_ipc_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x320): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x308): multiple definition of `cuda_event_ipc_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x308): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x318): multiple definition of `cuda_event_dtoh_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x318): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x300): multiple definition of `cuda_event_dtoh_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x300): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x310): multiple definition of `cuda_event_htod_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x310): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2f8): multiple definition of `cuda_event_htod_frag_array'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2f8): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_register_mca_variables':
common_cuda.c:(.text+0x15e0): multiple definition of `mca_common_cuda_register_mca_variables'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x15e0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2f0): multiple definition of `libcuda_handle'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2f0): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o): In function `mca_common_cuda_fini':
common_cuda.c:(.text+0x18b0): multiple definition of `mca_common_cuda_fini'
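
For a long log like the one above, it can help to reduce it to the list of symbols the linker reports as multiply defined. The following is a minimal sketch, not part of Open MPI: the helper name `duplicate_symbols` and the `sample` excerpt are made up for illustration, and the regular expression assumes GNU ld's "multiple definition of" wording as seen in this log.

```python
import re
from collections import OrderedDict

# Matches GNU ld diagnostics such as:
#   common_cuda.c:(.text+0x300): multiple definition of `mca_common_cuda_free'
DUP_RE = re.compile(r"multiple definition of [`'](?P<sym>[^'`]+)'")


def duplicate_symbols(log_text):
    """Return the multiply defined symbol names in first-seen order, without repeats."""
    seen = OrderedDict()
    for line in log_text.splitlines():
        match = DUP_RE.search(line)
        if match:
            seen.setdefault(match.group("sym"), None)
    return list(seen)


# A short excerpt in the same shape as the build output (hypothetical input).
sample = """\
common_cuda.c:(.text+0x300): multiple definition of `mca_common_cuda_free'
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):common_cuda.c:(.text+0x300): first defined here
mca/common/cuda/.libs/libmca_common_cuda_noinst.a(common_cuda.o):(.bss+0x2ec): multiple definition of `opal_cuda_verbose'
"""

print(duplicate_symbols(sample))
```

Running it against the full `make` output (for example, saved with `make -j 20 2>&1 | tee build.log`) would list each affected symbol once, which makes it easier to see that all of them come from `common_cuda.o`.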

CUDA builds break frequently. I suggest adding CUDA build checks to CI so that these failures are caught before they land.

@awlauria
Contributor

awlauria commented Apr 1, 2021

Thanks @AboorvaDevarajan. Closing as a duplicate of #8656.

@awlauria awlauria closed this as completed Apr 1, 2021
wckzhang added a commit to wckzhang/ompi that referenced this issue Apr 8, 2021
These symbols were causing compilation errors with cuda and the new
default statically linked components. Explicitly including common_cuda
is unnecessary because the MCA system adds it, which, when built as a
static library, caused duplicates.

Issue open-mpi#8736

Signed-off-by: William Zhang <wilzhang@amazon.com>
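
The reasoning in the commit message above can be pictured with a toy model. This is only a sketch under stated assumptions, not how the linker or the Open MPI build system is implemented: it assumes every symbol of a static archive is pulled in once per appearance of that archive on the link line, and the names `find_duplicate_definitions` and `libother.a` are hypothetical.

```python
from collections import Counter


def find_duplicate_definitions(link_line, archives):
    """Return symbols that would be defined more than once.

    Toy assumption: each archive listed on the link line contributes all of
    its symbols, once per time it is listed.
    """
    counts = Counter()
    for name in link_line:
        counts.update(archives[name])
    return sorted(sym for sym, n in counts.items() if n > 1)


# Symbol lists are abbreviated; "libother.a" is a made-up placeholder.
archives = {
    "libmca_common_cuda_noinst.a": ["mca_common_cuda_free", "opal_cuda_memcpy"],
    "libother.a": ["other_symbol"],
}

# Archive listed explicitly and then again by the component machinery.
explicit_plus_mca = [
    "libmca_common_cuda_noinst.a",
    "libother.a",
    "libmca_common_cuda_noinst.a",
]
# Archive listed only once, as after dropping the explicit inclusion.
mca_only = ["libother.a", "libmca_common_cuda_noinst.a"]

print(find_duplicate_definitions(explicit_plus_mca, archives))
print(find_duplicate_definitions(mca_only, archives))
```

In this model the first link line reports both CUDA symbols as duplicates and the second reports none, which mirrors the effect described in the commit message of removing the explicit `common_cuda` inclusion.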
wckzhang added a commit to wckzhang/ompi that referenced this issue Apr 13, 2021
These symbols were causing compilation errors with cuda and the new
default statically linked components. Explicitly including common_cuda
is unnecessary because the MCA system adds it, which, when built as a
static library, caused duplicates.

Issue open-mpi#8736

Signed-off-by: William Zhang <wilzhang@amazon.com>
(cherry picked from commit c81cdd7)