figure out if we need to exclude the gcc9 builds for cuda <=10.2 #68

beckermr · 2020-10-09T18:32:11Z

leofang · 2020-10-09T18:34:51Z

The answer is yes. CUDA 9.2, which is the CUDA version we use to compile the CUDA-awareness support, only supports up to GCC 7 (?), I think.

leofang · 2020-10-09T18:35:27Z

What is less clear to me is whether this would work with CUDA 11, which requires a different glibc version.

beckermr · 2020-10-09T18:37:15Z

Thanks @leofang ! See also the comments from

conda-forge/conda-forge.github.io#1160 (comment)

by @kkraus14. They seem to indicate that if openmpi only uses the CUDA host APIs, there is no issue.

jakirkham · 2020-10-09T18:46:21Z

With CUDA 11.0, the nvcc package effectively requires GLIBC 2.17+ at build time and adds this dependency to packages for install time ( conda-forge/nvcc-feedstock#43 ). So there is nothing special packages need to do to get that constraint when building CUDA 11.0 support.

leofang · 2020-10-09T18:51:53Z

@jakirkham But do we get the same constraint if we build openmpi only with cuda_compiler_version=="9.2"?

jakirkham · 2020-10-09T19:16:48Z

No, because that constraint is only added when building for CUDA 11.0. Otherwise we still use the default GLIBC, which is currently 2.12.
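To make the rule above concrete, here is a minimal Python sketch of which minimum GLIBC a build ends up with, depending on the CUDA version it is built for. The function name `min_glibc_for` and the lookup table are hypothetical; only the version numbers (2.17 for CUDA 11.0, 2.12 as the default) come from this thread.

```python
# Hypothetical summary of the rule described in this thread; this is not
# how the nvcc package actually implements the constraint.
DEFAULT_GLIBC = (2, 12)            # default GLIBC at the time of this thread
GLIBC_BY_CUDA = {"11.0": (2, 17)}  # constraint added only for CUDA 11.0 builds


def min_glibc_for(cuda_compiler_version: str) -> tuple:
    """Return the minimum GLIBC a build for this CUDA version would require."""
    return GLIBC_BY_CUDA.get(cuda_compiler_version, DEFAULT_GLIBC)


for version in ("9.2", "10.2", "11.0"):
    print(version, min_glibc_for(version))
```

Under these assumptions, an openmpi package built only with cuda_compiler_version=="9.2" carries the default 2.12 requirement, not the 2.17 one.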

beckermr · 2020-10-09T19:23:44Z

So what is the conclusion here? Do we understand enough about how openmpi is using CUDA to say whether the current builds are OK?

kkraus14 · 2020-10-09T19:26:25Z

I'm definitely not confident, but based on the build script it looks like only gcc / g++ are being used or configured: https://github.com/conda-forge/openmpi-feedstock/blob/master/recipe/build-mpi.sh

jakirkham · 2020-10-09T19:51:04Z

After digging back through things here, I think we came to the conclusion that openmpi was using dlopen for CUDA support (so might be ok), but there are some compile time checks that occur as well, which were a bit confusing. I don't think we ever came to a resolution on that.

xref: open-mpi/ompi#7334
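For readers unfamiliar with the dlopen pattern mentioned above, the following is a rough Python sketch of loading the CUDA driver library at runtime instead of linking against it at build time. It is only an illustration of the general technique, not Open MPI's actual code; the helper name `try_load_cuda_driver` is hypothetical.

```python
import ctypes


def try_load_cuda_driver(name="libcuda.so.1"):
    """Return a handle to the named shared library, or None if it cannot be loaded.

    ctypes.CDLL uses the platform's dynamic loader (dlopen on Linux), so
    nothing has to be linked in advance.
    """
    try:
        return ctypes.CDLL(name)
    except OSError:
        # Library is missing or not loadable: fall back to running without CUDA.
        return None


handle = try_load_cuda_driver()
if handle is None:
    print("CUDA driver not found; continuing without CUDA support")
else:
    print("CUDA driver loaded at runtime")
```

With this pattern the same binary can run on machines with and without a CUDA driver, which is why the compiler used for the host code may matter less than it first appears.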

leofang · 2020-10-09T19:56:57Z

Yeah, IIRC at build time we don't need nvcc, just the CUDA headers. We don't even need to link to the CUDA shared libraries, as John mentioned. We settled on CUDA 9.2 because we realized Open MPI doesn't really care about the more recent CUDA versions.

I am just a bit worried that when we specify cuda_compiler_version=="9.2":

openmpi-feedstock/recipe/meta.yaml

Line 16 in 5106c44

skip: true # [win or (linux64 and cuda_compiler_version != '9.2')]

we don't enforce the use of the newer glibc, so when we do conda install -c conda-forge openmpi cudatoolkit=11.0 we might have problems.
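As a reading aid for the skip line quoted above: conda-build selectors are Python expressions, so the effect of this one can be imitated with a small sketch. The `is_skipped` helper below is hypothetical and only mimics the evaluation; it is not conda-build's implementation.

```python
# The selector text is copied from the meta.yaml line quoted above.
SELECTOR = "win or (linux64 and cuda_compiler_version != '9.2')"


def is_skipped(win, linux64, cuda_compiler_version):
    """Evaluate the selector for one build variant (illustrative only)."""
    namespace = {
        "win": win,
        "linux64": linux64,
        "cuda_compiler_version": cuda_compiler_version,
    }
    # Builtins are disabled because the expression only needs these names.
    return bool(eval(SELECTOR, {"__builtins__": {}}, namespace))


for cuda in ("9.2", "10.2", "11.0"):
    state = "skipped" if is_skipped(False, True, cuda) else "built"
    print("linux64, CUDA", cuda, "->", state)
```

On linux64 only the 9.2 variant is built, which is why the glibc constraint attached to CUDA 11.0 builds never reaches this package.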

jakirkham · 2020-10-09T20:12:56Z

Well GLIBC is backwards compatible. So libraries built with an older GLIBC can always be installed on a system with a newer GLIBC. IOW one can install openmpi on a GLIBC 2.17 system (even though it is built using GLIBC 2.12) without issues.

I should add that cudatoolkit itself requires the system to support an equivalent CUDA version. So if the driver doesn't support that version, Conda won't be able to install it.

So I guess the question is: can one configure a CentOS 6 system with a driver new enough to support CUDA 11.0? I would think the answer is no, as the associated libraries would also require GLIBC 2.17+.
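The compatibility argument above boils down to a version comparison. A minimal sketch, assuming plain "major.minor" version strings; the helper names `parse_version` and `can_run` are hypothetical:

```python
def parse_version(text):
    """Turn a version string such as '2.17' into a comparable tuple (2, 17)."""
    return tuple(int(part) for part in text.split("."))


def can_run(built_against, system_glibc):
    """A binary built against an older GLIBC runs on an equal or newer one."""
    return parse_version(system_glibc) >= parse_version(built_against)


print(can_run("2.12", "2.17"))  # True: openmpi built with 2.12 on a 2.17 system
print(can_run("2.17", "2.12"))  # False: a 2.17 requirement on a 2.12 system
```

Comparing tuples rather than strings or floats avoids mistakes such as treating 2.9 as newer than 2.12.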

leofang · 2020-10-09T20:57:21Z

Sounds good, so it looks like conda install -c conda-forge openmpi cudatoolkit=11.0 on CentOS 7 (or any OS supporting CUDA 11) will just work.

I think this issue can be closed, then? The current status is:

We choose cuda_compiler_version=="9.2" to build Open MPI
The bot did two builds, with both gcc7 and gcc9
Both builds are built with glibc 2.12
glibc is backward compatible, so using the new builds with glibc 2.17 + CUDA 11 should work, and so should glibc 2.12 + CUDA <11.

leofang · 2020-12-30T05:07:29Z

With conda-forge/conda-forge-pinning-feedstock#1052 all CUDA builds (9.2 - 11.0) currently fall back to gcc 7.

beckermr mentioned this issue Oct 9, 2020

gcc 9.3.0 migration conda-forge/conda-forge.github.io#1160

leofang closed this as completed Dec 30, 2020
