ENH Add SBSA aarch64 CTK pkgs for CUDA 11.X #56

Closed
wants to merge 6 commits into from


Contributor

@mike-wendt mike-wendt commented Mar 16, 2021

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

Add aarch64 support with the SBSA runfile for CUDA 11.X versions

NOTE: This does not support the Jetson/AGX platform. Those images use L4T which is not compatible with the SBSA installs. Currently Jetson/AGX only support CUDA 10.2 and SBSA only supports CUDA 11.X so there is no overlap at the moment. We're actively working with internal teams on how to best address this when AGX adopts CUDA 11.X support but that is still TBD.

For Jetson/AGX builds that use conda, it's recommended to use a selector and omit the CTK from recipes for now. This means users have to ensure the correct version of CUDA is installed on their system, and it doesn't allow for versioning, but the current JetPack for AGX/Jetson only supports CUDA 10.2.

@conda-forge-linter

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • The recipe license should not include the word "License".

For recipe:

  • License is not an SPDX identifier (or a custom LicenseRef) nor an SPDX license expression.

Documentation on acceptable licenses can be found here.

@mike-wendt mike-wendt added the enhancement New feature or request label Mar 16, 2021
@mike-wendt
Contributor Author

@jakirkham I did not make this change, but I wanted input on this section. There are sysroot versions for aarch64 and ppc64le. Should I add them to this section or not?

{% if "sysroot_version" in cudavars[major_minor] %}
- sysroot_linux-64 {{ cudavars[major_minor]["sysroot_version"] }} # [linux64]
{% endif %}
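
For readers less familiar with the template, here is a rough Python sketch of what the conditional above does. It assumes the `# [linux64]` selector only matches the `linux-64` platform; the `cudavars` contents and the `sysroot_requirement` helper are hypothetical stand-ins, not the feedstock's actual values.

```python
# Hypothetical stand-in for the recipe's `cudavars` mapping. The real values
# live in the feedstock's meta.yaml and are not shown in this thread.
cudavars = {
    "10.2": {},
    "11.0": {"sysroot_version": "2.17"},  # illustrative value only
}


def sysroot_requirement(major_minor, target_platform, cudavars):
    """Return the requirement line the template would emit, or None."""
    # Mirrors the `# [linux64]` selector: only x86_64 Linux gets the pin.
    if target_platform != "linux-64":
        return None
    # Mirrors `{% if "sysroot_version" in cudavars[major_minor] %}`.
    entry = cudavars[major_minor]
    if "sysroot_version" not in entry:
        return None
    return f"sysroot_linux-64 {entry['sysroot_version']}"
```

Under these assumptions, an aarch64 or ppc64le build gets no sysroot pin at all unless a matching line is added for those platforms.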

@mike-wendt
Contributor Author

FYI I've been seeing a lot of failures with retrieving the CUDA runfiles and the site itself takes a while to load. May have to wait and re-run the CI later when the CDN is in a better state.
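
If the download failures are transient, one option besides re-running CI by hand is a retry with exponential backoff. The sketch below is generic and not part of this feedstock; `retry` and its parameters are hypothetical names, and `fetch` stands for whatever callable performs the download.

```python
import time


def retry(fetch, attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds, doubling the wait after each failure."""
    for attempt in range(attempts):
        try:
            return fetch()
        except OSError:
            # urllib.error.URLError and socket timeouts both derive from OSError.
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)
```

The `sleep` parameter is injectable so the helper can be exercised without actually waiting.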

@jakirkham
Member

So that comes from PR ( #16 ). AIUI this was only needed for x86_64, as we support CentOS 6 & 7 there and needed a way to force CentOS 7 to be used with CUDA 11.0+. With aarch64 and ppc64le, we only support CentOS 7. So don't think that should be needed

@isuruf
Member

isuruf commented Mar 16, 2021

> So don't think that should be needed

AFAIK, sbsa aarch64 doesn't support centos 7.

@mike-wendt
Contributor Author

> > So don't think that should be needed
>
> AFAIK, sbsa aarch64 doesn't support centos 7.

Right, that's why I didn't include it. I just didn't know whether there was a higher glibc version to match the CentOS 8 requirement. If the answer is to leave it as-is, that works for me.

@isuruf
Member

isuruf commented Mar 16, 2021

You can merge it here without it, but we'll not be able to use it anywhere.

@mike-wendt
Contributor Author

mike-wendt commented Mar 16, 2021

So the aarch64 jobs are failing due to curand and the glibc version it expects. I'm guessing this is because we are building on CentOS 7 images for a product that supports CentOS 8, Ubuntu 18.04/20.04

Finding curand from Conda environment
	located at $PREFIX/lib/libcurand.so.10.2.1.245
	trying to open library...	ERROR: failed to open curand:
/lib64/libm.so.6: version `GLIBC_2.27' not found (required by $PREFIX/lib/libcurand.so.10.2.1.245)

Are there any workarounds for this? I'm guessing the answer is no without a heavy lift to support cos8 directly in conda-forge.
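
To make the mismatch in the log concrete, here is a small sketch that pulls the required glibc version out of a loader error like the one above and compares it with the build image's glibc. CentOS 7 ships glibc 2.17; the helper names are hypothetical.

```python
import re


def required_glibc(error_text):
    """Return the highest GLIBC_x.y version named in a loader error, or None."""
    versions = [
        (int(major), int(minor))
        for major, minor in re.findall(r"GLIBC_(\d+)\.(\d+)", error_text)
    ]
    return max(versions) if versions else None


def host_satisfies(host_glibc, error_text):
    """True if a host with glibc `host_glibc` could have loaded the library."""
    needed = required_glibc(error_text)
    return needed is None or host_glibc >= needed


error = (
    "/lib64/libm.so.6: version `GLIBC_2.27' not found "
    "(required by $PREFIX/lib/libcurand.so.10.2.1.245)"
)
CENTOS7_GLIBC = (2, 17)  # glibc shipped with CentOS 7
CENTOS8_GLIBC = (2, 28)  # glibc shipped with CentOS 8
```

With these values, `host_satisfies(CENTOS7_GLIBC, error)` is false and `host_satisfies(CENTOS8_GLIBC, error)` is true, which matches the guess that the CentOS 7 build image is the problem.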

@jakirkham
Member

> > So don't think that should be needed
>
> AFAIK, sbsa aarch64 doesn't support centos 7.

Yep, you're right.

Also seeing ppc64le requires CentOS 8 for CUDA 11.2 as well. Does anyone know when that changed?

https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements


In terms of support in conda-forge, here are the things we will need:

  • Docker images for CentOS 8
  • Rebuild of compilers for CentOS 8
  • Updating cudatoolkit (this PR)
  • Updating nvcc package
  • Migrator (or some strategy to handle migration) to add this to relevant feedstock
  • Anything else?

cc @beckermr (as well who went through this before with CentOS 7 in case he spots anything)

@mike-wendt
Contributor Author

mike-wendt commented Mar 17, 2021

> Are there any workarounds for this?

> Also seeing ppc64le requires CentOS 8 for CUDA 11.2 as well. Does anyone know when that changed?
>
> https://docs.nvidia.com/cuda/cuda-installation-guide-linux/index.html#system-requirements

@jakirkham From scrubbing through the docs, this looks like the breakdown. Starting with CUDA 11.0, CentOS 8 was required on aarch64 and ppc64le:

| CUDA | Released | x86-64 OS support | aarch64 OS support | ppc64le OS support |
|---|---|---|---|---|
| 10.2 | Nov 2019 | RHEL 6.10/7.7/8.1 | N/A | RHEL 7.6 & Ubuntu 18.04.1 |
| 11.0_GA | May 2020 | RHEL 7.x/8.y (x <= 8, y <= 2) | RHEL 8.y (y <= 2) & Ubuntu 18.04.z (z <= 4) | RHEL 8.y (y <= 2) & Ubuntu 18.04.z (z <= 4) |
| 11.1.0 | Sep 2020 | RHEL 7.x/8.y (x <= 8, y <= 2) | RHEL 8.y (y <= 2) & SLES 15.y (y <= 2) & Ubuntu 18.04.z (z <= 5) | RHEL 8.y (y <= 2) & Ubuntu 18.04 LTS |
| 11.2.0 | Dec 2020 | RHEL 7.x/8.y (x <= 9, y <= 3) | RHEL 8.y (y <= 3) & SLES 15.y (y <= 2) & Ubuntu 18.04.z (z <= 5) | RHEL 8.y (y <= 3) & Ubuntu 18.04 LTS |
| 11.2.1 | Feb 2021 | RHEL 7.x/8.y (x <= 9, y <= 3) | RHEL 8.y (y <= 3) & SLES 15.y (y <= 2) & Ubuntu 18.04.z (z <= 5) | RHEL 8.y (y <= 3) |
| 11.2.2 | Mar 2021 | RHEL 7.x/8.y (x <= 9, y <= 3) | RHEL 8.y (y <= 3) & SLES 15.y (y <= 2) & Ubuntu 18.04.z (z <= 5) | RHEL 8.y (y <= 3) |

Notes

  • Even though docs for CUDA 11.2.0 show ppc64le support for Ubuntu 18.04 there are no downloads
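
As a quick way to query the breakdown above, the RHEL/CentOS major versions from the table can be encoded as a lookup. This is only a transcription of the table (major versions only, Ubuntu and SLES omitted); the names `RHEL_MAJORS` and `supports_centos` are hypothetical.

```python
# RHEL/CentOS major versions listed per CUDA release and architecture,
# transcribed from the table above. An empty set means no support listed.
RHEL_MAJORS = {
    "10.2": {"x86_64": {6, 7, 8}, "aarch64": set(), "ppc64le": {7}},
    "11.0": {"x86_64": {7, 8}, "aarch64": {8}, "ppc64le": {8}},
    "11.1": {"x86_64": {7, 8}, "aarch64": {8}, "ppc64le": {8}},
    "11.2": {"x86_64": {7, 8}, "aarch64": {8}, "ppc64le": {8}},
}


def supports_centos(cuda_major_minor, arch, centos_major):
    """True if the table lists this RHEL/CentOS major for the CUDA release."""
    return centos_major in RHEL_MAJORS[cuda_major_minor][arch]


# Architectures where CUDA 11.2 can still be built on CentOS 7 images.
centos7_ok = sorted(
    arch for arch in RHEL_MAJORS["11.2"] if supports_centos("11.2", arch, 7)
)
```

Here `centos7_ok` contains only `x86_64`, which is why the aarch64 and ppc64le builds need CentOS 8 images.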

@mike-wendt mike-wendt mentioned this pull request Mar 18, 2021
5 tasks
@jakirkham
Member

^ @jaimergp

@jaimergp
Member

Hm, that will require some rebuilds, I guess? Or at least some repodata patching. I'm knee deep in other stuff at work but we will be releasing a new OpenMM version soon that brings full PPC support, so I guess I will put some work into the non-x64 builds too!

@jakirkham
Member

Yep just letting you know 🙂

@jakirkham
Member

Also Anthony had started working on a Docker image for ARM in PR ( conda-forge/docker-images#158 )

No headers are included so this is just a tweak to document and reflect this
@jakirkham
Member

@conda-forge-admin, please re-render

Labels
enhancement New feature or request

5 participants