Rebuild for CUDA 12 w/arch + Windows support #56

Merged

Conversation

regro-cf-autotick-bot
Contributor

This PR has been triggered in an effort to update cuda120.

Notes and instructions for merging this PR:

  1. Please merge the PR only after the tests have passed.
  2. Feel free to push to the bot's branch to update this PR if needed.

Please note that if you close this PR we presume that the feedstock has been rebuilt, so if you are going to perform the rebuild yourself, don't close this PR until your rebuild has been merged.


Here are some more details about this specific migrator:

The transition to CUDA 12 SDK includes new packages for all CUDA libraries and build tools. Notably, the cudatoolkit package no longer exists, and packages should depend directly on the specific CUDA libraries (libcublas, libcusolver, etc) as needed. For an in-depth overview of the changes and to report problems see this issue (conda-forge/conda-forge.github.io#1963). Please feel free to raise any issues encountered there. Thank you! 🙏


If this PR was opened in error or needs to be updated please add the bot-rerun label to this PR. The bot will close this PR and schedule another one. If you do not have permissions to add this label, you can use the phrase @conda-forge-admin, please rerun bot in a PR comment to have the conda-forge-admin add it for you.

This PR was created by the regro-cf-autotick-bot. The regro-cf-autotick-bot is a service to automatically track the dependency graph, migrate packages, and propose package version updates for conda-forge. Feel free to drop us a line if there are any issues! This PR was generated by https://github.com/regro/cf-scripts/actions/runs/6899793194, please use this URL for debugging.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro
Contributor

Some aarch64/ppc64le cuda builds are failing with:

  #$
  "/home/conda/feedstock_root/build_artifacts/librealsense_1700196953429/_build_env/bin"/aarch64-conda-linux-gnu-c++
  -D__CUDA_ARCH_LIST__=520 -E -x c++ -D__CUDACC__ -D__NVCC__
  -I"/home/conda/feedstock_root/build_artifacts/librealsense_1700196953429/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehol/targets/sbsa-linux/include"
  "-I/home/conda/feedstock_root/build_artifacts/librealsense_1700196953429/_build_env/bin/../targets/sbsa-linux/include"
  -D__CUDACC_VER_MAJOR__=12 -D__CUDACC_VER_MINOR__=0
  -D__CUDACC_VER_BUILD__=76 -D__CUDA_API_VER_MAJOR__=12
  -D__CUDA_API_VER_MINOR__=0 -D__NVCC_DIAG_PRAGMA_SUPPORT__=1 -include
  "cuda_runtime.h" "CMakeCUDACompilerId.cu" -o
  "tmp/CMakeCUDACompilerId.cpp4.ii"

  #$ cudafe++ --c++17 --gnu_version=120300 --display_error_number
  --orig_src_file_name "CMakeCUDACompilerId.cu" --orig_src_path_name
  "/home/conda/feedstock_root/build_artifacts/librealsense_1700196953429/work/build/CMakeFiles/3.27.8/CompilerIdCUDA/CMakeCUDACompilerId.cu"
  --allow_managed --unsigned_chars --unsigned_wchar_t --m64 --parse_templates
  --gen_c_file_name "tmp/CMakeCUDACompilerId.cudafe1.cpp" --stub_file_name
  "CMakeCUDACompilerId.cudafe1.stub.c" --gen_module_id_file
  --module_id_file_name "tmp/CMakeCUDACompilerId.module_id"
  "tmp/CMakeCUDACompilerId.cpp4.ii"

  nvcc error : 'cudafe++' died due to signal 11 (Invalid memory reference)

  nvcc error : 'cudafe++' core dumped

  # --error 0x8b --

Probably it is just an out of memory issue?

@traversaro
Contributor

@conda-forge-admin re-render

@traversaro
Contributor

Probably it is just an out of memory issue?

The file that is failing is a simple file that CMake uses to identify the CUDA compiler in use, so I think it is not an out of memory issue.

@jakirkham
Member

@conda-forge-admin , please re-render

conda-forge-webservices[bot] and others added 2 commits December 2, 2023 04:46
@jakirkham
Member

@conda-forge-admin , please re-render

conda-forge-webservices[bot] and others added 2 commits December 2, 2023 04:53
@jakirkham
Member

@conda-forge-admin , please re-render

@jakirkham
Member

Cleaned up disk space on Azure and switched to cross-compilation, which appears to have addressed the memory issue

@jakirkham
Member

jakirkham commented Dec 2, 2023

Seeing these errors from CMake on CI

CMake Error at /home/conda/feedstock_root/build_artifacts/librealsense_1701494354003/_build_env/share/cmake-3.27/Modules/FindPackageHandleStandardArgs.cmake:230 (message):
  Could NOT find CUDA (missing: CUDA_TOOLKIT_ROOT_DIR CUDA_CUDART_LIBRARY)
  (found version "12.0")

Maybe this logic needs to be reworked a bit

elif [[ ! -z "${cuda_compiler_version+x}" && "${cuda_compiler_version}" != "None" ]]
then
    echo "==> cuda_compiler_version=${cuda_compiler_version}, use CMake's CUDA support"
    CMAKE_ARGS="${CMAKE_ARGS} -DBUILD_WITH_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=all"
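
One possible shape for a reworked check, offered only as a sketch: the helper name `cuda_cmake_args` is hypothetical, the two CUDA flags are copied from the excerpt above, and `-DBUILD_WITH_CUDA=OFF` is an assumed complement for the CPU-only case. An unset, empty, or `None` value of `cuda_compiler_version` is treated as "no CUDA".

```shell
# Hypothetical helper (not part of the feedstock): prints the extra CMake
# arguments for a given value of cuda_compiler_version.
cuda_cmake_args() {
    # Unset or empty input falls back to "None", i.e. a CPU-only build.
    version="${1:-None}"
    if [ "${version}" != "None" ]; then
        # Flags taken verbatim from the build.sh excerpt above.
        echo "-DBUILD_WITH_CUDA=ON -DCMAKE_CUDA_ARCHITECTURES=all"
    else
        # Assumption: explicitly switch the option off for CPU-only builds.
        echo "-DBUILD_WITH_CUDA=OFF"
    fi
}

# Usage: append the result to whatever CMAKE_ARGS already holds.
CMAKE_ARGS="${CMAKE_ARGS:-} $(cuda_cmake_args "${cuda_compiler_version:-None}")"
```

Keeping the decision in one function means the same test covers every CUDA version the feedstock builds for, rather than one branch per version.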

@traversaro
Contributor

Thanks @jakirkham! Yes, now that only CUDA 11.8 and 12 are supported, we can simplify that code (I can work on this in a few days). Thanks for chiming in.

@jakirkham
Member

@conda-forge-admin , please re-render

1 similar comment
@jakirkham
Member

@conda-forge-admin , please re-render

@traversaro traversaro closed this Mar 7, 2024
@traversaro traversaro reopened this Mar 7, 2024
@traversaro
Contributor

@conda-forge-admin , please re-render

@jakirkham
Member

Would add cuda-version to requirements/host like so

requirements:
  ...
  host:
    ...
    - cuda-version {{ cuda_compiler_version }}  # [(cuda_compiler_version or "None") != "None"]

This will ensure that requirements/build and requirements/host use the same CUDA version for compilers, libraries, etc.

@traversaro
Contributor

@conda-forge-admin , please re-render

@traversaro traversaro added the automerge Merge the PR when CI passes label Mar 7, 2024
@traversaro
Contributor

Travis builds for ppc64 are super slow. I think it makes sense to disable CUDA for ppc64 and switch ppc64 back to cross-compilation (where CUDA was failing).

Member

@jakirkham jakirkham left a comment

Thanks Silvio! 🙏

Added a few suggestions to see if we can get cross-compilation to work

recipe/build.sh (outdated, resolved)
conda-forge.yml (outdated, resolved)
Contributor

github-actions bot commented Mar 7, 2024

Hi! This is the friendly conda-forge automerge bot!

I considered the following status checks when analyzing this PR:

  • linter: passed
  • travis: failed
  • azure: passed

Thus the PR was not passing and not merged.

Co-authored-by: jakirkham <jakirkham@gmail.com>
@traversaro
Contributor

@conda-forge-admin , please re-render

Contributor

github-actions bot commented Mar 7, 2024

Hi! This is the friendly conda-forge automerge bot!

Commits were made to this PR after the automerge label was added. For security reasons, I have disabled automerge by removing the automerge label. Please add the automerge label again (or ask a maintainer to do so) if you'd like to enable automerge again!

@github-actions github-actions bot removed the automerge Merge the PR when CI passes label Mar 7, 2024
@traversaro
Contributor

Thanks Silvio! 🙏

Added a few suggestions to see if we can get cross-compilation to work

Thanks! I just applied them, let's see if they work.

@@ -44,6 +44,7 @@ requirements:
- python
- libudev # [linux and cdt_name!='cos6']
- libusb
- cuda-version {{ cuda_compiler_version }} # [(cuda_compiler_version or "None") != "None"]
Member

It looks like CMake can't find cudart. So let's add it as a dependency and see if that helps

We should double check whether it is linking to cudart or not. If not, we may want to add this to ignore_run_exports_from above

Suggested change
- cuda-version {{ cuda_compiler_version }} # [(cuda_compiler_version or "None") != "None"]
- cuda-version {{ cuda_compiler_version }} # [(cuda_compiler_version or "None") != "None"]
- cuda-cudart-dev # [(cuda_compiler_version or "None") != "None"]
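
To double check the linking question raised above, one could inspect the dynamic section of the built library. The following is a minimal sketch, assuming `readelf` from binutils is available in the build environment; the helper name `needs_cudart` and the library path in the usage comment are hypothetical.

```shell
# Hypothetical helper: reads `readelf -d` output on stdin and succeeds only
# if the dynamic section lists libcudart as a NEEDED shared library.
needs_cudart() {
    grep -q 'NEEDED.*libcudart\.so'
}

# Usage (the library path is an assumption about the install layout):
#   readelf -d "${PREFIX}/lib/librealsense2.so" | needs_cudart \
#       && echo "links dynamically to cudart" \
#       || echo "no dynamic cudart dependency found"
```

If no `NEEDED` entry shows up, the CUDA runtime may be linked statically or not used at all, which is the situation where an `ignore_run_exports_from` entry would be worth considering.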

Contributor

Ah, I missed this comment and added it like this dc6ed15 (I added it only for cuda==12 builds).

@traversaro
Contributor

@conda-forge-admin , please re-render

Contributor

github-actions bot commented Mar 7, 2024

Hi! This is the friendly automated conda-forge-webservice.

I tried to rerender for you, but it looks like there was nothing to do.

This message was generated by GitHub actions workflow run https://github.com/conda-forge/librealsense-feedstock/actions/runs/8196232282.

@jakirkham
Member

Ok think I understand what is happening

librealsense still uses some deprecated CMake features that need to be changed. Raised upstream issue ( IntelRealSense/librealsense#12736 ), which includes more info about the changes needed

Think we can disable CUDA 12 on linux_aarch64 and linux_ppc64le until this is fixed upstream. We can also raise an issue on the feedstock to track these upstream changes that we are waiting on

@traversaro
Contributor

Think we can disable CUDA 12 on linux_aarch64 and linux_ppc64le until this is fixed upstream. We can also raise an issue on the feedstock to track these upstream changes that we are waiting on

Can't we use the non-cross-compiled build, at least for aarch64? I do not care much about ppc64le, but we use this library a lot on Jetson boards, and having CUDA 12 builds for other platforms but not for aarch64 may be confusing for users from my organization.

@jakirkham
Member

Yeah wasn't sure what level of effort we were going for. So just proposed the simplest approach

Certainly we could do native builds if that is tenable for you

Another option might be to patch the CMake files to enable cross-compilation here. @Tobias-Fischer did this for dlib. So he may be able to advise
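
When cross-compiling, the CUDA headers and libraries for the target live under a per-architecture `targets/<name>` directory, as the `targets/sbsa-linux/include` paths in the failing log earlier in this thread show for aarch64. A rough sketch of mapping conda's `target_platform` to that directory name follows; the helper name `cuda_target_name` is hypothetical, and the `x86_64-linux` and `ppc64le-linux` names are assumptions based on the usual CUDA layout rather than anything shown in this PR.

```shell
# Hypothetical helper: maps conda's target_platform to the CUDA
# "targets/<name>" directory name. Returns non-zero for unknown platforms.
cuda_target_name() {
    case "$1" in
        linux-64)      echo "x86_64-linux" ;;   # assumption: usual CUDA layout
        linux-aarch64) echo "sbsa-linux" ;;     # matches the include path in the log
        linux-ppc64le) echo "ppc64le-linux" ;;  # assumption: usual CUDA layout
        *)             return 1 ;;
    esac
}

# Usage in a build script (PREFIX and target_platform are set by conda-build):
#   CUDA_TARGET_DIR="${PREFIX}/targets/$(cuda_target_name "${target_platform}")"
```

Which CMake variable should receive that path depends on how upstream's CMake files locate CUDA, so that part is deliberately left open here.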

@traversaro
Contributor

@conda-forge-admin , please re-render

@traversaro traversaro added the automerge Merge the PR when CI passes label Mar 8, 2024
@traversaro traversaro merged commit e32d8bf into conda-forge:main Mar 8, 2024
47 of 48 checks passed
@regro-cf-autotick-bot regro-cf-autotick-bot deleted the rebuild-cuda120-0-3_h7be56b branch March 8, 2024 23:25