
Enable CUDA (take two) #63

Merged: 19 commits merged on Jul 31, 2023

Conversation

traversaro
Contributor

@traversaro traversaro commented Jun 24, 2023

Updated version of #7.
Fixes #7.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro

This comment was marked as outdated.

1 similar comment
@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

The CUDA 11.0 and 11.1 builds on Linux are failing with:

[69/1593] Building CUDA object CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o
FAILED: CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o 
$BUILD_PREFIX/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_CUDA=1 -D_GNU_SOURCE -I$SRC_DIR/include/onnxruntime -I$SRC_DIR/include/onnxruntime/core/session -I$SRC_DIR/build-ci/Release/_deps/pytorch_cpuinfo-src/include -I$SRC_DIR/build-ci/Release/_deps/google_nsync-src/public -I$SRC_DIR/build-ci/Release -I$SRC_DIR/onnxruntime -I$SRC_DIR/build-ci/Release/_deps/abseil_cpp-src -gencode=arch=compute_37,code=sm_37 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -MD -MT CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o -MF CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o.d -x cu -c $SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu -o CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
[70/1593] Building CXX object CMakeFiles/onnxruntime_providers_shared.dir$SRC_DIR/onnxruntime/core/providers/shared/common.cc.o
[71/1593] Building CXX object CMakeFiles/onnxruntime_mlas.dir$SRC_DIR/onnxruntime/core/mlas/lib/intrinsics/avx512/quantize_avx512f.cpp.o
ninja: build stopped: subcommand failed.

Based on https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#build and microsoft/onnxruntime#14644, I guess these versions of CUDA are simply not supported, so we can just drop them, given that this is a new package.

@traversaro

This comment was marked as outdated.

@traversaro

This comment was marked as outdated.

@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

Build is now failing with:

2023-06-27T08:18:12.8964617Z [775/1593] Building CXX object CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o
2023-06-27T08:18:12.8971202Z FAILED: CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o 
2023-06-27T08:18:12.8981167Z $BUILD_PREFIX/bin/x86_64-conda-linux-gnu-c++ -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DONNX_USE_LITE_PROTO=1 -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_CUDA=1 -D_GNU_SOURCE -D__ONNX_NO_DOC_STRINGS -Donnxruntime_providers_cuda_EXPORTS -I$SRC_DIR/include/onnxruntime -I$SRC_DIR/include/onnxruntime/core/session -I$SRC_DIR/build-ci/Release/_deps/pytorch_cpuinfo-src/include -I$SRC_DIR/build-ci/Release/_deps/google_nsync-src/public -I$SRC_DIR/build-ci/Release -I$SRC_DIR/onnxruntime -I$SRC_DIR/build-ci/Release/_deps/abseil_cpp-src -I$SRC_DIR/build-ci/Release/_deps/safeint-src -I$SRC_DIR/build-ci/Release/_deps/gsl-src/include -I$SRC_DIR/build-ci/Release/_deps/onnx-src -I$SRC_DIR/build-ci/Release/_deps/onnx-build -I$SRC_DIR/build-ci/Release/_deps/protobuf-src/src -I$SRC_DIR/build-ci/Release/_deps/flatbuffers-src/include -I$SRC_DIR/build-ci/Release/_deps/eigen-src -I$SRC_DIR/build-ci/Release/_deps/mp11-src/include -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/onnxruntime-1.15.1 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -isystem /usr/local/cuda/include -ffunction-sections -fdata-sections -DCPUINFO_SUPPORTED -O3 -DNDEBUG -flto=auto -fno-fat-lto-objects -fPIC -Wall -Wextra -Wno-deprecated-copy -Wno-nonnull-compare -Wno-reorder -Wno-error=sign-compare -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o -MF CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o.d -o CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o -c $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc
2023-06-27T08:18:12.8988968Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In member function 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const':
2023-06-27T08:18:12.8995186Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:3: error: there are no arguments to 'ORT_UNUSED_VARIABLE' that depend on a template parameter, so a declaration of 'ORT_UNUSED_VARIABLE' must be available [-fpermissive]
2023-06-27T08:18:12.9000825Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9011123Z       |   ^~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9011929Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:3: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
2023-06-27T08:18:12.9012837Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In instantiation of 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const [with T = onnxruntime::MLFloat16]':
2023-06-27T08:18:12.9013415Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.h:21:10:   required from here
2023-06-27T08:18:12.9014039Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:22: error: 'ORT_UNUSED_VARIABLE' was not declared in this scope; did you mean 'HAS_UNUSED_VARIABLE'?
2023-06-27T08:18:12.9014766Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9015056Z       |   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9015289Z       |   HAS_UNUSED_VARIABLE
2023-06-27T08:18:12.9015946Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In instantiation of 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const [with T = float]':
2023-06-27T08:18:12.9016440Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.h:21:10:   required from here
2023-06-27T08:18:12.9017000Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:22: error: 'ORT_UNUSED_VARIABLE' was not declared in this scope; did you mean 'HAS_UNUSED_VARIABLE'?
2023-06-27T08:18:12.9017380Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9017662Z       |   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9017904Z       |   HAS_UNUSED_VARIABLE

See microsoft/onnxruntime#16000 (comment); it should be easy to patch.
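
For reference, a minimal sketch of how the feedstock could carry such a fix until it lands in an upstream release. The patch file name and location below are hypothetical; the actual change would need to be taken from the microsoft/onnxruntime#16000 discussion (for example, making ORT_UNUSED_VARIABLE available or replacing the call with a plain (void) cast of the variable).

# Sketch only: stage a local patch in the recipe so conda-build applies it to the source.
# "fix-ort-unused-variable.patch" is a placeholder name, not the actual upstream change.
mkdir -p recipe/patches
cp fix-ort-unused-variable.patch recipe/patches/
# Then list it (as patches/fix-ort-unused-variable.patch) under the "patches:" entry of
# the "source:" section in recipe/meta.yaml, bump the build number, and rerender.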

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro
Contributor Author

The build is successful. Tests are failing, as expected, since there is no GPU on the test machines.

@traversaro

This comment was marked as outdated.

conda-forge-webservices[bot] and others added 3 commits June 27, 2023 11:52
@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

@conda-forge-admin, please rerender

@traversaro

This comment was marked as outdated.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • The outputs section contained an unexpected subsection name. string is not a valid subsection name.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro traversaro closed this Jun 29, 2023
@traversaro traversaro reopened this Jun 29, 2023
@traversaro
Contributor Author

traversaro commented Jun 30, 2023

Ok, I tested the GPU builds locally, Python via onnxruntime_test and C++ via https://github.com/ami-iit/onnx-cpp-benchmark, and they are working fine.

Once the PR has been merged, it should be possible to install non-CUDA builds via mamba install onnxruntime=*=*cpu and CUDA builds with mamba install onnxruntime=*=*cuda.
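
As a concrete sketch of those selectors (assuming the final build strings end in cpu and cuda, as described above):

# Pick the variant by matching the third field (build string) of the package spec.
# Quoting prevents the shell from expanding the asterisks.
mamba install "onnxruntime=*=*cpu"    # CPU-only build
mamba install "onnxruntime=*=*cuda"   # CUDA-enabled build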

The PR is now ready for review, @conda-forge/onnxruntime. These are the points on which feedback would be useful:

  • Even by only building with CUDA 11.2, the PR adds 16 jobs to the already existing 42. Is that too much? Are there combinations (for example cuda+novec) that we could skip?
  • I am not an expert on build-string handling, so feel free to provide feedback on that part (I investigated what to do in librealsense-feedstock#19, "Have build with BUILD_WITH_CUDA enabled" (comment)).
  • The onnxruntime-gpu package on PyPI also includes the TensorRTExecutionProvider; however, at the moment I guess we can't build that on conda-forge. We can always add it in a future PR.

@traversaro
Contributor Author

Hello @conda-forge/onnxruntime, do you have any input for this PR? Thanks in advance!

recipe/build.sh: review comments (outdated, resolved)
recipe/meta.yaml: review comments (outdated, resolved)
traversaro and others added 2 commits July 17, 2023 16:34
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Contributor

@jtilly jtilly left a comment


Awesome, thank you, @traversaro!

@jtilly jtilly merged commit c4512e0 into conda-forge:main Jul 31, 2023
57 checks passed
@traversaro
Contributor Author

Thanks @jtilly !

@CCRcmcpe

CCRcmcpe commented Aug 7, 2023

mamba install onnxruntime=*=*cuda seems not so obvious IMO. Is it possible to make something like mamba install onnxruntime-cuda, like faiss-gpu does?

@traversaro
Contributor Author

mamba install onnxruntime=*=*cuda seems not so obvious IMO. Is it possible to make something like mamba install onnxruntime-cuda, like faiss-gpu does?

Thanks for the suggestion, @CCRcmcpe! Can you open a new issue to discuss this?
