
Enable CUDA (take two) #63

Merged: 19 commits merged on Jul 31, 2023

Conversation

traversaro
Contributor

@traversaro traversaro commented Jun 24, 2023

Updated version of #7.
Fixes #7.

Checklist

  • Used a personal fork of the feedstock to propose changes
  • Bumped the build number (if the version is unchanged)
  • Reset the build number to 0 (if the version changed)
  • Re-rendered with the latest conda-smithy (Use the phrase @conda-forge-admin, please rerender in a comment in this PR for automated rerendering)
  • Ensured the license file is being packaged.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro

This comment was marked as outdated.

1 similar comment
@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

The CUDA 11.0 and 11.1 builds on Linux are failing with:

[69/1593] Building CUDA object CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o
FAILED: CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o 
$BUILD_PREFIX/bin/nvcc -forward-unknown-to-host-compiler -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_CUDA=1 -D_GNU_SOURCE -I$SRC_DIR/include/onnxruntime -I$SRC_DIR/include/onnxruntime/core/session -I$SRC_DIR/build-ci/Release/_deps/pytorch_cpuinfo-src/include -I$SRC_DIR/build-ci/Release/_deps/google_nsync-src/public -I$SRC_DIR/build-ci/Release -I$SRC_DIR/onnxruntime -I$SRC_DIR/build-ci/Release/_deps/abseil_cpp-src -gencode=arch=compute_37,code=sm_37 -gencode=arch=compute_50,code=sm_50 -gencode=arch=compute_52,code=sm_52 -gencode=arch=compute_60,code=sm_60 -gencode=arch=compute_70,code=sm_70 -gencode=arch=compute_75,code=sm_75 -gencode=arch=compute_80,code=sm_80 --expt-relaxed-constexpr --Werror default-stream-launch -Xcudafe "--diag_suppress=bad_friend_decl" -Xcudafe "--diag_suppress=unsigned_compare_with_zero" -Xcudafe "--diag_suppress=expr_has_no_effect" -O3 -DNDEBUG -std=c++17 -Xcompiler=-fPIC --diag-suppress 554 --compiler-options -Wall --compiler-options -Wno-deprecated-copy --compiler-options -Wno-nonnull-compare -MD -MT CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o -MF CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o.d -x cu -c $SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu -o CMakeFiles/onnxruntime_test_cuda_ops_lib.dir$SRC_DIR/onnxruntime/test/shared_lib/cuda_ops.cu.o
nvcc warning : The 'compute_35', 'compute_37', 'compute_50', 'sm_35', 'sm_37' and 'sm_50' architectures are deprecated, and may be removed in a future release (Use -Wno-deprecated-gpu-targets to suppress warning).
nvcc fatal   : A single input file is required for a non-link phase when an outputfile is specified
[70/1593] Building CXX object CMakeFiles/onnxruntime_providers_shared.dir$SRC_DIR/onnxruntime/core/providers/shared/common.cc.o
[71/1593] Building CXX object CMakeFiles/onnxruntime_mlas.dir$SRC_DIR/onnxruntime/core/mlas/lib/intrinsics/avx512/quantize_avx512f.cpp.o
ninja: build stopped: subcommand failed.

Based on https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#build and microsoft/onnxruntime#14644, I guess these versions of CUDA are simply not supported, so we can just drop them, given that this is a new package.

@traversaro

This comment was marked as outdated.

@traversaro

This comment was marked as outdated.

@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

Build is now failing with:

2023-06-27T08:18:12.8964617Z [775/1593] Building CXX object CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o
2023-06-27T08:18:12.8971202Z FAILED: CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o 
2023-06-27T08:18:12.8981167Z $BUILD_PREFIX/bin/x86_64-conda-linux-gnu-c++ -DCPUINFO_SUPPORTED_PLATFORM=1 -DEIGEN_MPL2_ONLY -DEIGEN_USE_THREADS -DNSYNC_ATOMIC_CPP11 -DONNX_ML=1 -DONNX_NAMESPACE=onnx -DONNX_USE_LITE_PROTO=1 -DORT_ENABLE_STREAM -DPLATFORM_POSIX -DUSE_CUDA=1 -D_GNU_SOURCE -D__ONNX_NO_DOC_STRINGS -Donnxruntime_providers_cuda_EXPORTS -I$SRC_DIR/include/onnxruntime -I$SRC_DIR/include/onnxruntime/core/session -I$SRC_DIR/build-ci/Release/_deps/pytorch_cpuinfo-src/include -I$SRC_DIR/build-ci/Release/_deps/google_nsync-src/public -I$SRC_DIR/build-ci/Release -I$SRC_DIR/onnxruntime -I$SRC_DIR/build-ci/Release/_deps/abseil_cpp-src -I$SRC_DIR/build-ci/Release/_deps/safeint-src -I$SRC_DIR/build-ci/Release/_deps/gsl-src/include -I$SRC_DIR/build-ci/Release/_deps/onnx-src -I$SRC_DIR/build-ci/Release/_deps/onnx-build -I$SRC_DIR/build-ci/Release/_deps/protobuf-src/src -I$SRC_DIR/build-ci/Release/_deps/flatbuffers-src/include -I$SRC_DIR/build-ci/Release/_deps/eigen-src -I$SRC_DIR/build-ci/Release/_deps/mp11-src/include -fvisibility-inlines-hidden -std=c++17 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -isystem $PREFIX/include -fdebug-prefix-map=$SRC_DIR=/usr/local/src/conda/onnxruntime-1.15.1 -fdebug-prefix-map=$PREFIX=/usr/local/src/conda-prefix -isystem /usr/local/cuda/include -ffunction-sections -fdata-sections -DCPUINFO_SUPPORTED -O3 -DNDEBUG -flto=auto -fno-fat-lto-objects -fPIC -Wall -Wextra -Wno-deprecated-copy -Wno-nonnull-compare -Wno-reorder -Wno-error=sign-compare -MD -MT CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o -MF CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o.d -o CMakeFiles/onnxruntime_providers_cuda.dir$SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc.o -c $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc
2023-06-27T08:18:12.8988968Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In member function 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const':
2023-06-27T08:18:12.8995186Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:3: error: there are no arguments to 'ORT_UNUSED_VARIABLE' that depend on a template parameter, so a declaration of 'ORT_UNUSED_VARIABLE' must be available [-fpermissive]
2023-06-27T08:18:12.9000825Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9011123Z       |   ^~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9011929Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:3: note: (if you use '-fpermissive', G++ will accept your code, but allowing the use of an undeclared name is deprecated)
2023-06-27T08:18:12.9012837Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In instantiation of 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const [with T = onnxruntime::MLFloat16]':
2023-06-27T08:18:12.9013415Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.h:21:10:   required from here
2023-06-27T08:18:12.9014039Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:22: error: 'ORT_UNUSED_VARIABLE' was not declared in this scope; did you mean 'HAS_UNUSED_VARIABLE'?
2023-06-27T08:18:12.9014766Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9015056Z       |   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9015289Z       |   HAS_UNUSED_VARIABLE
2023-06-27T08:18:12.9015946Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc: In instantiation of 'onnxruntime::common::Status onnxruntime::contrib::cuda::Attention<T>::ComputeInternal(onnxruntime::OpKernelContext*) const [with T = float]':
2023-06-27T08:18:12.9016440Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.h:21:10:   required from here
2023-06-27T08:18:12.9017000Z $SRC_DIR/onnxruntime/contrib_ops/cuda/bert/attention.cc:167:22: error: 'ORT_UNUSED_VARIABLE' was not declared in this scope; did you mean 'HAS_UNUSED_VARIABLE'?
2023-06-27T08:18:12.9017380Z   167 |   ORT_UNUSED_VARIABLE(is_mask_1d_key_seq_len_start);
2023-06-27T08:18:12.9017662Z       |   ~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2023-06-27T08:18:12.9017904Z       |   HAS_UNUSED_VARIABLE

See microsoft/onnxruntime#16000 (comment); it should be easy to patch.
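
For reference, a minimal sketch of how the feedstock could carry such a fix until it lands in an upstream release. The patch file name and location below are hypothetical; the actual change would need to be taken from the microsoft/onnxruntime#16000 discussion (for example, making ORT_UNUSED_VARIABLE available or replacing the call with a plain (void) cast of the variable).

# Sketch only: stage a local patch in the recipe so conda-build applies it to the source.
# "fix-ort-unused-variable.patch" is a placeholder name, not the actual upstream change.
mkdir -p recipe/patches
cp fix-ort-unused-variable.patch recipe/patches/
# Then list it (as patches/fix-ort-unused-variable.patch) under the "patches:" entry of
# the "source:" section in recipe/meta.yaml, bump the build number, and rerender.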

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I was trying to look for recipes to lint for you, but it appears we have a merge conflict.
Please try to merge or rebase with the base branch to resolve this conflict.

Please ping the 'conda-forge/core' team (using the @ notation in a comment) if you believe this is a bug.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro
Contributor Author

The build is successful. Tests are failing, as expected, since there is no GPU on the test machines.

@traversaro

This comment was marked as outdated.

conda-forge-webservices[bot] and others added 3 commits June 27, 2023 11:52
@traversaro

This comment was marked as outdated.

@traversaro
Contributor Author

@conda-forge-admin, please rerender

@traversaro

This comment was marked as outdated.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I wanted to let you know that I linted all conda-recipes in your PR (recipe) and found some lint.

Here's what I've got...

For recipe:

  • The outputs section contained an unexpected subsection name. string is not a valid subsection name.

@conda-forge-webservices
Contributor

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

@traversaro traversaro closed this Jun 29, 2023
@traversaro traversaro reopened this Jun 29, 2023
@traversaro
Contributor Author

traversaro commented Jun 30, 2023

Ok, I tested the GPU builds locally, Python via onnxruntime_test and C++ via https://github.com/ami-iit/onnx-cpp-benchmark, and they are working fine.

Once the PR has been merged, it should be possible to install non-CUDA builds via mamba install onnxruntime=*=*cpu and CUDA builds with mamba install onnxruntime=*=*cuda.
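
As a concrete sketch of those selectors (assuming the final build strings end in cpu and cuda, as described above):

# Pick the variant by matching the third field (build string) of the package spec.
# Quoting prevents the shell from expanding the asterisks.
mamba install "onnxruntime=*=*cpu"    # CPU-only build
mamba install "onnxruntime=*=*cuda"   # CUDA-enabled build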

The PR is now ready for review, @conda-forge/onnxruntime. These are the points on which feedback would be useful:

  • Even by only building with CUDA 11.2, the PR adds 16 jobs to the already existing 42. Is that too much? Are there combinations (for example cuda+novec) that we could skip?
  • I am not an expert on build-string handling, so feel free to provide feedback on that part (I investigated what to do in librealsense-feedstock#19, "Have build with BUILD_WITH_CUDA enabled" (comment)).
  • The onnxruntime-gpu package on PyPI also includes the TensorRTExecutionProvider; however, at the moment I guess we can't build that on conda-forge. We can always add it in a future PR.

@traversaro
Contributor Author

Hello @conda-forge/onnxruntime, do you have any input for this PR? Thanks in advance!

recipe/build.sh: review comments (outdated, resolved)
recipe/meta.yaml: review comments (outdated, resolved)
traversaro and others added 2 commits July 17, 2023 16:34
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Co-authored-by: Keith Kraus <keith.j.kraus@gmail.com>
Contributor

@jtilly jtilly left a comment


Awesome, thank you, @traversaro!

@jtilly jtilly merged commit c4512e0 into conda-forge:main Jul 31, 2023
57 checks passed
@traversaro
Contributor Author

Thanks @jtilly !

@CCRcmcpe

CCRcmcpe commented Aug 7, 2023

mamba install onnxruntime=*=*cuda seems not so obvious IMO. Is it possible to make something like mamba install onnxruntime-cuda, like faiss-gpu does?

@traversaro
Contributor Author

mamba install onnxruntime=*=*cuda seems not so obvious IMO. Is it possible to make something like mamba install onnxruntime-cuda, like faiss-gpu does?

Thanks for the suggestion, @CCRcmcpe! Can you open a new issue to discuss this?
