Build onnxruntime on arm64 linux with CUDA EP #16263
I would suggest cloning this repository and then checking out the release suitable for you. Take care to install CUDA and cuDNN properly. If you install the NVIDIA stack the right way, compiling should be painless, even though your CPU is probably slow. As a first step, I'd suggest building CPU-only.
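For reference, a CPU-only first build along those lines might look like this sketch (the release tag shown is the one discussed in this issue; adjust it to your needs):

```bash
# Clone with submodules, pick a release tag, and build CPU-only first.
git clone --recursive https://github.com/microsoft/onnxruntime.git
cd onnxruntime
git checkout v1.12.1          # the release discussed in this issue
./build.sh --config Release --build_shared_lib --parallel
```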
We don't have such prebuilt packages. CUDA on Linux ARM64 has two variants: SBSA (server-class ARM machines with discrete NVIDIA GPUs) and Jetson (Tegra embedded devices).
Though we may add support for SBSA, we cannot test it: all our build servers are in Azure, and Azure doesn't have such SKUs. So I would suggest building it from source, and let us know if there is any build error.
Thank you for your reply. I am trying to compile onnxruntime-gpu on the arm64 Linux platform with the following command:
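(The exact command used is not preserved in this thread; a typical CUDA-enabled build on ARM64 looks roughly like the sketch below, where the CUDA and cuDNN paths are assumptions for a default install:)

```bash
# Sketch of a CUDA-enabled onnxruntime build; adjust paths to your setup.
./build.sh --config Release --build_shared_lib --parallel \
  --use_cuda \
  --cuda_home /usr/local/cuda \
  --cudnn_home /usr/lib/aarch64-linux-gnu
```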
No error is reported during compilation. However, when I run the test programs afterwards, three of the six tests fail. Here is the test log: TestLog.txt. I then tried to use the compiled library to run MaskRCNN model inference; it executes correctly on CPU, but returns the same error as in the test log when calling CUDA:
Moreover, when running inference on the GPU, it takes far longer to load the model than on the CPU, which does not seem normal. Do you know what the problem is? The CUDA environment is as follows:
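(The pasted environment details are omitted here; versions of this kind are usually gathered with commands like the following:)

```bash
# Inspect the installed CUDA toolkit and cuDNN packages.
nvcc --version
cat /usr/local/cuda/version.json     # present on CUDA 11.x installs
dpkg -l | grep -i cudnn              # Debian/Ubuntu package listing
```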
Which GPU do you have on the device?
The GPU has 1792 NVIDIA CUDA cores and 56 Tensor Cores (Ampere architecture). The CUDA samples run correctly, as follows:
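(The sample output is omitted; the usual check is the deviceQuery sample, whose location varies by CUDA version — the path below assumes a toolkit-bundled samples directory:)

```bash
# Build and run deviceQuery; it should print the GPU name,
# the compute capability, and "Result = PASS".
cd /usr/local/cuda/samples/1_Utilities/deviceQuery
sudo make
./deviceQuery
```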
I rebuilt onnxruntime_providers_cuda; although no error is reported, many warnings are thrown, such as:
Those are warnings, not errors. The latest code doesn't have gsl-lite.hpp anymore.
Since your GPU has compute capability 8.7, try adding CMAKE_CUDA_ARCHITECTURES=87 to your build.
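With build.sh, that define can be passed through --cmake_extra_defines, for example (a sketch reusing the assumed paths from above):

```bash
# Compile the CUDA kernels for SM 8.7 (Jetson AGX Orin class GPUs).
./build.sh --config Release --build_shared_lib --parallel \
  --use_cuda --cuda_home /usr/local/cuda \
  --cudnn_home /usr/lib/aarch64-linux-gnu \
  --cmake_extra_defines CMAKE_CUDA_ARCHITECTURES=87
```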
I followed your comment and added CMAKE_CUDA_ARCHITECTURES=87 to my CMake configuration, but it still reported the same error when executing inference:
I tried to execute with TensorRT, so I compiled onnxruntime with the TensorRT EP:
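(The exact command is not preserved; a TensorRT-enabled build typically adds the flags below, with the TensorRT location given here as an assumption for a Jetson-style install:)

```bash
# Sketch of a build with the TensorRT EP enabled on top of CUDA.
./build.sh --config Release --build_shared_lib --parallel \
  --use_cuda --cuda_home /usr/local/cuda \
  --cudnn_home /usr/lib/aarch64-linux-gnu \
  --use_tensorrt --tensorrt_home /usr/lib/aarch64-linux-gnu
```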
I got the following compilation error:
May I know what device you have? Is it an ARM server with NVIDIA GPUs, or a Jetson?
It is a Jetson: the NVIDIA Jetson AGX Orin. I found that the NVIDIA cuDNN archive does not provide cuDNN for Jetson separately; it has to be installed through JetPack, and there is no JetPack version corresponding to CUDA 11.4 plus cuDNN 8.2.4 (most versions of onnxruntime are compatible with CUDA 11.4 and cuDNN 8.2.4, as shown in the CUDA Execution Provider requirements). Is there any other solution for this environment problem? JetPack archive: https://developer.nvidia.com/embedded/jetpack-archive Currently, the CUDA environment is as follows:
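(The environment listing is omitted; on a Jetson, the installed L4T release and JetPack components can be checked like this:)

```bash
# Show the L4T release string and the JetPack meta-package version.
cat /etc/nv_tegra_release
apt-cache show nvidia-jetpack
```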
Sorry, I don't have experience with Jetson. I searched around and found someone with a similar issue: https://forums.developer.nvidia.com/t/issue-using-onnxruntime-with-cudaexecutionprovider-on-orin/219457/5 Would you try it?
I tried the link you mentioned, and it worked on the NVIDIA Jetson AGX Orin. Thanks again for your help.
Would you mind elaborating on what you changed? It seems we have SM 8.7 in our cmake file: https://github.com/microsoft/onnxruntime/blob/main/cmake/CMakeLists.txt#L1312 But why did it not work, and how did you make it work?
The problem I encountered was caused by an unsuitable software setup on the NVIDIA developer kit (NVIDIA Jetson AGX Orin). The best approach is to install the appropriate version of JetPack rather than installing CUDA, cuDNN, etc. separately, because cuDNN for Jetson is not provided on the archive page. After installing JetPack, just recompile onnxruntime (make sure SM 8.7 is in the cmake file; in the earlier version 1.12 it has to be added manually).
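As a quick sanity check after the rebuild, the perf-test tool from the build output can be pointed at the CUDA EP (the model path below is a placeholder):

```bash
# Run 10 inference repetitions on the CUDA EP; replace the model path.
./onnxruntime_perf_test -e cuda -r 10 /path/to/model.onnx
```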
Thanks for the detailed explanation.
Describe the issue
I am trying to perform model inference on the arm64 Linux platform; however, I can't find a prebuilt package suitable for running on GPU (v1.12.1). Is there any other solution, or what do I need to pay attention to if I want to compile the GPU version of onnxruntime to run on arm64 Linux?
To reproduce
This is a question about model inference with GPU on the arm64 Linux platform; I would really appreciate it if you could answer it.
Urgency
No response
Platform
Linux
OS Version
20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.12.1
ONNX Runtime API
C++
Architecture
ARM64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 11.4