
Build fails against rocm 6.2/pytorch 2.5 #1162

Open
IMbackK opened this issue Nov 25, 2024 · 3 comments · May be fixed by #1164

Comments


IMbackK commented Nov 25, 2024

🐛 Bug

Build fails at every object with:

clang++: error: unknown argument: '--use_fast_math'
clang++: error: unknown argument: '--extended-lambda'
clang++: error: unknown argument: '--generate-line-info'
clang++: error: unknown argument '--threads'; did you mean '-mthreads'?
clang++: error: unknown argument: '--ptxas-options=-v'
clang++: error: unknown argument: '--ptxas-options=-O2'
clang++: error: unknown argument: '--ptxas-options=-allow-expensive-optimizations=true'

Indeed, /opt/rocm/lib/llvm/bin/clang++ (and clang in general) does not support these options. Looking at setup.py, it is unclear to me how this could ever have compiled against LLVM.

To Reproduce

PYTORCH_ROCM_ARCH=gfx908 python setup.py bdist_wheel

or install

Environment

Collecting environment information...
PyTorch version: 2.5.1
Is debug build: False
CUDA used to build PyTorch: N/A
ROCM used to build PyTorch: 6.2.41134-0

OS: Arch Linux (x86_64)
GCC version: (GCC) 14.2.1 20240910
Clang version: 18.1.8
CMake version: version 3.31.0
Libc version: glibc-2.40

Python version: 3.12.7 (main, Oct 1 2024, 11:15:50) [GCC 14.2.1 20240910] (64-bit runtime)
Python platform: Linux-6.10.6-arch1-1-x86_64-with-glibc2.40
Is CUDA available: True
CUDA runtime version: 12.6.77
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: AMD Radeon RX 6800 XT (gfx1030)
Nvidia driver version: Could not collect
cuDNN version: Could not collect
HIP runtime version: 6.2.41134
MIOpen runtime version: 3.2.0
Is XNNPACK available: True

lw (Contributor) commented Nov 26, 2024

Sorry, we're unable to provide support for ROCm as we don't use such devices ourselves. Perhaps someone from AMD will be able to weigh in.

My only comment is that the options that are "unsupported" are those that, with an NVIDIA setup, one would pass to nvcc. I don't know what the equivalent compiler is in an AMD setup. If you can find out, could you check if it's installed, and why it's not being called?

IMbackK (Author) commented Nov 26, 2024

So I have figured this out. The problem is that in my PyTorch install:
torch.cuda.is_available() is True,
torch.utils.cpp_extension.ROCM_HOME is '/opt/rocm', and
torch.utils.cpp_extension.CUDA_HOME is '/opt/cuda'.

Thus we go down the CUDA path here:

(torch.cuda.is_available() and ((CUDA_HOME is not None)))

According to the PyTorch documentation, the correct way to determine whether PyTorch was compiled against CUDA or ROCm is to check torch.version.hip / torch.version.cuda for None, and indeed this yields the correct values here.
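To make the distinction concrete, here is a minimal sketch of backend detection following that recommendation. The decision is factored into a plain function taking the two `torch.version` fields, so the logic itself is testable without a GPU install; the function name `detect_backend` is my own, not part of any API.

```python
def detect_backend(hip_version, cuda_version):
    """Classify a PyTorch build from torch.version.hip / torch.version.cuda.

    A ROCm build sets torch.version.hip to a version string and leaves
    torch.version.cuda as None (and vice versa for a CUDA build); a
    CPU-only build leaves both as None. Crucially, this does NOT consult
    torch.utils.cpp_extension.CUDA_HOME, which (per the discussion above)
    only reflects whether CUDA happened to be installed at build time.
    """
    if hip_version is not None:
        return "rocm"
    if cuda_version is not None:
        return "cuda"
    return "cpu"

# Usage against a real install (not executed here):
#   import torch
#   backend = detect_backend(torch.version.hip, torch.version.cuda)
```

With the environment reported in this issue (`torch.version.hip == '6.2.41134'`), this would return `"rocm"` even though `CUDA_HOME` points at `/opt/cuda`.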

IMbackK (Author) commented Nov 26, 2024

It seems torch.utils.cpp_extension.CUDA_HOME is set whenever torch is compiled while CUDA is installed, and carries no information about which backend PyTorch was compiled against.
