ninja: build stopped: subcommand failed. #640

Open
Liuwuyang1026 opened this issue Apr 2, 2024 · 8 comments

@Liuwuyang1026

Describe the bug
RuntimeError: Error building extension 'bias_act_plugin': [1/2] D:\NVIDA CUDA\NVIDIA GPU Computing Toolkit\CUDA\v12.1\bin\nvcc --generate-dependencies-with-compile --dependency-output bias_act.cuda.o.d -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4068 -Xcompiler /wd4067 -Xcompiler /wd4624 -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=bias_act_plugin -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\include -IC:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\include\TH -IC:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\include\THC "-ID:\NVIDA CUDA\NVIDIA GPU Computing Toolkit\CUDA\v12.1\include" -IC:\Users\29125\anaconda3\envs\stylegan\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_BFLOAT16_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_86,code=compute_86 -gencode=arch=compute_86,code=sm_86 -std=c++17 --use_fast_math -c C:\Users\29125\AppData\Local\torch_extensions\torch_extensions\Cache\py39_cu121\bias_act_plugin\3cb576a0039689487cfba59279dd6d46-nvidia-geforce-rtx-3060-laptop-gpu\bias_act.cu -o bias_act.cuda.o
bias_act.cu
tmpxft_00007c80_00000000-10_bias_act.cudafe1.cpp
[2/2] "C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin/link.exe" bias_act.o bias_act.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@yahxz torch.lib /LIBPATH:C:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\29125\anaconda3\envs\stylegan\libs "/LIBPATH:D:\NVIDA CUDA\NVIDIA GPU Computing Toolkit\CUDA\v12.1\lib\x64" cudart.lib /out:bias_act_plugin.pyd
FAILED: bias_act_plugin.pyd
"C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin/link.exe" bias_act.o bias_act.cuda.o /nologo /DLL c10.lib c10_cuda.lib torch_cpu.lib torch_cuda.lib -INCLUDE:?warp_size@cuda@at@@yahxz torch.lib /LIBPATH:C:\Users\29125\anaconda3\envs\stylegan\lib\site-packages\torch\lib torch_python.lib /LIBPATH:C:\Users\29125\anaconda3\envs\stylegan\libs "/LIBPATH:D:\NVIDA CUDA\NVIDIA GPU Computing Toolkit\CUDA\v12.1\lib\x64" cudart.lib /out:bias_act_plugin.pyd
Creating library bias_act_plugin.lib and object bias_act_plugin.exp
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __enclave_config
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __guard_eh_cont_table
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __guard_eh_cont_count
MSVCRT.lib(loadcfg.obj) : error LNK2001: unresolved external symbol __volatile_metadata
bias_act_plugin.pyd : fatal error LNK1120: 4 unresolved externals
ninja: build stopped: subcommand failed.

  • PyTorch version: 2.2.2
  • CUDA toolkit version: 12.1
  • NVIDIA driver version: not provided
  • GPU: RTX 3060
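
For readers triaging a log like the one above, a minimal Python sketch that pulls out the linker path and the unresolved symbols may help. The function name `summarize_link_failure` is hypothetical and is not part of PyTorch or ninja. One possible reading, which is an assumption and not confirmed anywhere in this thread: the link step runs link.exe from Visual Studio 14.0, while all four unresolved symbols are referenced from MSVCRT.lib(loadcfg.obj), which would be consistent with an older linker being paired with a newer MSVC runtime library.

```python
import re

# Matches MSVC linker diagnostics such as
#   MSVCRT.lib(loadcfg.obj) : error LNK2001: <message> __enclave_config
# The message text is localized, so only the error code and the last
# whitespace-separated token (the symbol name) are relied upon.
_LNK2001 = re.compile(
    r"^(?P<origin>\S+)\s*:\s*error LNK2001:.*\s(?P<symbol>\S+)\s*$"
)
# Matches a quoted path that ends in link.exe.
_LINKER = re.compile(r'"(?P<path>[^"]*link\.exe)"', re.IGNORECASE)


def summarize_link_failure(log):
    """Collect linker paths and unresolved symbols from a ninja/MSVC log.

    Hypothetical helper for illustration only.
    """
    linkers = []
    unresolved = []
    for line in log.splitlines():
        found = _LINKER.search(line)
        if found and found.group("path") not in linkers:
            linkers.append(found.group("path"))
        error = _LNK2001.match(line.strip())
        if error:
            unresolved.append((error.group("origin"), error.group("symbol")))
    return {"linkers": linkers, "unresolved": unresolved}
```

Passing the text of the failing build to `summarize_link_failure` should report which link.exe was invoked and which object file each unresolved symbol came from, which makes it easier to compare the linker's Visual Studio version against the installed MSVC libraries.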
@Liuwuyang1026
Author

I really need help with this. Please!

@fak111

fak111 commented Apr 25, 2024

me too

@fak111

fak111 commented Apr 25, 2024

I ran into a similar problem on Ubuntu 22 with Anaconda:

Setting up PyTorch plugin "bias_act_plugin"... Failed!
:
FAILED: bias_act.cuda.o
/usr/bin/nvcc -DTORCH_EXTENSION_NAME=bias_act_plugin -...

In my case, removing the apt-installed nvcc (the /usr/bin/nvcc shown above) solved the problem: sudo apt remove nvidia-cuda-toolkit
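
Before removing a package, it can help to see which nvcc executables are visible and in what order. The following is a minimal Python sketch under that assumption. The helper name `find_all_on_path` is hypothetical; it relies only on the standard library's `shutil.which`.

```python
import os
import shutil
from typing import List, Optional


def find_all_on_path(name: str, path: Optional[str] = None) -> List[str]:
    """Return every executable called `name` on PATH, in lookup order.

    Hypothetical helper: the first entry is the one a build would pick up
    when it calls `name` without an absolute path.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    found = []
    for directory in path.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        hit = shutil.which(name, path=directory)
        if hit and hit not in found:
            found.append(hit)
    return found
```

Calling `find_all_on_path("nvcc")` inside the activated conda environment lists the candidates. If a system-wide /usr/bin/nvcc appears alongside a conda-provided one, that may explain a version mismatch like the one described here.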

@hengfei-wang

I ran into a lot of trouble installing custom CUDA extensions on a cluster without a root account.

We really need a tutorial on how to run this project on a cluster without root privileges. :(

@egaznep

egaznep commented Jun 23, 2024

I managed to get the CUDA kernels working with the following steps (they should not require admin rights):

  1. Install your preferred flavor of conda (miniconda, anaconda, ...) if you don't have it.
  2. Create a fresh environment. Install the desired Python version, a torch version from the pytorch channel, and the CUDA runtime and library packages. For me, I think the following was sufficient (FYI, I only needed the custom CUDA kernels, not the full StyleGAN3 stuff):
  - nvidia::cuda-nvcc=*12.1
  - nvidia::cuda-cudart-dev=*12.1
  - nvidia::cuda-cudart=*12.1
  - nvidia::libcusparse-dev=*12.1
  - nvidia::libcublas-dev=*12.1
  - nvidia::libcusolver-dev

and from pip I got these installed:

ipython               8.25.0
ninja                 1.11.1.1
pip                   24.0
scipy                 1.13.1
setuptools            69.5.1
torch                 2.3.1
wheel                 0.43.0
  3. When I tried to run things, I got errors indicating that two headers could not be located, probably because of one of the nvidia conda packages. I had to copy the two headers from their original folders to {ENV_DIR}/include/.
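
The last step, finding where a missing header actually lives inside the environment, can be sketched in Python as follows. The comment does not say which two headers were involved, so the header name is left as a parameter. Both function names are hypothetical.

```python
import os
from typing import List


def locate_header(env_dir: str, header_name: str) -> List[str]:
    """Return every path under env_dir whose file name equals header_name."""
    matches = []
    for root, _dirs, files in os.walk(env_dir):
        if header_name in files:
            matches.append(os.path.join(root, header_name))
    return sorted(matches)


def is_on_default_include_path(env_dir: str, header_path: str) -> bool:
    """True if header_path sits directly in {env_dir}/include."""
    include_dir = os.path.abspath(os.path.join(env_dir, "include"))
    return os.path.dirname(os.path.abspath(header_path)) == include_dir
```

If `locate_header` finds the header somewhere other than {ENV_DIR}/include/, copying it there (as described above) or adding its folder to the compiler's include path are the two obvious options. Which one is cleaner depends on the setup.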

@hengfei-wang

I finally solved this problem. It was related to the CUDA installation: the CUDA installed on the cluster was missing some files. I loaded a different CUDA module from the cluster's pre-installed modules, and then the CUDA extensions compiled successfully.
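
A quick way to spot an incomplete toolkit like this is to check a candidate CUDA directory for a few files a build is likely to need. The sketch below is only illustrative. The file list is an assumption about a typical Linux toolkit layout, not the set of files that was missing on this cluster, and `missing_toolkit_files` is a hypothetical helper.

```python
import os
from typing import Iterable, List

# Assumed layout of a Linux CUDA toolkit. Adjust for the actual cluster.
ASSUMED_TOOLKIT_FILES = (
    os.path.join("bin", "nvcc"),
    os.path.join("include", "cuda_runtime.h"),
    os.path.join("include", "cusparse.h"),
)


def missing_toolkit_files(
    cuda_home: str, expected: Iterable[str] = ASSUMED_TOOLKIT_FILES
) -> List[str]:
    """Return the expected relative paths that do not exist under cuda_home."""
    return [
        rel for rel in expected
        if not os.path.exists(os.path.join(cuda_home, rel))
    ]
```

Running it against the directory a module sets as the CUDA root (often exposed through an environment variable such as CUDA_HOME, though that depends on the cluster) gives an empty list when all the assumed files are present.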

@aaroncoyner

aaroncoyner commented Jul 23, 2024

I ran into a lot of trouble installing custom CUDA extensions on a cluster without a root account.

We really need a tutorial on how to run this project on a cluster without root privileges. :(

Hey @hengfei-wang, I was having a similar issue on a cluster. For me, the server had outdated C++ compilers, so I used conda to install a newer version within the stylegan virtual environment: conda install -c conda-forge cxx-compiler. Basically, it seems like it might be necessary to install the other non-Python dependencies in your virtual environment as well.
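
To confirm which C++ compiler a build will actually see after installing one through conda, a small Python sketch like the following may be useful. It assumes, as I understand PyTorch's extension builder to behave on Linux, that the compiler is taken from the CXX environment variable and falls back to `c++`. Treat that as an assumption and check it against your PyTorch version. Both helper names are hypothetical.

```python
import os
import shutil
import subprocess
from typing import Mapping, Optional


def resolve_cxx(environ: Optional[Mapping[str, str]] = None) -> Optional[str]:
    """Resolve the C++ compiler: $CXX if set, else `c++`, looked up on PATH."""
    env = os.environ if environ is None else environ
    name = env.get("CXX") or "c++"
    return shutil.which(name, path=env.get("PATH"))


def compiler_version_line(compiler: str) -> str:
    """Return the first line printed by `<compiler> --version`."""
    result = subprocess.run(
        [compiler, "--version"], capture_output=True, text=True, check=False
    )
    text = result.stdout or result.stderr
    return text.splitlines()[0] if text.strip() else ""
```

Calling `resolve_cxx()` with the environment activated shows whether the conda-provided compiler or the server's older one comes first. `compiler_version_line` then reports its version string.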

@hengfei-wang

Hi, thank you for your kind advice, @aaroncoyner. I solved the problem by loading a CUDA toolkit from the cluster's pre-installed modules.
