RuntimeError: CUDA error: no kernel image is available for execution on the device #107

Open
2000lf opened this issue Mar 27, 2024 · 1 comment

2000lf commented Mar 27, 2024

When I run training, it raises RuntimeError: CUDA error: no kernel image is available for execution on the device

Environment

sys.platform: linux
Python: 3.7.16 (default, Jan 17 2023, 22:20:44) [GCC 11.2.0]
CUDA available: True
GPU 0,1: NVIDIA A30
CUDA_HOME: /home/shiying/luofan/CUDA/cuda11.1
NVCC: Build cuda_11.1.TC455_06.29069683_0
GCC: gcc (GCC) 9.4.0
PyTorch: 1.8.0+cu111
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: NO AVX
  - CUDA Runtime 11.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  - CuDNN 8.0.5
  - Magma 2.5.2
  - Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, 

TorchVision: 0.9.0+cu111
OpenCV: 4.9.0
MMCV: 1.3.0
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.11.0
MMDetection3D: 0.11.0+73c596f

Error traceback
Traceback (most recent call last):
File "tools/train.py", line 254, in <module>
main()
File "tools/train.py", line 250, in main
meta=meta)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmdet/apis/train.py", line 170, in train_detector
runner.run(data_loaders, cfg.workflow)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 125, in run
epoch_runner(data_loaders[i], **kwargs)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 50, in train
self.run_iter(data_batch, train_mode=True)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/runner/epoch_based_runner.py", line 30, in run_iter
**kwargs)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/parallel/data_parallel.py", line 67, in train_step
return self.module.train_step(*inputs[0], **kwargs[0])
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmdet/models/detectors/base.py", line 247, in train_step
losses = self(**data)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 84, in new_func
return old_func(*args, **kwargs)
File "/home/shiying/luofan/TransFusion/mmdet3d/models/detectors/base.py", line 58, in forward
return self.forward_train(**kwargs)
File "/home/shiying/luofan/TransFusion/mmdet3d/models/detectors/transfusion.py", line 142, in forward_train
gt_bboxes_ignore)
File "/home/shiying/luofan/TransFusion/mmdet3d/models/detectors/transfusion.py", line 179, in forward_pts_train
losses = self.pts_bbox_head.loss(*loss_inputs)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/runner/fp16_utils.py", line 164, in new_func
return old_func(*args, **kwargs)
File "/home/shiying/luofan/TransFusion/mmdet3d/models/dense_heads/transfusion_head.py", line 1257, in loss
layer_loss_cls = self.loss_cls(layer_cls_score, layer_labels, layer_label_weights, avg_factor=max(num_pos, 1))
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
result = self.forward(*input, **kwargs)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmdet/models/losses/focal_loss.py", line 177, in forward
avg_factor=avg_factor)
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmdet/models/losses/focal_loss.py", line 86, in sigmoid_focal_loss
'none')
File "/home/shiying/zjx/envs/anaconda3/envs/transfusion/lib/python3.7/site-packages/mmcv/ops/focal_loss.py", line 55, in forward
input, target, weight, output, gamma=ctx.gamma, alpha=ctx.alpha)
RuntimeError: CUDA error: no kernel image is available for execution on the device
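
For anyone hitting the same error, one thing worth checking is whether the compiled kernels cover the GPU's compute capability. The sketch below parses an "NVCC architecture flags" string like the one in the environment dump above and checks it against a capability tuple. It rests on assumptions: the A30 is taken to report compute capability 8.0, the helper names (`compiled_sm_archs`, `has_native_kernel`) are hypothetical, and the check is simplified to an exact `sm_XY` match rather than modelling CUDA's full binary-compatibility rules.

```python
import re

def compiled_sm_archs(nvcc_flags):
    """Return the (major, minor) capabilities that appear as code=sm_XY in the flags."""
    return {(int(m[:-1]), int(m[-1])) for m in re.findall(r"code=sm_(\d+)", nvcc_flags)}

def has_native_kernel(nvcc_flags, capability):
    """Simplified check: is there an sm_XY entry exactly matching the device capability?"""
    return tuple(capability) in compiled_sm_archs(nvcc_flags)

# Flags copied from the "PyTorch compiling details" in the environment dump.
PYTORCH_FLAGS = (
    "-gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;"
    "-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;"
    "-gencode;arch=compute_86,code=sm_86"
)

A30_CAPABILITY = (8, 0)  # assumption: the A30 reports compute capability 8.0

print(has_native_kernel(PYTORCH_FLAGS, A30_CAPABILITY))  # True
```

If that assumption holds, the PyTorch build itself already includes sm_80. The traceback ends in mmcv/ops/focal_loss.py, so the compiled MMCV ops may be the part that lacks a matching kernel image, but that is only a guess from the log and not a confirmed diagnosis. On the affected machine, `torch.cuda.get_device_capability(0)` and `torch.cuda.get_arch_list()` would give the live values to compare.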

Gaoeee commented Apr 22, 2024

Hi, did you solve it?
