RuntimeError: CUDA error: no kernel image is available for execution on the device #693

Closed
azuryl opened this issue Jan 14, 2020 · 21 comments
Labels
installation / environment invalid/unrelated unrelated to this project or invalid type of issues

Comments

@azuryl

azuryl commented Jan 14, 2020

If you do not know the root cause of the problem / bug, and wish someone to help you, please
post according to this template:

Instructions To Reproduce the Issue:

  1. run demo
    python demo/demo.py --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input input.jpg [--other-options] --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

  2. cuda 10.0
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 418.43 Driver Version: 418.43 CUDA Version: 10.1 |
    |-------------------------------+----------------------+----------------------+
    | GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
    | Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
    |===============================+======================+======================|
    | 0 GeForce RTX 208... Off | 00000000:04:00.0 Off | N/A |
    | 31% 34C P0 55W / 250W | 0MiB / 10989MiB | 0% Default |
    +-------------------------------+----------------------+----------------------+
    | 1 GeForce RTX 208... Off | 00000000:06:00.0 Off | N/A |
    | 31% 34C P0 46W / 250W | 0MiB / 10989MiB | 0% Default |
    +-------------------------------+----------------------+----------------------+
    | 2 GeForce RTX 208... Off | 00000000:07:00.0 Off | N/A |
    | 31% 34C P0 51W / 250W | 0MiB / 10989MiB | 0% Default |
    +-------------------------------+----------------------+----------------------+
    | 3 GeForce RTX 208... Off | 00000000:08:00.0 Off | N/A |
    | 31% 33C P0 60W / 250W | 0MiB / 10989MiB | 0% Default |
    +-------------------------------+----------------------+----------------------+
    | 4 GeForce RTX 208... Off | 00000000:0C:00.0 Off | N/A |
    | 31% 35C P0 60W / 250W | 0MiB / 10989MiB | 0% Default |
    +-------------------------------+----------------------+----------------------+
    | 5 GeForce RTX 208... Off | 00000000:0D:00.0 Off | N/A |
    | 30% 30C P0 50W / 250W | 0MiB / 10989MiB | 1% Default |
    +-------------------------------+----------------------+----------------------+
    | 6 GeForce RTX 208... Off | 00000000:0E:00.0 Off | N/A |

  3. what you observed (including the full logs):

return _C.nms(boxes, scores, iou_threshold)
RuntimeError: CUDA error: no kernel image is available for execution on the device (nms_cuda at /tmp/pip-req-build-9d9zypi6/torchvision/csrc/cuda/nms_cuda.cu:127)
frame #0: c10::Error::Error(c10::SourceLocation, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&) + 0x6d (0x7f3cd35c7e7d in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torch/lib/libc10.so)
frame #1: nms_cuda(at::Tensor const&, at::Tensor const&, float) + 0x8d1 (0x7f3ca5dbaece in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #2: nms(at::Tensor const&, at::Tensor const&, float) + 0x183 (0x7f3ca5d7eed7 in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #3: + 0x79cf5 (0x7f3ca5d98cf5 in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #4: + 0x765b0 (0x7f3ca5d955b0 in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #5: + 0x70d1e (0x7f3ca5d8fd1e in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #6: + 0x70fc2 (0x7f3ca5d8ffc2 in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)
frame #7: + 0x5be4a (0x7f3ca5d7ae4a in /home/azuryl/anaconda3/envs/detectron2p37/lib/python3.7/site-packages/torchvision/_C.so)

frame #59: __libc_start_main + 0xf0 (0x7f3d0c2ca830 in /lib/x86_64-linux-gnu/libc.so.6)

Expected behavior:

If there are no obvious error in "what you observed" provided above,
please tell us the expected behavior.

If you expect the model to converge / work better, note that we do not give suggestions
on how to train a new model.
Only in one of the two conditions we will help with it:
(1) You're unable to reproduce the results in detectron2 model zoo.
(2) It indicates a detectron2 bug.

Environment:

Please paste the output of python -m detectron2.utils.collect_env.
If detectron2 hasn't been successfully installed, use python detectron2/utils/collect_env.py.

If your issue looks like an installation issue / environment issue,
please first try to solve it yourself with the instructions in
https://github.com/facebookresearch/detectron2/blob/master/INSTALL.md#common-installation-issues

@azuryl
Author

azuryl commented Jan 14, 2020

python -m detectron2.utils.collect_env


sys.platform linux
Python 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
Numpy 1.17.4
Detectron2 Compiler GCC 5.4
Detectron2 CUDA Compiler 10.0
DETECTRON2_ENV_MODULE
PyTorch 1.3.1
PyTorch Debug Build False
torchvision 0.4.2
CUDA available True
GPU 0,1,2,3,4,5,6,7 GeForce RTX 2080 Ti
CUDA_HOME /usr/local/cuda
NVCC Cuda compilation tools, release 10.0, V10.0.130
Pillow 6.2.2
cv2 4.1.2


PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  • CuDNN 7.6.5
    • Built with CuDNN 7.6.4
  • Magma 2.5.0
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS=-fvisibility-inlines-hidden -std=c++11 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -I/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/include -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/work=/usr/local/src/conda/pytorch-1.3.1 -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place=/usr/local/src/conda-prefix -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

@azuryl
Author

azuryl commented Jan 14, 2020

python detectron2/utils/collect_env.py


sys.platform linux
Python 3.7.6 (default, Jan 8 2020, 19:59:22) [GCC 7.3.0]
Numpy 1.17.4
Detectron2 Compiler GCC 5.4
Detectron2 CUDA Compiler 10.0
DETECTRON2_ENV_MODULE
PyTorch 1.3.1
PyTorch Debug Build False
torchvision 0.4.2
CUDA available True
GPU 0,1,2,3,4,5,6,7 GeForce RTX 2080 Ti
CUDA_HOME /usr/local/cuda
NVCC Cuda compilation tools, release 10.0, V10.0.130
Pillow 6.2.2
cv2 4.1.2


PyTorch built with:

  • GCC 7.3
  • Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CUDA Runtime 10.0
  • NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_50,code=compute_50
  • CuDNN 7.6.5
    • Built with CuDNN 7.6.4
  • Magma 2.5.0
  • Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS=-fvisibility-inlines-hidden -std=c++11 -fmessage-length=0 -march=nocona -mtune=haswell -ftree-vectorize -fPIC -fstack-protector-strong -fno-plt -O2 -ffunction-sections -pipe -I/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place/include -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/work=/usr/local/src/conda/pytorch-1.3.1 -fdebug-prefix-map=/opt/conda/conda-bld/pytorch_1574381331675/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_place=/usr/local/src/conda-prefix -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

@ppwwyyxx
Contributor

The issue comes from the torchvision installation and is unrelated to detectron2.

You're probably using a version of torchvision that was built with a different version of CUDA or for different compute capabilities.
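To make the compute-capability point concrete: a GPU can run a kernel only if the binary contains machine code (SASS) for its architecture family, or PTX that the driver can JIT-compile for it. The sketch below is illustrative and not part of detectron2 or torchvision. It parses an NVCC flag string like the one printed by collect_env in this thread and checks whether a given device is covered. The helper names are hypothetical, and the RTX 2080 Ti is assumed to have compute capability 7.5. The flags used here describe the PyTorch build; torchvision's own binary may have been built with a different list.

```python
import re

# NVCC flags as reported for the PyTorch build in this thread.
PYTORCH_FLAGS = (
    "-gencode;arch=compute_35,code=sm_35;"
    "-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_60,code=sm_60;"
    "-gencode;arch=compute_61,code=sm_61;"
    "-gencode;arch=compute_70,code=sm_70;"
    "-gencode;arch=compute_75,code=sm_75;"
    "-gencode;arch=compute_50,code=compute_50"
)


def parse_gencode(flags):
    """Return (sass, ptx): sets of (major, minor) targets found in the flags."""
    sass, ptx = set(), set()
    for kind, digits in re.findall(r"code=(sm|compute)_(\d+)", flags):
        cap = (int(digits[:-1]), int(digits[-1]))
        (sass if kind == "sm" else ptx).add(cap)
    return sass, ptx


def can_run(device_cap, sass, ptx):
    """True if a device with capability `device_cap` has a usable kernel image."""
    major, minor = device_cap
    # SASS is usable within the same major version, at or below the device's minor.
    has_sass = any(m == major and n <= minor for m, n in sass)
    # PTX can be JIT-compiled for any device at or above the PTX version.
    has_ptx = any(cap <= device_cap for cap in ptx)
    return has_sass or has_ptx


if __name__ == "__main__":
    sass, ptx = parse_gencode(PYTORCH_FLAGS)
    print("RTX 2080 Ti (7.5) covered:", can_run((7, 5), sass, ptx))
```

A build that lists only older `sm_XX` targets and no `compute_XX` entry would fail this check for a 7.5 device, which is the situation the "no kernel image is available" error describes.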

@azuryl
Author

azuryl commented Jan 14, 2020

@ppwwyyxx
It seems torchvision has no problem:

import torch

import torchvision
print(torch.cuda.is_available())
True
a=torch.Tensor(5,3)
a=a.cuda()
print(a)
tensor([[7.0374e+22, 5.7886e+22, 6.7120e+22],
[6.7331e+22, 6.7120e+22, 1.8515e+28],
[7.3867e+20, 9.2358e-01, 1.8061e+28],
[4.4378e+27, 6.0900e-02, 7.0374e+22],
[6.0542e+22, 7.8675e+34, 4.6894e+27]], device='cuda:0')

@ppwwyyxx
Contributor

It has a problem when you call torchvision's nms function on a CUDA tensor.
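Moving a tensor to the GPU only exercises PyTorch's own kernels, so it says nothing about torchvision's compiled ops. A minimal sketch of a check that does call torchvision's CUDA nms is given below. The function name and the box values are made up for illustration; `torchvision.ops.nms(boxes, scores, iou_threshold)` is the public entry point for the `_C.nms` call in the traceback.

```python
def check_nms_on_cuda():
    """Run torchvision's nms on CUDA tensors and report what happened as a string."""
    try:
        import torch
        from torchvision.ops import nms
    except Exception as exc:  # imports can fail in several ways in a broken environment
        return f"import failed: {exc}"

    if not torch.cuda.is_available():
        return "CUDA not available"

    # Two overlapping boxes in (x1, y1, x2, y2) format, with one score each.
    boxes = torch.tensor(
        [[0.0, 0.0, 10.0, 10.0], [1.0, 1.0, 11.0, 11.0]], device="cuda"
    )
    scores = torch.tensor([0.9, 0.8], device="cuda")

    try:
        keep = nms(boxes, scores, 0.5)
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
    except RuntimeError as exc:
        return f"nms failed: {exc}"
    return f"nms ok, kept indices {keep.tolist()}"


if __name__ == "__main__":
    print(check_nms_on_cuda())
```

If this reports the same "no kernel image" error outside of detectron2, the problem lies in the torchvision build that is actually being imported.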

ppwwyyxx added the "installation / environment" and "invalid/unrelated" (unrelated to this project or invalid type of issues) labels on Jan 14, 2020
@azuryl
Author

azuryl commented Jan 14, 2020

Which torchvision version should I use?

@ppwwyyxx
Contributor

As INSTALL.md says, the version that is released together with PyTorch should work. If it does not, either you are not actually using that version, or there is a bug in torchvision/PyTorch.

@azuryl
Author

azuryl commented Jan 14, 2020

I installed it according to the PyTorch guide:
conda install pytorch torchvision cudatoolkit=10.0 -c pytorch

@ppwwyyxx
Contributor

ppwwyyxx commented Jan 14, 2020

First, you can have multiple versions and the command does not guarantee you'll run the torchvision & pytorch you just installed (http://ppwwyyxx.com/blog/2019/On-Environment-Packaging-in-Python/).
If you did run the torchvision & pytorch you installed with this command, then like I said above you should report issues to torchvision.

@azuryl
Author

azuryl commented Jan 14, 2020

According to PyTorch's previous-versions page
https://pytorch.org/get-started/previous-versions/
the command for CUDA 10.0 is
conda install pytorch==1.2.0 torchvision==0.4.0 cudatoolkit=10.0 -c pytorch
but that does not meet the requirement of PyTorch ≥ 1.3.

@ppwwyyxx
Contributor

Your original command conda install pytorch torchvision cudatoolkit=10.0 -c pytorch is correct.

I'll say this again: the command does not guarantee that you'll use the version you installed with this command, especially if you're on a python environment with pytorch previously installed by other means (e.g. pip).
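One way to look for leftover installs is to enumerate every distribution visible on `sys.path` and print the ones named torch or torchvision. This is only a sketch: it assumes Python 3.8 or newer (for `importlib.metadata`), the function name is hypothetical, and it only finds packages that ship standard metadata, which pip installs and most conda packages do.

```python
from importlib import metadata


def find_installs(names):
    """Return sorted (name, version, location) tuples for matching distributions."""
    wanted = {n.lower() for n in names}
    found = []
    for dist in metadata.distributions():
        name = (dist.metadata.get("Name") or "").lower()
        if name in wanted:
            # locate_file("") resolves to the directory the package was installed into.
            found.append((name, str(dist.version), str(dist.locate_file(""))))
    return sorted(found)


if __name__ == "__main__":
    for name, version, location in find_installs(["torch", "torchvision"]):
        print(f"{name} {version} at {location}")
```

More than one line for the same package, or a location outside the active conda environment, would suggest that the interpreter is not necessarily importing what the last install command provided.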
And, if you did actually use the version you installed with the above command, then you should report issues to torchvision.

I did not say it's a pytorch issue, so your comment at pytorch/pytorch#32151 (comment) is not accurate.

@azuryl
Author

azuryl commented Jan 14, 2020

I use a virtual environment created by conda; it was created specifically for detectron2.

@ppwwyyxx
Contributor

Conda's virtual environment does not guarantee much either. The correct way to know which version you're using is mentioned in the link I posted above:

Use import lib; print(lib.__file__) to find the location of the library you're actually using. This method should be valid for all packages.
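As a small sketch of the same idea, the helper below reports where Python would load a top-level module from without executing it, which can be useful when the import itself crashes. The function name is hypothetical; `importlib.util.find_spec` is from the standard library.

```python
import importlib.util


def module_location(name):
    """Return the file Python would import `name` from, or None if it is not found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    if spec is None:
        return None
    return spec.origin


if __name__ == "__main__":
    for name in ("torch", "torchvision"):
        print(name, "->", module_location(name))
```

For a package, the reported path is its `__init__.py`; the directory containing it is where the compiled extension files live.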

Once you have found the location, there should be a "_C.so" file there, and cuobjdump --list-elf _C.so will show which GPU architectures it contains code for.
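The cuobjdump step can also be scripted. The sketch below is an illustration under assumptions: it expects the embedded ELF names printed by `cuobjdump --list-elf` to contain an `sm_XY` token, and it simply collects those tokens. The function names are hypothetical, and the path to `_C.so` has to be supplied by the caller.

```python
import re
import shutil
import subprocess


def archs_in_text(text):
    """Extract the distinct sm_XY architecture numbers mentioned in `text`."""
    return sorted({int(digits) for digits in re.findall(r"sm_(\d+)", text)})


def list_embedded_archs(so_path):
    """Run `cuobjdump --list-elf` on a shared object; None if cuobjdump is missing."""
    exe = shutil.which("cuobjdump")
    if exe is None:
        return None
    result = subprocess.run(
        [exe, "--list-elf", so_path],
        capture_output=True,
        text=True,
        check=False,
    )
    return archs_in_text(result.stdout)


if __name__ == "__main__":
    import sys

    if len(sys.argv) == 2:
        print(list_embedded_archs(sys.argv[1]))
    else:
        print("usage: python list_archs.py /path/to/_C.so")
```

For an RTX 2080 Ti, a list with no 7.x entry at or below 75 would be consistent with the error in this issue, unless the file also embeds PTX (which `cuobjdump --list-ptx` reports separately).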

@azuryl
Author

azuryl commented Jan 14, 2020

I found that the most important thing is to set Pillow==6.2.2 and pyyaml==5.1, as you wrote in https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=9_FzH13EjseR
After I installed these two versions it seems OK:
python demo/demo.py --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input input.jpg input2.jpg [--other-options] --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl

[01/14 07:38:41 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=['input.jpg', 'input2.jpg', '[--other-options]'], opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl'], output=None, video_input=None, webcam=False)
[01/14 07:38:52 detectron2]: input.jpg: detected 15 instances in 0.70s

I suggest you mention this in INSTALL.md.

@ppwwyyxx
Contributor

They are absolutely not related to your issue.

Also, they are already declared as dependencies. pip will either install them automatically or warn you that it cannot. So there is no need to mention them in INSTALL.md.

@azuryl
Author

azuryl commented Jan 14, 2020

First I used conda install pytorch torchvision cudatoolkit=10.0 -c pytorch
Then I installed according to https://colab.research.google.com/drive/16jcaJoc6bCFAQ96jDe2HwtXj7BMD_-m5#scrollTo=9_FzH13EjseR
With pip install -U torch torchvision cython Pillow==6.2.2 pyyaml==5.1
...........
the demo can be run.

@ppwwyyxx
Contributor

Then it is the pip install -U torch torchvision part that took effect.

@azuryl
Author

azuryl commented Jan 14, 2020

"conda install pytorch torchvision cudatoolkit=10.0 -c pytorch" had already installed torch and torchvision,
so I just ran pip install -U Pillow==6.2.2 pyyaml==5.1
.........
I found that after "conda install pytorch torchvision cudatoolkit=10.0 -c pytorch" the Pillow version was 7.0.0.
When I ran "pip install -U Pillow==6.2.2 pyyaml==5.1", Pillow 7.0 was uninstalled and Pillow 6.2.2 was installed, and the pyyaml version changed in the same way.

@ppwwyyxx
Contributor

ppwwyyxx commented Jan 14, 2020

Then it is unrelated to your original issue again.

If you're using Pillow 7.0, torchvision will give a different error from the one in your issue.

@azuryl
Author

azuryl commented Jan 14, 2020

So if I use Pillow==6.2.2 and pyyaml==5.1, the program runs OK.

@ppwwyyxx
Contributor

ppwwyyxx commented Jan 14, 2020

Pillow==6.2.2 does address a different error from torchvision, which does not support Pillow 7.0. But it is unrelated to your original issue, which was probably fixed before you ran that pip install, though you didn't realize it.
A different error usually means a different issue.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 5, 2021