Model takes very long to load and tutorial script fails #7

Closed
DragonDuck opened this issue Oct 11, 2019 · 5 comments

If you do not know the root cause of the problem / bug, and wish someone to help you, please
include:

When I try to run the code from the Detectron2 Tutorial Colab, the model takes an extremely long time to load and then crashes with a CUDA Error / Segmentation fault.

To Reproduce

  1. what changes you made / what code you wrote: None
  2. what command you run: Taken from Detectron2 Tutorial Colab
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import matplotlib.pyplot as plt
import numpy as np
import cv2

im = cv2.imread("./input.jpg")

from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
cfg = get_cfg()
cfg.merge_from_file("configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # set threshold for this model
cfg.MODEL.WEIGHTS = "detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl"
predictor = DefaultPredictor(cfg)
outputs = predictor(im)
  3. what you observed (full logs are preferred):
  • The step DefaultPredictor(cfg) takes an extremely long time (>15 minutes). Specifically, the command build_model(cfg) within the class __init__() takes this long to complete.
  • Minimal usage of the GPU and maximal usage of the CPU (top lists the python process as taking up 100% of CPU power and approx. 3.5% of memory while the GPU takes only approximately 500MB out of available 16GB). It appears that PyTorch is attempting to execute everything on the CPU.
  • Once the model is finally built, attempting to run prediction results in an error:
WARNING [10/11 07:45:38 d2.config.compat]: Config 'configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml' has no VERSION. Assuming it to be compatible with latest v2.
Traceback (most recent call last):
  File "test.py", line 19, in <module>
    outputs = predictor(im)
  File "/home/jan/miniconda3/envs/detectron/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 49, in decorate_no_grad
    return func(*args, **kwargs)
  File "/home/jan/detectron2/detectron2/engine/defaults.py", line 171, in __call__
    height, width = original_image.shape[:2]
AttributeError: 'NoneType' object has no attribute 'shape'

However, I've checked all CUDA versions and everything points to CUDA 10.1, so I don't think this is a version mismatch:

$ conda list cuda
# packages in environment at /home/jan/miniconda3/envs/detectron:
#
# Name                    Version                   Build  Channel
cudatoolkit               10.1.168                      0  
$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2019 NVIDIA Corporation
Built on Wed_Apr_24_19:10:27_PDT_2019
Cuda compilation tools, release 10.1, V10.1.168
$ echo $LD_LIBRARY_PATH 
:/usr/local/cuda-10.1.back/lib64
$ nvidia-smi 
Fri Oct 11 07:45:41 2019       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.67       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   49C    P0    41W / 250W |    269MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Tesla P100-PCIE...  Off  | 00000000:00:05.0 Off |                    0 |
| N/A   51C    P0    41W / 250W |     10MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0     16135      C   python                                       259MiB |
+-----------------------------------------------------------------------------+

All required versions as per the install page are fulfilled:

$ python --version
Python 3.6.9 :: Anaconda, Inc.

$ conda list
# packages in environment at /home/jan/miniconda3/envs/detectron:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                        main  
absl-py                   0.8.1                    pypi_0    pypi
backcall                  0.1.0                    py36_0  
blas                      1.0                         mkl  
bzip2                     1.0.8                h7b6447c_0  
ca-certificates           2019.8.28                     0  
cairo                     1.14.12              h8948797_3  
certifi                   2019.9.11                py36_0  
cffi                      1.12.3           py36h2e261b9_0  
cloudpickle               1.2.2                    pypi_0    pypi
cudatoolkit               10.1.168                      0  
cycler                    0.10.0                   pypi_0    pypi
cython                    0.29.13                  pypi_0    pypi
decorator                 4.4.0                    py36_1  
detectron2                0.1                       dev_0    <develop>
ffmpeg                    4.0                  hcdf2ecd_0  
fontconfig                2.13.0               h9420a91_0  
freeglut                  3.0.0                hf484d3e_5  
freetype                  2.9.1                h8a8886c_1  
fvcore                    0.1                      pypi_0    pypi
glib                      2.56.2               hd408876_0  
graphite2                 1.3.13               h23475e2_0  
grpcio                    1.24.1                   pypi_0    pypi
harfbuzz                  1.8.8                hffaf4a1_0  
hdf5                      1.10.2               hba1933b_1  
icu                       58.2                 h9c2bf20_1  
intel-openmp              2019.4                      243  
ipython                   7.8.0            py36h39e3cac_0  
ipython_genutils          0.2.0                    py36_0  
jasper                    2.0.14               h07fcdf6_1  
jedi                      0.15.1                   py36_0  
jpeg                      9b                   h024ee3a_2  
kiwisolver                1.1.0                    pypi_0    pypi
libedit                   3.1.20181209         hc058e9b_0  
libffi                    3.2.1                hd88cf55_4  
libgcc-ng                 9.1.0                hdf63c60_0  
libgfortran-ng            7.3.0                hdf63c60_0  
libglu                    9.0.0                hf484d3e_1  
libopencv                 3.4.2                hb342d67_1  
libopus                   1.3                  h7b6447c_0  
libpng                    1.6.37               hbc83047_0  
libstdcxx-ng              9.1.0                hdf63c60_0  
libtiff                   4.0.10               h2733197_2  
libuuid                   1.0.3                h1bed415_2  
libvpx                    1.7.0                h439df22_0  
libxcb                    1.13                 h1bed415_1  
libxml2                   2.9.9                hea5a465_1  
markdown                  3.1.1                    pypi_0    pypi
matplotlib                3.1.1                    pypi_0    pypi
mkl                       2019.4                      243  
mkl-service               2.3.0            py36he904b0f_0  
mkl_fft                   1.0.14           py36ha843d7b_0  
mkl_random                1.1.0            py36hd6b4f25_0  
ncurses                   6.1                  he6710b0_1  
ninja                     1.9.0            py36hfd86e86_0  
numpy                     1.17.2           py36haad9e8e_0  
numpy-base                1.17.2           py36hde5b4d6_0  
olefile                   0.46                     py36_0  
opencv                    3.4.2            py36h6fd60c2_1  
openssl                   1.1.1d               h7b6447c_2  
parso                     0.5.1                      py_0  
pcre                      8.43                 he6710b0_0  
pexpect                   4.7.0                    py36_0  
pickleshare               0.7.5                    py36_0  
pillow                    6.2.0            py36h34e0f95_0  
pip                       19.2.3                   py36_0  
pixman                    0.38.0               h7b6447c_0  
portalocker               1.5.1                    pypi_0    pypi
prompt_toolkit            2.0.10                     py_0  
protobuf                  3.10.0                   pypi_0    pypi
ptyprocess                0.6.0                    py36_0  
py-opencv                 3.4.2            py36hb342d67_1  
pycocotools               2.0                      pypi_0    pypi
pycparser                 2.19                     py36_0  
pygments                  2.4.2                      py_0  
pyparsing                 2.4.2                    pypi_0    pypi
python                    3.6.9                h265db76_0  
python-dateutil           2.8.0                    pypi_0    pypi
pytorch                   1.3.0           py3.6_cuda10.1.243_cudnn7.6.3_0    pytorch
pyyaml                    5.1.2                    pypi_0    pypi
readline                  7.0                  h7b6447c_5  
setuptools                41.4.0                   py36_0  
shapely                   1.6.4.post2              pypi_0    pypi
six                       1.12.0                   py36_0  
sqlite                    3.30.0               h7b6447c_0  
tensorboard               2.0.0                    pypi_0    pypi
termcolor                 1.1.0                    pypi_0    pypi
tk                        8.6.8                hbc83047_0  
torchvision               0.4.1                py36_cu101    pytorch
tqdm                      4.36.1                   pypi_0    pypi
traitlets                 4.3.3                    py36_0  
wcwidth                   0.1.7                    py36_0  
werkzeug                  0.16.0                   pypi_0    pypi
wheel                     0.33.6                   py36_0  
xz                        5.2.4                h14c3975_4  
yacs                      0.1.6                    pypi_0    pypi
zlib                      1.2.11               h7b6447c_3  
zstd                      1.3.7                h0b5b093_0  

$ gcc --version
gcc (Ubuntu 5.4.0-6ubuntu1~16.04.11) 5.4.0 20160609

Expected behavior

PyTorch appears to run primarily on the CPU instead of the GPU. I expect it to run primarily on the GPU. As I've made no edits to the code, I also expect it to run error-free.

Environment

$ python -m detectron2.utils.collect_env
---------------------  -------------------------------------------------------------------
Python                 3.6.9 |Anaconda, Inc.| (default, Jul 30 2019, 19:07:31) [GCC 7.3.0]
Detectron2 Compiler    GCC 5.4
DETECTRON2_ENV_MODULE  <not set>
PyTorch                1.3.0
PyTorch Debug Build    False
CUDA available         True
GPU 0,1                Tesla P100-PCIE-16GB
Pillow                 6.2.0
cv2                    3.4.2
---------------------  -------------------------------------------------------------------
PyTorch built with:
  - GCC 7.3
  - Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v0.20.5 (Git Hash 0125f28c61c1f822fd48570b4c1066f96fcb9b2e)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_35,code=sm_35;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_50,code=compute_50
  - CuDNN 7.6.3
  - Magma 2.5.1
  - Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=True, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

ppwwyyxx (Contributor) commented Oct 11, 2019

I think you can easily verify that it does not take more than 1 minute on Colab, so this should be an environment issue.

What looks problematic in your environment is that your build of PyTorch does not include the compute capability of the P100 (6.0): the "NVCC architecture flags" above list only sm_35, sm_50, and compute_50. In that case, when some ops are run for the first time, the NVIDIA driver spends some time compiling them. Running it a second time on the same machine should be faster.

Running the code you showed also needs to download the 178MB model the first time. This is fast on Colab, but may be slow on your machine.

AttributeError: 'NoneType' object has no attribute 'shape'

This says that OpenCV could not read the input image you gave: cv2.imread returns None instead of raising an error when it fails. You may need to check your input path or the OpenCV installation.
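
The failed image read can be caught before the predictor is called. The following is a minimal sketch, not part of Detectron2 or OpenCV: `load_image_or_fail` is a hypothetical helper name, and `reader` is assumed to be `cv2.imread` or any function that likewise returns None on failure.

```python
import os


def load_image_or_fail(path, reader):
    """Read an image with `reader` and raise instead of returning None.

    `reader` is assumed to behave like cv2.imread: it returns None
    (rather than raising) when the file is missing or cannot be decoded.
    """
    image = reader(path)
    if image is None:
        if not os.path.isfile(path):
            # The most common cause: a wrong relative path or working directory.
            raise FileNotFoundError("No such image file: {!r}".format(path))
        # The file exists, so the reader could not decode it.
        raise ValueError("File exists but could not be decoded: {!r}".format(path))
    return image
```

In the script from the report, this would replace the bare read with something like `im = load_image_or_fail("./input.jpg", cv2.imread)`, so a bad path fails at the read rather than inside `predictor(im)`.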

Zehaos commented Oct 11, 2019

The cudatoolkit version is 10.1.168, while PyTorch 1.3 is built with CUDA 10.1.243. Maybe this is where the problem lies.

ppwwyyxx (Contributor) commented

Similar reports have been seen in #27 and pytorch/pytorch#537. You need to find a version of PyTorch whose "NVCC architecture flags" include the compute capability of your GPU.
Closing, as the issue is about the PyTorch installation.
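
One way to check the "NVCC architecture flags" against a GPU is sketched below. It is a rough illustration under stated assumptions, not a PyTorch or Detectron2 API: the function names and the three result labels are hypothetical, and the matching rule is a simplification (compiled `sm_XY` code is taken to run on devices with the same major version and an equal or higher minor version, while `compute_XY` PTX is taken to be JIT-compilable by the driver for any equal or higher capability). The flags string is copied from the environment dump in this issue.

```python
import re

# "NVCC architecture flags" as reported by collect_env in this issue.
FLAGS = (
    "-gencode;arch=compute_35,code=sm_35;"
    "-gencode;arch=compute_50,code=sm_50;"
    "-gencode;arch=compute_50,code=compute_50"
)


def parse_arch_flags(flags):
    """Return (sm_targets, ptx_targets) as sets of ints such as 35 or 50."""
    sm = {int(m) for m in re.findall(r"code=sm_(\d+)", flags)}
    ptx = {int(m) for m in re.findall(r"code=compute_(\d+)", flags)}
    return sm, ptx


def classify_support(flags, capability):
    """Classify how a build covers a GPU with the given (major, minor) capability.

    Returns "native" if precompiled code should run directly, "jit" if only
    PTX is usable and must be compiled at first use, else "unsupported".
    """
    major, minor = capability
    cc = major * 10 + minor
    sm, ptx = parse_arch_flags(flags)
    if any(s // 10 == major and s % 10 <= minor for s in sm):
        return "native"
    if any(p <= cc for p in ptx):
        return "jit"
    return "unsupported"


# A Tesla P100 has compute capability 6.0.
print(classify_support(FLAGS, (6, 0)))
```

For the build in this report the P100 falls into the "jit" case, which matches the slow first run described above. With PyTorch installed, the capability can be read with `torch.cuda.get_device_capability(0)` and the flags appear in the output of `torch.__config__.show()`.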

ppwwyyxx (Contributor) commented

This has been fixed in PyTorch, according to #27.

DragonDuck (Author) commented

Thank you very much for the help! I can confirm that the Conda version of PyTorch from last week wasn't properly compiled to support my GPU. This has been fixed and the newest PyTorch version downloaded via Conda works error-free.
