cannot run demo on CPU mode #36

teddybearz · 2016-11-30T06:09:13Z

running inside the latest docker tensorflow:

docker run -it -p 8888:8888 tensorflow/tensorflow

`

root@f54905c5bdaf:/notebooks/Faster-RCNN_TF# python ./tools/demo.py --model /VGGnet_fast_rcnn_iter_70000.ckpt
Traceback (most recent call last):
  File "./tools/demo.py", line 11, in <module>
    from networks.factory import get_network
  File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
    from .VGGnet_train import VGGnet_train
  File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
    from networks.network import Network
  File "/notebooks/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
    import roi_pooling_layer.roi_pooling_op as roi_pool_op
  File "/notebooks/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
    _roi_pooling_module = tf.load_op_library(filename)
  File "/usr/local/lib/python2.7/dist-packages/tensorflow/python/framework/load_library.py", line 63, in load_op_library
    raise errors._make_specific_exception(None, None, error_msg, error_code)
tensorflow.python.framework.errors.NotFoundError: /notebooks/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE

root@f54905c5bdaf:/notebooks/Faster-RCNN_TF# nm -gC lib/roi_pooling_layer/roi_pooling.so |grep GpuDevice
U ROIPoolForwardLaucher(float const*, float, int, int, int, int, int, int, float const*, float*, int*, Eigen::GpuDevice const&)
U ROIPoolBackwardLaucher(float const*, float, int, int, int, int, int, int, int, float const*, float*, int const*, Eigen::GpuDevice const&)
U Eigen::GpuDevice const& tensorflow::OpKernelContext::eigen_device<Eigen::GpuDevice>() const

`

teddybearz · 2016-11-30T06:16:21Z

To reproduce (after downloading VGGnet_fast_rcnn_iter_70000.ckpt to ~/):

`
docker run -v ~/VGGnet_fast_rcnn_iter_70000.ckpt:/VGGnet_fast_rcnn_iter_70000.ckpt -it -p 8888:8888 tensorflow/tensorflow bash

sudo apt-get update
sudo apt-get install -y git
sudo apt-get install -y python-opencv
sudo apt-get install -y python-tk

pip install cython
pip install easydict
pip install image

sudo ln /dev/null /dev/raw1394

git clone --recursive https://github.com/smallcorgi/Faster-RCNN_TF.git

cd Faster-RCNN_TF/lib
make
cd ..
python ./tools/demo.py --model /VGGnet_fast_rcnn_iter_70000.ckpt

`

tyyyang · 2016-12-10T11:25:56Z

I also encountered the same problem.

donnyyou · 2016-12-17T05:15:13Z

I have run into the same error, too. Is there a solution to this problem? Thanks!

jacobunderlinebenseal · 2016-12-26T10:05:31Z

me too

jaig · 2017-01-09T08:36:29Z

I am facing a similar problem when I start training on CPU or run the demo. Is there a solution for this?

nsivaramakrishnan · 2017-01-09T10:34:00Z

Hi,
I am getting the same error while trying to run demo.py:
tensorflow.python.framework.errors_impl.NotFoundError: /home/fmc/rcnn/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE
I added "-D_GLIBCXX_USE_CXX11_ABI=0" in make.sh. I use g++ version 5.4.0 and TF v0.12. By the way, I am trying to run this on CPU. Any help is highly appreciated.
-Siva

jaig · 2017-01-11T06:31:21Z

Can we train this model using only the CPU?

oplkqingy · 2017-01-22T07:10:41Z

I hit a similar issue on Ubuntu 16.04 with g++ version 5.4.0 and TF v0.12. Before adding "-D_GLIBCXX_USE_CXX11_ABI=0" to make.sh, running the demo shows "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE"; after adding it, running the demo shows "_Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE".

I don't have a GPU. How can I run the demo in CPU-only mode?
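
The two symbols quoted above point at different causes, and a short check can tell them apart. The sketch below is only an illustration: `classify_undefined_symbol` is a hypothetical helper, and its two substring rules are assumptions based on the symbols reported in this thread (`B5cxx11` is how GCC mangles the `[abi:cxx11]` tag, and `GpuDevice` appears in the signatures of the CUDA launcher functions).

```python
def classify_undefined_symbol(error_message):
    """Guess why tf.load_op_library reported an undefined symbol.

    Hypothetical helper: the rules only cover the two symbols seen in this thread.
    """
    if "undefined symbol" not in error_message:
        return "not an undefined-symbol error"
    if "B5cxx11" in error_message:
        # The op was built with a different libstdc++ ABI than TensorFlow.
        return "C++11 ABI mismatch (try -D_GLIBCXX_USE_CXX11_ABI=0)"
    if "GpuDevice" in error_message:
        # The .so references a GPU launcher whose definition is not linked in.
        return "GPU launcher missing from the .so (CUDA object not linked)"
    return "unknown undefined symbol"


if __name__ == "__main__":
    msg = ("roi_pooling.so: undefined symbol: "
           "_Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE")
    print(classify_undefined_symbol(msg))
```

This matches what is reported in the thread: the ABI flag removes the first error, but the second one remains because it has a different cause.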

raviv · 2017-01-22T14:43:13Z

Having the same problem (_Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE) when trying to train on CPU.
Adding "-D_GLIBCXX_USE_CXX11_ABI=0" to the g++ command in make.sh and re-making didn't help.
Thanks.

civilman628 · 2017-02-05T20:55:57Z

g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
	roi_pooling_op.cu.o -I $TF_INC  -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS  -D_GLIBCXX_USE_CXX11_ABI=0 \
	-lcudart -L $CUDA_PATH/lib64

DiegoGLagash · 2017-02-06T13:25:45Z

same problem here.

pbarker · 2017-02-12T23:26:49Z

same problem here as well

EunmiKang · 2017-02-13T07:07:35Z

me too :(

andresrommier · 2017-02-15T07:56:35Z

I had to modify the make.sh file to change the GPU architecture to match mine (sm_61), and then had to change the CUDA path (on Arch Linux it is /opt/cuda).

wxwang0601 · 2017-03-11T13:37:24Z

Same problem!
Before adding "-D_GLIBCXX_USE_CXX11_ABI=0" to make.sh, running the demo shows "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE"; after adding it, running the demo shows "_Z22ROIPoolBackwardLaucherPKffiiiiiiiS0_PfPKiRKN5Eigen9GpuDeviceE".

@googleios @raviv have you solved the problem?

louisquinn · 2017-03-17T02:08:08Z

Hi all, I've figured out a workaround to use only the CPU. I have only tested this method for the demo script, not sure if it will work for training, but it should.

Download and Install CUDA:
https://developer.nvidia.com/cuda-downloads

Compile for GPU OR Copy my .so
You can download my .so file from here: https://drive.google.com/open?id=0B-0d5quIGY5XVEJvYU9XRkVJTWM
Or you can run make.sh and compile with CUDA (not sure if this will work)

Include these lines of code at the top of your Python scripts
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''
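
As a minimal sketch of that last step: the variable has to be set before TensorFlow initializes CUDA, so the safest place is above the first `import tensorflow`. `hide_gpus` is a hypothetical helper name, not something from the repo.

```python
import os


def hide_gpus():
    """Hide all CUDA devices from libraries that are loaded afterwards."""
    # An empty string means "no visible devices", so CUDA-based libraries
    # such as TensorFlow see no GPU and place work on the CPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = ""
    return os.environ["CUDA_VISIBLE_DEVICES"]


hide_gpus()
# Only import TensorFlow (or modules that import it) after this point, e.g.:
# import tensorflow as tf
```

Note that this only hides the devices. The CUDA libraries that `roi_pooling.so` links against still have to be installed for the op library to load.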

guotong1988 · 2017-04-01T02:41:54Z

I succeeded in running another Faster R-CNN implementation on CPU, from this repo

shinyke · 2017-04-07T09:17:14Z

@louisquinn I succeeded with your method. Thanks!

jhcruvinel · 2017-04-11T14:35:23Z

@louisquinn,
I would like to know how you managed to install and run the example you mentioned without a GPU.

jhcruvinel · 2017-04-11T14:37:22Z

@guotong1988, tf-faster-rcnn requires a GPU. How did you manage to install it without one?

jhcruvinel · 2017-04-17T17:18:04Z

@louisquinn, I was able to reproduce your script. It worked.

ghost · 2017-05-07T19:15:40Z

How did you manage to install it without a GPU?

jhcruvinel · 2017-05-08T17:12:50Z

I installed the CUDA driver, although the machine does not have the card. Then I set it to use CPU only. It worked!

liydxl · 2017-05-12T11:23:59Z

@louisquinn, hi, I added "import os" and "os.environ['CUDA_VISIBLE_DEVICES'] = ''" to "demo.py", "_init_paths.py" and "setup.py", but it does not seem to work. The error message is "RuntimeError: Invalid DISPLAY variable".
Which file should "os.environ['CUDA_VISIBLE_DEVICES'] = ''" be added to?
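
The "Invalid DISPLAY variable" error is typically raised by matplotlib when it picks a GUI backend on a machine without an X display (for example inside Docker or over SSH), so it is probably unrelated to `CUDA_VISIBLE_DEVICES`. A hedged sketch of one workaround, assuming the figures only need to be rendered off-screen: select the non-interactive `Agg` backend through matplotlib's `MPLBACKEND` environment variable before matplotlib is imported. `use_headless_backend` is a hypothetical helper.

```python
import os


def use_headless_backend():
    """Ask matplotlib for the off-screen Agg backend unless one is already chosen."""
    # MPLBACKEND is read when matplotlib is first imported.
    os.environ.setdefault("MPLBACKEND", "Agg")
    return os.environ["MPLBACKEND"]


use_headless_backend()
# import matplotlib.pyplot as plt   # import only after the call above
# plt.savefig("result.png")         # save to a file; nothing is shown on screen with Agg
```

With `Agg` no window opens, so results have to be written to a file instead of being displayed.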

sidak · 2017-07-06T16:05:35Z

The method of installing Cuda mentioned by @louisquinn works for me! Thanks! 😄

louisquinn · 2017-07-12T06:43:23Z

@xiaoqo
Apologies for the late reply!
You should add the line to "demo.py", however it MUST be before the Tensorflow session is created, so before line 112.

Also, you guys will be interested in this: https://github.com/tensorflow/models/tree/master/object_detection
Official API for deep learning object detection with various state of the art models and frameworks, no more VGG16! It's really easy to use. If you install Tensorflow for CPU it will run out of the box, however if you installed for GPU and wish to run CPU only, you will have to use the same method I mentioned in this thread.

sunzj · 2017-07-13T09:21:56Z

Hi,

I found the root cause of the issue. When building in CPU-only mode without CUDA installed, the library roi_pooling.so is still compiled with a reference to the function "ROIPoolBackwardLaucher". However, that function is implemented in the CUDA-related module and exists only for GPU. So when the demo runs, the implementation of ROIPoolBackwardLaucher cannot be found and the crash happens.

I prepared a patch for the issue and verified that the issue is gone after applying it.
When I tried to push the patch, I found that a patch for this already exists but has not been merged.

You can refer to:
0dcb55c

Or use my patch:
https://drive.google.com/file/d/0BxlQuWrSazOxd29PNjVIenZneHM/view?usp=sharing

Best wishes!
Zhuojin
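
To check a compiled library for the situation described above, the undefined GPU symbols can be picked out of `nm -gC` output like the one in the first comment. A rough sketch: `undefined_gpu_symbols` is a hypothetical helper that only parses text, and it assumes the usual `nm` layout in which undefined symbols are listed with type `U`.

```python
def undefined_gpu_symbols(nm_output):
    """Return demangled undefined symbols that mention Eigen::GpuDevice."""
    found = []
    for line in nm_output.splitlines():
        parts = line.split(None, 1)  # "<type-or-address> <rest of line>"
        # Undefined symbols have no address, so the first field is the type "U".
        if len(parts) == 2 and parts[0] == "U" and "GpuDevice" in parts[1]:
            found.append(parts[1].strip())
    return found


if __name__ == "__main__":
    # Shortened sample in the format printed by `nm -gC roi_pooling.so`.
    sample = (
        "                 U ROIPoolForwardLaucher(float const*, Eigen::GpuDevice const&)\n"
        "0000000000001a2b T some_defined_function(int)\n"
    )
    for symbol in undefined_gpu_symbols(sample):
        print("unresolved GPU symbol:", symbol)
```

If the list is non-empty and no CUDA object was linked into the library, loading it is expected to fail with the error reported in this thread.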

lfc87 · 2017-07-19T13:14:11Z

@louisquinn I did the following:

1. Installed CUDA.
2. Downloaded your .so file and placed it in Faster-RCNN_TF/lib/roi_pooling_layer.
3. Pasted these two lines into setup.py in Faster-RCNN_TF/lib:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = ''
4. Now I run make and receive an error:

`python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
rm -rf build
bash make.sh
/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1254): warning: calling a constexpr host function("real") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1254): warning: calling a constexpr host function("imag") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1254): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1259): warning: calling a constexpr host function("real") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1259): warning: calling a constexpr host function("imag") from a host device function("abs") is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/../../../Eigen/src/Core/MathFunctions.h(1259): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/src/Tensor/TensorRandom.h(133): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/src/Tensor/TensorRandom.h(138): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/src/Tensor/TensorRandom.h(212): warning: calling a constexpr host function from a host device function is not allowed. The experimental flag '--expt-relaxed-constexpr' can be used to allow this.

/home/liverpool/.local/lib/python3.5/site-packages/tensorflow/include/unsupported/Eigen/CXX11/src/Tensor/TensorRandom.h(217): warning: calling a constexpr host function from a host device function is not all`

And if I run sudo make, I receive the following:

`

python setup.py build_ext --inplace
running build_ext
skipping 'utils/bbox.c' Cython extension (up-to-date)
skipping 'utils/nms.c' Cython extension (up-to-date)
skipping 'nms/cpu_nms.c' Cython extension (up-to-date)
skipping 'nms/gpu_nms.cpp' Cython extension (up-to-date)
rm -rf build
bash make.sh
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named tensorflow
make.sh: line 13: nvcc: command not found
g++: error: GOOGLE_CUDA=1: No such file or directory
`

Can anyone help me with that?

Kind Regards
Igor
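
The `sudo make` log above shows two separate failures: `python` cannot import tensorflow, and `nvcc` is not on the `PATH`. This often happens because `sudo` runs with a different environment than the user's shell, though that is an assumption here. A small pre-flight sketch, with the hypothetical helper `check_build_prerequisites`, that can be run with the same interpreter and user that will run `make`:

```python
import importlib.util
import shutil


def check_build_prerequisites():
    """Report whether the tools the build relies on are visible."""
    return {
        # find_spec returns None when the module cannot be located.
        "tensorflow": importlib.util.find_spec("tensorflow") is not None,
        # shutil.which returns None when the executable is not on PATH.
        "nvcc": shutil.which("nvcc") is not None,
    }


if __name__ == "__main__":
    for name, found in check_build_prerequisites().items():
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

If either entry is missing under `sudo` but present without it, the difference is in the environment rather than in the repo.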

liuqi05 · 2017-09-01T02:55:15Z

@louisquinn, hi, I followed your advice and copied your roi_pooling.so file to my repo, and modified demo.py to add os.environ['CUDA_VISIBLE_DEVICES'] = ''. Then I ran the demo, but it displays:
Traceback (most recent call last):
  File "./tools/demo.py", line 11, in <module>
    from networks.factory import get_network
  File "/home/joseph/test/Faster-RCNN_TF/tools/../lib/networks/__init__.py", line 8, in <module>
    from .VGGnet_train import VGGnet_train
  File "/home/joseph/test/Faster-RCNN_TF/tools/../lib/networks/VGGnet_train.py", line 2, in <module>
    from networks.network import Network
  File "/home/joseph/test/Faster-RCNN_TF/tools/../lib/networks/network.py", line 3, in <module>
    import roi_pooling_layer.roi_pooling_op as roi_pool_op
  File "/home/joseph/test/Faster-RCNN_TF/tools/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in <module>
    _roi_pooling_module = tf.load_op_library(filename)
  File "/home/joseph/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 64, in load_op_library
    None, None, error_msg, error_code)
tensorflow.python.framework.errors_impl.NotFoundError: libcudart.so.8.0: cannot open shared object file: No such file or directory.
By the way, I am trying to run this on CPU and my computer has no GPU. Can you give me some advice about this error? Thank you in advance.

louisquinn · 2017-09-01T03:34:01Z

@liuqi05
It looks like you didn't install CUDA 8.0 and cuDNN 5.1.
For this method to work, you have to set your system up as though you do have a GPU.
Replacing the roi_pooling.so file only saves you from compiling it yourself.
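
Whether the CUDA runtime that the prebuilt `roi_pooling.so` links against can be found is easy to test before running the demo. A sketch using `ctypes` from the standard library: `can_load_library` is a hypothetical helper, and the library name is taken from the error message above.

```python
import ctypes


def can_load_library(name):
    """Return True if the dynamic loader can open the shared library `name`."""
    try:
        ctypes.CDLL(name)
        return True
    except OSError:
        # Raised when the library is missing or cannot be loaded.
        return False


if __name__ == "__main__":
    # Name taken from the NotFoundError reported in this thread.
    print("libcudart.so.8.0 loadable:", can_load_library("libcudart.so.8.0"))
```

If this reports that the library is not loadable, the op library will fail to load in the same way, regardless of `CUDA_VISIBLE_DEVICES`.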

I would like to refer you to the official Tensorflow Object Detection API:
https://github.com/tensorflow/models/tree/master/object_detection
All you need to do is add the os.environ['CUDA_VISIBLE_DEVICES'] = '' line to run on CPU with this framework.

liuqi05 · 2017-09-01T05:47:44Z

@louisquinn, thank you for your quick reply. But I would like to know which file I should add the os.environ['CUDA_VISIBLE_DEVICES'] = '' line to in order to run on CPU with the framework you suggest. The train.py and eval.py files?

louisquinn · 2017-09-01T05:55:06Z

For the official framework:
If you installed Tensorflow without GPU support and you don't have a GPU, it will automatically process on the CPU.

If you have a GPU and installed with GPU support, you will have to add the os.environ line.
If you add the os.environ line, it can go anywhere before you create your tf.Session.

liuqi05 · 2017-09-01T06:04:40Z

@louisquinn, now I understand. I do not need to add the line to any files, because I installed Tensorflow without GPU support. Thank you for your patience. I am now trying to run it locally step by step. If I run into a problem, I may need your help again. Thank you again.

louisquinn · 2017-09-01T06:06:56Z

@liuqi05
No worries! I recommend starting with one of the pre-trained models to learn how the framework works.
You can email me direct at louisquinn.contact@gmail.com

liuqi05 · 2017-09-01T06:31:14Z

@louisquinn, Thank you very much. I will send mail to you.

dongdongrj · 2017-09-04T02:24:18Z

Hi all, I would like to know whether the project can be run with anaconda3 and python3.6.
In my environment the error log reports the following:
ModuleNotFoundError: No module named 'easydict'
(tensorflow) dongdong@ubuntu:~/ai/tensorflow/Faster-RCNN_TF$ conda install -c https://conda.anaconda.org/auto easydict
Fetching package metadata .............
Solving package specifications: .

UnsatisfiableError: The following specifications were found to be in conflict:

easydict -> python 2.7* -> openssl 1.0.1*
python 3.6*
Use "conda info <package>" to see the dependencies for each package.

Thanks!

dongdongrj · 2017-09-05T08:43:36Z

@louisquinn
Hi, I would like to know whether the project can be run with anaconda3 and python3.6.
In my environment the error log reports the following:
ModuleNotFoundError: No module named 'easydict'
(tensorflow) dongdong@ubuntu:~/ai/tensorflow/Faster-RCNN_TF$ conda install -c https://conda.anaconda.org/auto easydict
Fetching package metadata .............
Solving package specifications: .

UnsatisfiableError: The following specifications were found to be in conflict:

easydict -> python 2.7* -> openssl 1.0.1*
python 3.6*
Use "conda info <package>" to see the dependencies for each package.
Thanks!

Nofcity · 2017-11-27T12:13:43Z

@jhcruvinel, I have no NVIDIA card, but I ran make.sh, compiled with CUDA, and installed the CUDA driver. When I run "python demo.py --cpu --model /Faster-RCNN_TF-master/input_model/VGGnet_fast_rcnn_iter_70000.ckpt", the result is:
Loaded network /Faster-RCNN_TF-master/input_model/VGGnet_fast_rcnn_iter_70000.ckpt
NVIDIA: no NVIDIA devices found
unknown error
So what should I do? Thanks!

chaochaow mentioned this issue Mar 1, 2017

undefined symbol in roi_pooling_layer #90
