Undefined symbol: _ZTIN10tensorflow8OpKernelE #108

GaryWooCN opened this issue Nov 7, 2017 · 20 comments

@GaryWooCN

Hi, I am running the master branch and encountered this error when training. Could anyone help with this? Thanks.

File "./faster_rcnn/../lib/roi_pooling_layer/roi_pooling_op.py", line 5, in
_roi_pooling_module = tf.load_op_library(filename)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/load_library.py", line 56, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename, status)
File "/usr/local/anaconda2/lib/python2.7/site-packages/tensorflow/python/framework/errors_impl.py", line 473, in exit
c_api.TF_GetCode(self.status.status))
tensorflow.python.framework.errors_impl.NotFoundError: ./faster_rcnn/../lib/roi_pooling_layer/roi_pooling.so: undefined symbol: _ZTIN10tensorflow8OpKernelE
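
For reference, the missing symbol demangles to the C++ typeinfo for tensorflow::OpKernel, which is normally provided by TensorFlow's own shared library. A rough way to inspect this on a typical Linux setup (the .so path follows the repo layout above; adjust as needed):

# demangle the symbol; prints "typeinfo for tensorflow::OpKernel"
c++filt _ZTIN10tensorflow8OpKernelE

# list the undefined TensorFlow symbols the custom op expects at load time
nm -D ./lib/roi_pooling_layer/roi_pooling.so | grep OpKernel

If the symbol stays unresolved, the .so was either built with a different C++ ABI than the installed TensorFlow or, on TF 1.4+, was not linked against libtensorflow_framework.so; both fixes are discussed below.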

The ./lib/make.sh is as follows:

#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
echo $TF_INC

CUDA_PATH=/usr/local/cuda-8.0/

cd roi_pooling_layer

nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_60

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
	roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# for gcc5-built tf
#g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=1 -o roi_pooling.so roi_pooling_op.cc \
#	roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

cd ..

# add building psroi_pooling layer
cd psroi_pooling_layer

nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_60

#g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
#	psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
	psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

cd ..

@freeksg66

tensorflow/tensorflow#13607
I followed that issue and it fixed the problem for me.

@Kongsea

Kongsea commented Dec 6, 2017

I encountered exactly the same error.
Have you solved it yet?

@Kongsea

Kongsea commented Dec 6, 2017

I downloaded roi_pooling.so from https://github.com/CharlesShang/TFFRCNN/blob/roi_pooling/lib/roi_pooling_layer/roi_pooling.so and replaced my compiled roi_pooling.so with it, as suggested by @CharlesShang.
That produced another error:
tensorflow.python.framework.errors_impl.NotFoundError: faster_rcnn/../lib/roi_pooling_layer/roi_pooling.so: invalid ELF header

@Kongsea

Kongsea commented Dec 6, 2017

I finally downgraded tensorflow from 1.4 to 1.3 and added -D_GLIBCXX_USE_CXX11_ABI=0, then this problem was solved.

@yh284914425

Where should I add -D_GLIBCXX_USE_CXX11_ABI=0?
I am using tensorflow_gpu-1.4.0-cp27-none-linux_x86_64.whl and my gcc version is 5.4.0. My ./lib/make.sh is as follows. How should the file be modified? Can you help me? Thanks.

#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
echo $TF_INC

CUDA_PATH=/usr/local/cuda/

cd roi_pooling_layer

nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
#g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
#	roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# for gcc5-built tf
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
	roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 \
	-lcudart -L $CUDA_PATH/lib64

cd ..

# add building psroi_pooling layer
cd psroi_pooling_layer

nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_52

g++ -std=c++11 -shared -o psroi_pooling.so psroi_pooling_op.cc \
	psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

# if you install tf using already-built binary, or gcc version 4.x, uncomment the two lines below
#g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
#	psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64

cd ..
@Kongsea

@yh284914425

I don't know which lines need to be commented out and which need to be modified. Please help me, @Kongsea.

@Kongsea

Kongsea commented Dec 11, 2017

Downgrade your tensorflow to r1.3.

Try to modify this line
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_PATH/lib64

to

g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS -D_GLIBCXX_USE_CXX11_ABI=0 -lcudart -L $CUDA_PATH/lib64

@selinachenxi

You don't have to downgrade to 1.3. I am using 1.4 with gcc 5.4.
In the make.sh file, add
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
at the beginning, then add
-L $TF_LIB -ltensorflow_framework
after -L $CUDA_PATH/lib64.
Re-run make; it works.
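
As a sketch, the roi_pooling link step from the scripts above would then look roughly like this (whether you also keep -D_GLIBCXX_USE_CXX11_ABI=0 depends on how your TensorFlow wheel was built; the full script posted further down combines both):

TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

# link the op against libtensorflow_framework.so, which ships with TF >= 1.4
g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
	roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 \
	-L $TF_LIB -ltensorflow_framework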

@zhangweilion

@selinachenxi
My tensorflow is 1.4 and gcc is 5.4.
I modified make.sh as below, and it doesn't work:

TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')

#adding by zw
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
#end adding by zw

CUDA_PATH=/usr/local/cuda/
CXXFLAGS=''

if [[ "$OSTYPE" =~ ^darwin ]]; then
	CXXFLAGS+='-undefined dynamic_lookup'
fi

cd roi_pooling_layer

if [ -d "$CUDA_PATH" ]; then
	nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
		-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $CXXFLAGS \
		-arch=sm_37

	g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
		roi_pooling_op.cu.o -I $TF_INC -D GOOGLE_CUDA=1 -fPIC $CXXFLAGS \
		-lcudart -L $TF_LIB -ltensorflow_framework -L $CUDA_PATH/lib64
else
	g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc \
		-I $TF_INC -fPIC $CXXFLAGS
fi

cd ..

@Kongsea

Kongsea commented Feb 22, 2018

This bash works:

#!/usr/bin/env bash
TF_INC=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())')
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
echo $TF_INC

CUDA_PATH=/usr/local/cuda/

cd roi_pooling_layer

nvcc -std=c++11 -c -o roi_pooling_op.cu.o roi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_61

g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o roi_pooling.so roi_pooling_op.cc \
	roi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework

cd ..

cd psroi_pooling_layer

nvcc -std=c++11 -c -o psroi_pooling_op.cu.o psroi_pooling_op_gpu.cu.cc \
	-I $TF_INC -D GOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -arch=sm_61

g++ -std=c++11 -shared -D_GLIBCXX_USE_CXX11_ABI=0 -o psroi_pooling.so psroi_pooling_op.cc \
	psroi_pooling_op.cu.o -I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 -L $TF_LIB -ltensorflow_framework

cd ..
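
A quick sanity check after rebuilding (run from the lib/ directory; paths follow the layout used in this thread):

# the framework library should now be recorded as a dependency
# (ldd may print "not found" for it outside Python, which is fine:
#  importing tensorflow loads it before the op is loaded)
ldd roi_pooling_layer/roi_pooling.so | grep tensorflow_framework

# the op should now load without the undefined-symbol error
python -c "import tensorflow as tf; tf.load_op_library('./roi_pooling_layer/roi_pooling.so'); print('ok')"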

@xmeng525

xmeng525 commented Feb 21, 2019

I had a similar problem because of a namespace issue. I changed my "new_op.cu.cc" from

namespace tensorflow{
// my code
}

to

using namespace tensorflow;
// my code

and it was fixed.

@vllsm

vllsm commented Mar 8, 2019

You don't have to downgrade to 1.3. I am using 1.4 with gcc 5.4.
In the make.sh file, add
TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')
at the beginning, then add
-L $TF_LIB -ltensorflow_framework
after -L $CUDA_PATH/lib64.
Re-run make; it works.

THX so much

@leavewave

I had a similar problem because of a namespace issue. I changed my "new_op.cu.cc" from

namespace tensorflow{
// my code
}

to

using namespace tensorflow;
// my code

and it was fixed.

Hi, where is this file? I cannot find it.

@helinwang

helinwang commented Mar 11, 2021

I ran into a similar issue. The problem was that I had manually compiled TF and then tried to load another TF operator library; the two *.so files were compiled with different ABIs.
The fix for me was compiling my custom TF with --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0"

E.g.,

bazel build --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" --config=v2 --copt=-mavx --copt=-msse4.2 //tensorflow/tools/pip_package:build_pip_package 
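
As a follow-up sketch: on newer 1.x releases (roughly 1.5 and later; older versions only expose get_include()/get_lib()), you can ask the installed TensorFlow which compile and link flags, including the ABI define, a custom op should be built with:

# typically prints something like: -I/.../include -D_GLIBCXX_USE_CXX11_ABI=0
python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))'

# typically prints something like: -L/.../tensorflow -ltensorflow_framework
python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))'

Building the op with exactly these flags avoids guessing the ABI by hand.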

@ArmageddonKnight

Adding this linking option works for me: -Wl,--no-as-needed

Reference: https://stackoverflow.com/questions/48189818/undefined-symbol-ztin10tensorflow8opkernele
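
As a sketch of where that option goes, assuming the -L $TF_LIB -ltensorflow_framework link approach discussed earlier in this thread (the flag is positional, so it must come before the library it should apply to):

g++ -std=c++11 -shared -o roi_pooling.so roi_pooling_op.cc roi_pooling_op.cu.o \
	-I $TF_INC -fPIC -lcudart -L $CUDA_PATH/lib64 \
	-Wl,--no-as-needed -L $TF_LIB -ltensorflow_framework

--no-as-needed tells the linker to keep the DT_NEEDED entry for libtensorflow_framework.so even when no symbol appears to be required at link time.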

@FeiDao7943

I avoided this issue by changing the versions of g++, gcc, TF, and CUDA.
This works on both Colab and physical machines.
You can try this environment; it may not seem entirely reasonable, but it is effective.

Ubuntu 18.04.5 LTS
tensorflow-gpu==1.13.1
numpy==1.16.0 (this might be the key)
gcc (Ubuntu 5.5.0-12ubuntu1) 5.5.0
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CUDA 10.0

And "-D_GLIBCXX_USE_CXX11_ABI=0" in the "tf_xxxx_complie.sh" should be deleted

@Brunda02

My current environment is
tensorflow-gpu==1.13.1
gcc==7.5.0
CUDA=10.0
I am getting the same error.
Can anyone suggest which environment I should use?

@FeiDao7943

@Brunda02:
I hope this list is useful for you, especially where it differs from your setup. By the way, this environment was only tested on Google Colab and my PC, so I am not sure it will work on other machines.

List:
Ubuntu 18.04.5 LTS
tensorflow-gpu==1.13.1
numpy==1.16.0 (this might be the key)
gcc (Ubuntu 5.5.0-12ubuntu1) 5.5.0
g++ (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
CUDA 10.0

And "-D_GLIBCXX_USE_CXX11_ABI=0" in the "tf_xxxx_complie.sh" should be deleted

@Brunda02

@FeiDao7943 what is tf_xxxx_compile.sh?

@FeiDao7943
Copy link

@Brunda02 There are 3 tf_xxxx_compile.sh files in total. In ./frustum-pointnets-master/models/tf_ops/ there are 3 folders, and each folder contains exactly one .sh file named tf_xxxx_compile.sh, where xxxx is the name of the folder.

And the "-D_GLIBCXX_USE_CXX11_ABI=0" in tf_xxxx_compile.sh should be deleted; if it does not exist, ignore it.
