tensorflow.python.framework.errors_impl.NotFoundError: /home/research/data/hdrnet/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE Backend TkAgg is interactive backend. Turning interactive mode on. #2

Open
qinghua2016 opened this issue Aug 7, 2017 · 17 comments

Comments

@qinghua2016

qinghua2016 commented Aug 7, 2017

When I run the command python train.py, the following error occurs:
tensorflow.python.framework.errors_impl.NotFoundError: /home/research/data/hdrnet/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE
Backend TkAgg is interactive backend. Turning interactive mode on.
My TensorFlow version is 1.1.0. Do you know why?

@shun1024

shun1024 commented Aug 7, 2017

@mgharbi I got a similar error here. It happens at

https://github.com/mgharbi/hdrnet/blob/master/hdrnet/hdrnet_ops.py#L27

Error Info:

tensorflow.python.framework.errors_impl.NotFoundError: hdrnet_ops.so: undefined symbol: _ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE

Environment:

ubuntu 14.04
tensorflow 1.1.0
cuda 8.0


I solved it by setting CFLAGS = -fPIC -I$(TF_INC) -O2 -D_GLIBCXX_USE_CXX11_ABI=0 in the Makefile.
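
The B5cxx11 fragment in the missing symbol is how GCC mangles the [abi:cxx11] tag, which suggests hdrnet_ops.so was compiled against the newer libstdc++ string ABI while the installed TensorFlow binary was not. The sketch below is a rough heuristic for spotting that marker in an error message; uses_cxx11_abi is a hypothetical helper name, not part of TensorFlow or hdrnet.

```python
# Heuristic sketch: does a mangled C++ symbol reference GCC's newer
# (C++11) libstdc++ ABI? uses_cxx11_abi is a hypothetical helper.

def uses_cxx11_abi(mangled_symbol):
    """Return True if the mangled name carries a cxx11 ABI marker."""
    # "B5cxx11" encodes the [abi:cxx11] tag on a function name;
    # "7__cxx11" encodes the std::__cxx11 inline namespace.
    return "B5cxx11" in mangled_symbol or "7__cxx11" in mangled_symbol


if __name__ == "__main__":
    symbol = "_ZN10tensorflow7strings6StrCatB5cxx11ERKNS0_8AlphaNumE"
    print(uses_cxx11_abi(symbol))
```

If the check is positive for the symbol in the error, rebuilding the op with -D_GLIBCXX_USE_CXX11_ABI=0 is the first thing to try. A symbol without either marker points to a different kind of build problem.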

@mgharbi
Owner

mgharbi commented Aug 21, 2017

I unfortunately could not reproduce this error on gcc-5.0. Does adding the -D_GLIBCXX_USE_CXX11_ABI=0 flag help? I'll add it to the Makefile.

@dxue2012

dxue2012 commented Aug 21, 2017

I encountered a similar error when using gcc-5.0 on Ubuntu 16.04. Adding the -D_GLIBCXX_USE_CXX11_ABI=0 flag did fix it for me. (I adapted this commit from @tcassou from his fork)

@tcassou

tcassou commented Aug 22, 2017

Hi all,

  • As pointed out by @dxue2012, I had the same issue on Ubuntu 16.04 with a version of gcc > 5.0.0, and adding the flag -D _GLIBCXX_USE_CXX11_ABI=0 solves it.
  • On OSX (tested with version 10.12.6), you have to replace this flag with -undefined dynamic_lookup.
  • I could run everything without trouble on CentOS 7 (except that CUDA was installed under a different path).
    Hope it helps!

@mgharbi
Owner

mgharbi commented Aug 22, 2017

Merci Thomas,
Feel free to pull-request your updates, otherwise I'll add those changes to the Makefile and close the issue.

@tcassou

tcassou commented Aug 22, 2017

Salut Michael,
I committed a few other small changes to my forked version of your repo (great work by the way!), so it's a bit less convenient to send a PR at this point...
Related to the Makefile, I ended up inserting some comments, since I'm always switching between different machines, and did not want to push it much further (OS dependent Makefile):

# Use flag -D _GLIBCXX_USE_CXX11_ABI=0 for gcc > 5
# Use flag -undefined dynamic_lookup for OSX
CFLAGS = -fPIC -I$(TF_INC) -O2
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart
# Use flag -D _GLIBCXX_USE_CXX11_ABI=0 for gcc > 5
NVFLAGS = -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC -I $(TF_INC) \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true

Thomas
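
The per-platform choice described in the Makefile comments can be sketched as a small selection function. This only encodes what was reported in this thread for a pip-installed TensorFlow 1.1; extra_build_flags and its parameters are hypothetical names, and other setups may need something different.

```python
# Sketch of the OS-dependent flag choice described in the comments above.
# extra_build_flags is a hypothetical helper, not part of hdrnet.
import platform


def extra_build_flags(system=None, gcc_major=None):
    """Return the extra flags suggested in this thread for a platform."""
    system = system or platform.system()
    if system == "Darwin":
        # OSX: let the linker resolve TensorFlow symbols at load time.
        return ["-undefined", "dynamic_lookup"]
    if system == "Linux" and gcc_major is not None and gcc_major >= 5:
        # gcc 5 and later default to the newer libstdc++ ABI.
        return ["-D_GLIBCXX_USE_CXX11_ABI=0"]
    return []


if __name__ == "__main__":
    print(" ".join(extra_build_flags("Linux", gcc_major=5)))
```

The returned list could be appended to CFLAGS and NVFLAGS by whatever wrapper invokes make.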

@Rachnog

Rachnog commented Aug 28, 2017

Hi all, I have the same problem, but adding the flag to the Makefile doesn't help and I still see the error. Does anyone know what the issue could be?

Ubuntu 16.04; Tensorflow 1.3 built with Bazel; CUDA 8.0

@tcassou

tcassou commented Aug 29, 2017

@Rachnog I installed tensorflow-gpu v1.1 directly with pip, and did not build it myself, that could explain the difference.

@tisawe

tisawe commented Aug 30, 2017

I installed tensorflow-gpu v1.0.1 and v1.3 with pip, and your solution does not work for me.

@22avinash

Hi,
Any update on this issue?
I am getting the same error after building with the Makefile in the hdrnet directory.
I am on Ubuntu 16.04 with gcc and g++ 4.8, CUDA 8.0, TensorFlow 1.3.
Setting -D_GLIBCXX_USE_CXX11_ABI=0 did not help me either.
Can anyone help me with this?

@cchen156

@22avinash @tisawe Did you solve the problem in the end?

@xxAna

xxAna commented Oct 25, 2018

Hello,
I am on Linux 9.4, tensorflow 1.1, cuda 8.0
Setting -D_GLIBCXX_USE_CXX11_ABI=0 did not help me either.
Did you solve the problem in the end? @mgharbi @22avinash @tisawe @cchen156

@dongrongliang

dongrongliang commented Jan 18, 2019

Fixed it by setting -D _GLIBCXX_USE_CXX11_ABI=1, replacing CC = c++ with CC = g++, and switching the system default gcc version to 4.8.

@anilsathyan7

Facing a similar issue when trying to freeze the pretrained models:

/hdrnet/lib/hdrnet_ops.so: undefined symbol: _Z37BilateralSliceApplyGradKernelLauncherRKN5Eigen9GpuDeviceEPKfPKxS4_S6_S4_S6_S4_bPfS7_S7_

Using gcc 4.8, Python 2.7, Ubuntu 16.04, TF 1.12.0, CUDA 9.0.
I also tried setting the flag to both -D _GLIBCXX_USE_CXX11_ABI=1 and -D _GLIBCXX_USE_CXX11_ABI=0.

Here is the Makefile:

TF_INC ?= `python -c 'import tensorflow as tf; print(tf.sysconfig.get_include())'`
TF_LIB ?= `python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())'`
# TF_INC ?= /usr/local/lib/python2.7/dist-packages/tensorflow/include
CUDA_HOME ?= /usr/local/cuda

SRC_DIR = ops

BUILD_DIR = build
LIB_DIR = lib

CC = g++ -std=c++11
NVCC = nvcc -std c++11
CFLAGS = -D_GLIBCXX_USE_CXX11_ABI=1 -I$(TF_INC)/external/nsync/public -L$(TF_LIB) -ltensorflow_framework -fPIC -I$(TF_INC)  
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart
NVFLAGS = -x cu -Xcompiler -fPIC -I$(TF_INC) -I$(SRC_DIR)\
					-gencode=arch=compute_30,code=\"sm_30,compute_30\" \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true --ptxas-options=-v -lineinfo


SRC = bilateral_slice.cc
CUDA_SRC = bilateral_slice.cu.cc
CUDA_OBJ = $(addprefix $(BUILD_DIR)/,$(CUDA_SRC:.cc=.o))
SRCS = $(addprefix $(SRC_DIR)/, $(SRC))

all: $(LIB_DIR)/hdrnet_ops.so

# Main library
$(LIB_DIR)/hdrnet_ops.so: $(CUDA_OBJ) $(LIB_DIR) $(SRCS)
	$(CC) -shared -o $@ $(SRCS) $(CUDA_OBJ) $(CFLAGS) $(LDFLAGS) 

# Cuda kernels
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.cc $(BUILD_DIR)
	$(NVCC) -c  $< -o $@ $(NVFLAGS)

$(BUILD_DIR):
	mkdir -p $@


$(LIB_DIR):
	mkdir -p $@

clean:
	rm -rf $(BUILD_DIR) $(LIB_DIR)


@SystemErrorWang

I encountered a similar problem when trying to import hdrnet_ops. The error message is below:
NotFoundError: /home/wangxinrui/Downloads/hdr_models/hdrnet-master/hdrnet/lib/hdrnet_ops.so: undefined symbol: _ZN10tensorflow12OpDefBuilder4AttrENSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
I am working on Ubuntu 16.04, Python 3.6.8, CUDA 9.0 with cuDNN 7.3.1, TensorFlow 1.12.0. I tried almost every method discussed above (and in other related issues) but still cannot build the op correctly. Any suggestions or help are appreciated.

@ColdCodeCool

@anilsathyan7 Hi, did you solve your problem? I ran into the same issue as you.

@stefanielinear

Custom ops are registered by linking against libtensorflow_framework.so in TensorFlow 1.4 and above.

So refactor the Makefile as follows:

TF_CFLAGS ?= `python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))'`
TF_LFLAGS ?= `python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))'`

# TF_INC ?= /usr/local/lib/python2.7/dist-packages/tensorflow/include
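# Replace with your CUDA home. The comment sits on its own line because
# make keeps the whitespace before a trailing comment in the variable's value.
CUDA_HOME ?= /usr/local/cuda-9.2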

SRC_DIR = ops

BUILD_DIR = build
LIB_DIR = lib

CC = c++ -std=c++11
NVCC = nvcc -std c++11
CFLAGS = -fPIC -O2 $(TF_CFLAGS)
LDFLAGS = -L$(CUDA_HOME)/lib64 -lcudart $(TF_LFLAGS)
NVFLAGS = -DGOOGLE_CUDA=1 -x cu -Xcompiler -fPIC $(TF_CFLAGS) \
					-expt-relaxed-constexpr -Wno-deprecated-gpu-targets -ftz=true


SRC = bilateral_slice.cc
CUDA_SRC = bilateral_slice.cu.cc
CUDA_OBJ = $(addprefix $(BUILD_DIR)/,$(CUDA_SRC:.cc=.o))
SRCS = $(addprefix $(SRC_DIR)/, $(SRC))

all: $(LIB_DIR)/hdrnet_ops.so

# Main library
$(LIB_DIR)/hdrnet_ops.so: $(CUDA_OBJ) $(LIB_DIR) $(SRCS)
	$(CC) -shared -o $@ $(SRCS) $(CUDA_OBJ) $(CFLAGS) $(LDFLAGS) 

# Cuda kernels
$(BUILD_DIR)/%.o: $(SRC_DIR)/%.cc $(BUILD_DIR)
	$(NVCC) -c  $< -o $@ $(NVFLAGS)

$(BUILD_DIR):
	mkdir -p $@


$(LIB_DIR):
	mkdir -p $@

clean:
	rm -rf $(BUILD_DIR) $(LIB_DIR)
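
To see which ABI setting the Makefile above ends up with, one can inspect the compile flags it pulls from TensorFlow. A minimal sketch, assuming the flags arrive as a list of strings; abi_from_compile_flags is a hypothetical helper and the sample flags are illustrative, not taken from a real installation.

```python
# Sketch: read the _GLIBCXX_USE_CXX11_ABI value out of a list of
# compiler flags. abi_from_compile_flags is a hypothetical helper.

def abi_from_compile_flags(flags):
    """Return 0 or 1 if the flags define the ABI macro, else None."""
    prefix = "-D_GLIBCXX_USE_CXX11_ABI="
    for flag in flags:
        if flag.startswith(prefix):
            return int(flag[len(prefix):])
    return None


if __name__ == "__main__":
    # Illustrative flags only; real values depend on the installation.
    sample = ["-I/path/to/tensorflow/include", "-D_GLIBCXX_USE_CXX11_ABI=0"]
    print(abi_from_compile_flags(sample))
```

With TensorFlow 1.4 or later installed, the list returned by tf.sysconfig.get_compile_flags() can be passed in directly. A result of None would mean the flags do not pin the ABI, in which case the compiler default applies.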
