This repository has been archived by the owner on Apr 30, 2020. It is now read-only.

load 'tvm_runtime_op.so' error #3

Closed
zouzhene opened this issue Oct 21, 2019 · 22 comments

Comments

@zouzhene

When I run
tvm_runtime_ops = load_library.load_op_library('tvm_runtime_op.so')
I get
tensorflow.python.framework.errors_impl.NotFoundError: tvm_runtime_op.so: undefined symbol: _ZN3tvm7runtime6Module12LoadFromFileERKSsS3_

@tobegit3hub
Owner

It seems to be a problem with your dynamic library file. How did you build the tvm_runtime_op.so file? @zouzhene

@junrushao

Doesn’t this look like an ABI issue?

@zouzhene
Author

@tobegit3hub
`#!/bin/bash
set -x
set -e
TF_CFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_compile_flags()))') )
TF_LFLAGS=( $(python -c 'import tensorflow as tf; print(" ".join(tf.sysconfig.get_link_flags()))') )

g++ -g -std=c++11 -shared tvm_runtime_kernels.cc tvm_runtime_ops.cc -o tvm_runtime_op.so -fPIC ${TF_CFLAGS[@]} ${TF_LFLAGS[@]} -O2 -I${TVM_HOME}/include -I${TVM_HOME}/3rdparty/dmlc-core/include -I${TVM_HOME}/3rdparty/dlpack/include -I/usr/local/cuda/include -ltvm_runtime -L${TVM_HOME}/build -ldl -lpthread
`

The build prints warnings like these:
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/logging.h:25:0,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/lib/core/refcount.h:22,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:21,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/allocator.h:24,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:23,
                 from tvm_runtime_kernels.cc:8:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/logging.h:77:0: warning: "LOG" redefined
 #define LOG(severity) _TF_LOG_##severity
 ^
In file included from /usr/tvm/3rdparty/dmlc-core/include/dmlc/io.h:15:0,
                 from /usr/tvm/include/tvm/runtime/module.h:29,
                 from tvm_runtime_kernels.cc:5:
/usr/tvm/3rdparty/dmlc-core/include/dmlc/./logging.h:257:0: note: this is the location of the previous definition
 #define LOG(severity) LOG_##severity.stream()
 ^
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/logging.h:25:0,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/lib/core/refcount.h:22,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:21,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/allocator.h:24,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:23,
                 from tvm_runtime_kernels.cc:8:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/logging.h:89:0: warning: "VLOG" redefined
 #define VLOG(lvl) \
 ^
In file included from /usr/tvm/3rdparty/dmlc-core/include/dmlc/io.h:15:0,
                 from /usr/tvm/include/tvm/runtime/module.h:29,
                 from tvm_runtime_kernels.cc:5:
/usr/tvm/3rdparty/dmlc-core/include/dmlc/./logging.h:255:0: note: this is the location of the previous definition
 #define VLOG(x) LOG_INFO.stream()
 ^
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/logging.h:25:0,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/lib/core/refcount.h:22,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:21,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/allocator.h:24,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:23,
                 from tvm_runtime_kernels.cc:8:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/logging.h:97:0: warning: "CHECK" redefined
 #define CHECK(condition) \
 ^
In file included from /usr/tvm/3rdparty/dmlc-core/include/dmlc/io.h:15:0,
                 from /usr/tvm/include/tvm/runtime/module.h:29,
                 from tvm_runtime_kernels.cc:5:
/usr/tvm/3rdparty/dmlc-core/include/dmlc/./logging.h:205:0: note: this is the location of the previous definition
 #define CHECK(x) \
 ^
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/logging.h:25:0,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/lib/core/refcount.h:22,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:21,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/allocator.h:24,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:23,
                 from tvm_runtime_kernels.cc:8:
/usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/default/logging.h:251:0: warning: "CHECK_EQ" redefined
 #define CHECK_EQ(val1, val2) CHECK_OP(Check_EQ, ==, val1, val2)
 ^
In file included from /usr/tvm/3rdparty/dmlc-core/include/dmlc/io.h:15:0,
                 from /usr/tvm/include/tvm/runtime/module.h:29,
                 from tvm_runtime_kernels.cc:5:
/usr/tvm/3rdparty/dmlc-core/include/dmlc/./logging.h:213:0: note: this is the location of the previous definition
 #define CHECK_EQ(x, y) CHECK_BINARY_OP(_EQ, ==, x, y)
 ^
In file included from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/logging.h:25:0,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/lib/core/refcount.h:22,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/platform/tensor_coding.h:21,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/resource_handle.h:19,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/allocator.h:24,
                 from /usr/local/lib/python3.5/dist-packages/tensorflow/include/tensorflow/core/framework/op_kernel.h:23,
                 from tvm_runtime_kernels.cc:8:

@zhuochenKIDD

@zouzhene
I guess you use pre-built tensorflow? What's your compiler version?

@zouzhene
Author

@zhuochenKIDD I use pre-built TensorFlow; the version is tensorflow-gpu==1.15.0.

@VoVAllen

VoVAllen commented Oct 30, 2019

I hit the same problem with TensorFlow 2.0.0:

tensorflow.python.framework.errors_impl.NotFoundError: tvm_runtime_op.so: undefined symbol: _ZN3tvm7runtime6Module12LoadFromFileERKSsS3_

@VoVAllen

VoVAllen commented Oct 30, 2019

It seems to be a linking error. Which .so should I link against to get Module::LoadFromFile?

@junrushao

For these issues, could you run `nm -g tvm_runtime_op.so` to see whether similar symbols exist? If they do, that would indicate a compiler ABI incompatibility.
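
As a rough illustration of what to look for in the `nm` output: with GCC's libstdc++, a mangled name that abbreviates `std::string` as `Ss` comes from the pre-C++11 string ABI, while the newer ABI spells the type out with a `__cxx11` namespace. The helper below is only a heuristic sketch (the function name `guess_libstdcxx_abi` is made up for this example, and a plain substring check can misfire on unusual identifiers), not a real demangler.

```python
def guess_libstdcxx_abi(mangled: str) -> str:
    """Heuristically classify which libstdc++ std::string ABI a mangled symbol uses."""
    if "__cxx11" in mangled:
        # New ABI: std::__cxx11::basic_string appears in the mangled name.
        return "cxx11"
    if "Ss" in mangled:
        # Old ABI: the Itanium mangling abbreviates std::string as "Ss".
        return "pre-cxx11"
    # No std::string in the signature, or not a libstdc++ symbol.
    return "unknown"


# The undefined symbol reported in this issue.
missing = "_ZN3tvm7runtime6Module12LoadFromFileERKSsS3_"
print(guess_libstdcxx_abi(missing))  # prints "pre-cxx11"
```

If the op library asks for the `pre-cxx11` form while the TVM runtime library only exports a `__cxx11` form of the same function, the two were most likely compiled with different `_GLIBCXX_USE_CXX11_ABI` settings.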

@VoVAllen

VoVAllen commented Oct 30, 2019

I removed the --shared flag and compilation failed, saying

undefined reference to `tvm::runtime::Module::LoadFromFile(std::string const&, std::string const&)'

@VoVAllen

image
It's not correctly linked

@VoVAllen

VoVAllen commented Oct 30, 2019

image
I found this. I used the pip install version of tf 2.0. Should I change it to 1?

@VoVAllen

Rebuilding TVM with cmake -DCMAKE_CXX_FLAGS=-D_GLIBCXX_USE_CXX11_ABI=0 .. solves the problem. However, there is now another error related to LLVM's C++11 ABI...

@zhuochenKIDD

@VoVAllen Perhaps you need to build llvm from source, using -D_GLIBCXX_USE_CXX11_ABI=0 : apache/tvm#521 (comment)

I guess @tobegit3hub uses a self-built TensorFlow compiled with a newer gcc (>= 5), so the ABI incompatibility did not occur.

If you use the pre-built TensorFlow from pip install, it is built with gcc 4.8.5, so check your environment's compiler version for ABI incompatibility.
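
One way to check this from Python is to look for the `_GLIBCXX_USE_CXX11_ABI` define in the flags returned by `tf.sysconfig.get_compile_flags()`, the same call the build script above uses. The sketch below is a minimal example under that assumption; the helper names `cxx11_abi_from_flags` and `tensorflow_cxx11_abi` are invented for illustration.

```python
import re

# Matches the libstdc++ dual-ABI define as it appears on a compiler command line.
_ABI_FLAG = re.compile(r"^-D_GLIBCXX_USE_CXX11_ABI=([01])$")


def cxx11_abi_from_flags(flags):
    """Return 0 or 1 if a -D_GLIBCXX_USE_CXX11_ABI flag is present, else None."""
    value = None
    for flag in flags:
        match = _ABI_FLAG.match(flag)
        if match:
            # Keep the last occurrence, since a later -D overrides an earlier one.
            value = int(match.group(1))
    return value


def tensorflow_cxx11_abi():
    """Query the installed TensorFlow; imported lazily so the parser works without it."""
    import tensorflow as tf

    return cxx11_abi_from_flags(tf.sysconfig.get_compile_flags())
```

If `tensorflow_cxx11_abi()` returns 0, the TVM runtime linked into the op would presumably need to be built with the same define, as in the cmake command quoted earlier in this thread.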

@VoVAllen

It's annoying. You have to rebuild either TensorFlow or LLVM/TVM to solve this, and both take a long time to build...

@junrushao

junrushao commented Nov 1, 2019

TVM can be viewed as two parts:

  1. libtvm.so: the TVM compiler, which does code generation and may require LLVM for host-side code generation. The generated binary has a stable pure C ABI and is guaranteed to be thread-safe.
  2. libtvm_runtime.so: the TVM runtime, which can load and run operators generated by the TVM compiler and does not depend on LLVM or anything else.

In particular, the TVM runtime can be used in header-only mode, which means it could potentially be built together with TensorFlow to avoid any ABI issues. In that case, I would say it could avoid having to rebuild LLVM.

@tobegit3hub
Owner

Sorry for the late response.

We have updated the code for the final Python API, and the build script has been updated to avoid the conflict caused by re-loading tvm_runtime. Could you try the latest code with the instructions in the README? @zhuochenKIDD @VoVAllen

@lsy643

lsy643 commented Jan 3, 2020

@tobegit3hub I have a similar issue here. In order to build correctly, I had to upgrade cmake to 3.16, build with C++14 rather than C++11, and add include_directories(/usr/local/cuda/include). I am also using a self-built TensorFlow, version 1.14.

When I run the addone example, I get this error: tvm_dso_op.so: undefined symbol: _ZTIN10tensorflow8OpKernelE.

@gmagogsfm

@lsy643

I am getting the same error:
tvm_dso_op.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Have you, by any chance, found a fix already? Thanks!

@lsy643

lsy643 commented Mar 8, 2020

@gmagogsfm

Yes, I rewrote CMakeLists.txt. I use target_link_libraries rather than target_link_options to link the TensorFlow library.

By the way, I suggest compiling the TVM runtime source files directly into your project rather than linking the prebuilt library.

@gmagogsfm

@lsy643

Thanks for the reply, Siyuan. If you use target_link_libraries, does that mean you are not getting the link flags from tf.sysconfig.get_link_flags? That seems a bit fragile if TF changes its link flags in the future.

Meanwhile, I did some investigation and found that target_link_options places -ltensorflow_framework in front of the compile targets in the g++ command, which effectively discards all symbols in tensorflow_framework.so*.

I fixed the issue by building tvm_dso_op.so using Bazel, which handles library dependency ordering correctly.
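
The ordering problem described above can be sketched as follows. With GNU ld, a `-l` library listed before the objects that need it may not be used to resolve their symbols (notably when `--as-needed` is in effect, which some distributions enable by default), so the link flags should come after the sources. The helper `gxx_shared_command` is hypothetical and only builds an argument list; it assumes the flags come from `tf.sysconfig.get_compile_flags()` and `tf.sysconfig.get_link_flags()` as in the build script earlier in this thread.

```python
def gxx_shared_command(sources, output, compile_flags, link_flags):
    """Build a g++ argv for a shared library with link flags placed last.

    Putting libraries after the sources lets a left-to-right linker see the
    undefined symbols before it decides whether each library is needed.
    """
    return [
        "g++", "-shared", "-fPIC",
        *compile_flags,
        *sources,
        "-o", output,
        *link_flags,  # e.g. -L<dir> and -ltensorflow_framework go here, after the sources
    ]


# Example values only; real flags would come from tf.sysconfig.
cmd = gxx_shared_command(
    sources=["tvm_runtime_kernels.cc", "tvm_runtime_ops.cc"],
    output="tvm_dso_op.so",
    compile_flags=["-std=c++11", "-O2"],
    link_flags=["-ltensorflow_framework"],
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)`. This keeps the flags sourced from TensorFlow itself while still controlling where they land on the command line.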

@lsy643

lsy643 commented Mar 8, 2020

@gmagogsfm
Yes, you are right about the extra work needed if target_link_libraries is used. Good to know that you build your project with Bazel.

@tobegit3hub
Owner

Thanks, everyone, for trying this out. The patch has now been merged into TVM upstream, and you can build TVM with USE_TF_TVMDSOOP=ON to use it.

This issue will be closed, and the code will be maintained only in the TVM codebase.

7 participants