Building custom op instructions out of date #13607

Closed
yaroslavvb opened this issue Oct 10, 2017 · 24 comments
Labels: type:build/install (Build and install issues)

@yaroslavvb
Contributor

yaroslavvb commented Oct 10, 2017

I followed the instructions here:
https://www.tensorflow.org/extend/adding_an_op

to try to rebuild this op.

First I ran into an issue with the nsync headers, which I fixed by following
#12482 (comment)

Then, while trying to load the .so file, I ran into:
tensorflow.python.framework.errors_impl.NotFoundError: ./max_align_bytes_op.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

So the definition for tensorflow::OpKernel is missing.

tf commit: 22a886b
cc @allenlavoie
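
For readers unfamiliar with mangled names, the symbol in the error above can be decoded by hand. The following is a minimal sketch, not a full demangler: it only handles the simple "special prefix plus nested name" shape seen in this thread, and the helper name `decode_simple_symbol` is made up for illustration. The `c++filt` tool does the same job properly.

```python
import re

# Special-name prefixes from the Itanium C++ ABI for compiler-generated symbols.
SPECIAL_PREFIXES = {
    "_ZTI": "typeinfo for ",
    "_ZTV": "vtable for ",
    "_ZTS": "typeinfo name for ",
}


def decode_simple_symbol(symbol):
    """Decode symbols shaped like <prefix>N<len><name><len><name>...E.

    Hypothetical helper: anything more complex (templates, function
    signatures, substitutions) is rejected rather than guessed at.
    """
    prefix = symbol[:4]
    if prefix not in SPECIAL_PREFIXES:
        raise ValueError("unsupported symbol: " + symbol)
    body = symbol[4:]
    if not (body.startswith("N") and body.endswith("E")):
        raise ValueError("only plain nested names are handled: " + symbol)
    body = body[1:-1]  # strip the N ... E wrapper around the nested name

    parts = []
    pos = 0
    while pos < len(body):
        match = re.match(r"\d+", body[pos:])
        if match is None:
            raise ValueError("expected a length prefix in: " + symbol)
        length = int(match.group())
        start = pos + match.end()
        if start + length > len(body):
            raise ValueError("truncated name in: " + symbol)
        parts.append(body[start:start + length])
        pos = start + length
    return SPECIAL_PREFIXES[prefix] + "::".join(parts)


print(decode_simple_symbol("_ZTIN10tensorflow8OpKernelE"))
```

The missing piece is therefore the typeinfo object for tensorflow::OpKernel, which has to come from whichever shared library defines that class.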

@allenlavoie
Member

Sorry about that. The "adding an op" docs are updated at head (and will go live once 1.4 is released), but right now there's a bit of a mismatch. The important bits:

TF_LIB=$(python -c 'import tensorflow as tf; print(tf.sysconfig.get_lib())')

And add to the compiler flags:
-L$TF_LIB -ltensorflow_framework

It is also possible to build with --config=monolithic if you want to load custom ops with RTLD_GLOBAL in Python (should work just as it did pre-1.4).
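
As a rough illustration of how the flags above fit into a full compiler invocation, here is a small sketch that assembles a g++ command line. The only parts taken from this thread are `-L$TF_LIB -ltensorflow_framework`. The remaining flags (`-std=c++11`, `-shared`, `-fPIC`, `-O2`) are ordinary shared-library flags assumed here, and the function name, file names, and paths are placeholders.

```python
import shlex


def build_compile_command(source, output, tf_inc, tf_lib, extra_flags=()):
    """Return an argv list for compiling a custom op (hypothetical helper).

    tf_inc and tf_lib are expected to be the TensorFlow include and
    library directories for the installed version.
    """
    return [
        "g++", "-std=c++11", "-shared", source, "-o", output, "-fPIC",
        "-I" + tf_inc,
        # The two flags recommended in the comment above:
        "-L" + tf_lib, "-ltensorflow_framework",
        "-O2",
    ] + list(extra_flags)


# Placeholder paths; substitute the values reported by your own install.
command = build_compile_command(
    "my_op.cc", "my_op.so",
    tf_inc="/path/to/site-packages/tensorflow/include",
    tf_lib="/path/to/site-packages/tensorflow",
)
print(" ".join(shlex.quote(arg) for arg in command))
```

In practice the two directories would come from `tf.sysconfig.get_include()` and `tf.sysconfig.get_lib()`. Extra include paths, such as the nsync fix mentioned earlier in the thread, can be passed through `extra_flags`.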

@allenlavoie
Member

@MarkDaoust maybe we should add a little warning on some of these docs that they're versioned, with a link to head?

MarkDaoust self-assigned this Oct 10, 2017
@yaroslavvb
Contributor Author

Thanks, that fixed it! It looks like the desired alignment in the latest TF is 32.
There used to be a time when the TF docs had a pull-down that let you select the docs version, including "Master".

@twmht

twmht commented Nov 21, 2017

@allenlavoie

It did not work.

There is no libtensorflow_framework.so under $TF_LIB.

My $TF_LIB is /home/tumh/.pyenv/versions/tensorflow/lib/python2.7/site-packages/tensorflow/core

@JuiHsuan-Kuo

I added both TF_INC and TF_LIB but still cannot fix it.
Does anyone have the same problem?

@allenlavoie
Member

@twmht that library path comes from a version before TensorFlow 1.4; maybe you have a couple different Python installations with different TensorFlow versions?

@JuiHsuan-Kuo which TensorFlow version, and what's the error? Note that the updated documentation for 1.4+ is at https://www.tensorflow.org/versions/master/extend/adding_an_op
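
A quick way to check which situation applies is to look for the framework library in the directory being passed to `-L`. This is a minimal sketch with a made-up helper name. The demo uses a temporary directory in place of a real install so that it runs anywhere.

```python
import glob
import os
import tempfile


def find_framework_library(lib_dir):
    """Return paths in lib_dir that look like libtensorflow_framework (hypothetical helper)."""
    pattern = os.path.join(lib_dir, "libtensorflow_framework*")
    return sorted(glob.glob(pattern))


# Demo against a throwaway directory standing in for $TF_LIB.
with tempfile.TemporaryDirectory() as lib_dir:
    print(find_framework_library(lib_dir))  # nothing there yet
    open(os.path.join(lib_dir, "libtensorflow_framework.so"), "w").close()
    print([os.path.basename(p) for p in find_framework_library(lib_dir)])
```

Against a real install, `lib_dir` would be the value printed by `tf.sysconfig.get_lib()`. An empty result suggests the directory belongs to a TensorFlow version that does not ship the separate framework library.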

@bearrito

Are there relevant/accurate instructions for 1.3.0? I am seeing the same issue as @twmht with lack of tensorflow_framework from my 1.3.0 install.

@bearrito

If I do upgrade to 1.4.0 (which my codebase cannot support), I am able to build the shared object and avoid the missing OpKernel symbol, but then I'm faced with:

ImportError: libtensorflow_framework.so: cannot open shared object file: No such file or directory

@allenlavoie
Member

@bearrito
https://www.tensorflow.org/versions/r1.3/extend/adding_an_op are the 1.3 instructions. https://www.tensorflow.org/versions/master/extend/adding_an_op are for master.

What's the context for that ImportError? Importing TensorFlow?
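
An error of that form comes from the dynamic loader failing to locate libtensorflow_framework.so at run time. One possible workaround, which is not confirmed anywhere in this thread, is to start the Python process with the TensorFlow library directory on the loader search path. The sketch below only prepares the environment for a child process. The helper name and the paths are placeholders.

```python
import os


def env_with_library_dir(lib_dir, base_env=None, var="LD_LIBRARY_PATH"):
    """Return a copy of the environment with lib_dir prepended to var.

    Hypothetical helper. On macOS the variable would be
    DYLD_LIBRARY_PATH instead of LD_LIBRARY_PATH.
    """
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get(var, "")
    env[var] = lib_dir + (os.pathsep + existing if existing else "")
    return env


# Placeholder directory; use the output of tf.sysconfig.get_lib() instead.
env = env_with_library_dir("/path/to/site-packages/tensorflow", base_env={})
print(env["LD_LIBRARY_PATH"])
```

The resulting mapping can be passed as `env=` to `subprocess.run`. Changing `os.environ` inside an already running interpreter is generally too late, because the loader reads the variable when the process starts.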

@gbolcer

gbolcer commented Dec 21, 2017

After messing around w/ the DYLD and LD lib paths, I ended up just brute forcing it....

bazel build --config=cuda //tensorflow/tools/lib_package:libtensorflow
cp bazel-bin/tensorflow/tools/lib_package/libtensorflow.tar.gz /opt/tensorlibs
tar xvfmz libtensorflow.tar.gz

And then just used linkopt

bazel build -c opt --config=cuda //dragnn/core:gen_dragnn_bulk_ops_py_wrappers_cc --verbose_failures --linkopt=/opt/tensorlibs/lib/libtensorflow_framework.so

@panfengli

What about build instructions for GPU ops? The TensorFlow tutorials never even mention them. @allenlavoie

I found one here, but it runs into the same problem.

longouyang added a commit to longouyang/fast-LayerNorm-TF that referenced this issue Nov 9, 2018
flags for building ops changed in october 2017: tensorflow/tensorflow#13607

i got the new flags from https://www.tensorflow.org/guide/extend/op#compile_the_op_using_your_system_compiler_tensorflow_binary_installation

compiling gives lots of warnings about alignments being off, but all tests pass
@kasrayazdani

I'm using tf.14. I get the same error when running a custom op: undefined symbol: _ZTIN10tensorflow8OpKernelE

When I compile using this flag: -L$TF_LIB -ltensorflow_framework

I get this error:
/usr/bin/ld: cannot find -ltensorflow_framework.
Can someone help, please?

@vaibkumr

I am having the exact same issue

@kasrayazdani

I resolved the issue after spending days scouring the internet for how to compile a custom TF op. Here are the steps I had to take, in case they help others:

  • downgrade your python version to 3.6
  • downgrade your tf to 1.10.0
  • use CUDA 9

@MarkDaoust
Member

@kasrayazdani, sorry I missed your first post. That does not sound like fun.

There appears to be a new repository for this:

https://github.com/tensorflow/custom-op

I will see if I can delete the old doc and just leave a pointer to this new repo.

MarkDaoust reopened this Aug 20, 2019
@Saduf2019
Contributor

@yaroslavvb, please let us know if the comments above help resolve the issue.

@HH721

HH721 commented Dec 26, 2021

TF : 1.13.2 (build from source) / cuda 10.0 , cudnn7.4.24.2 , gpu-driver-460
OS : ubuntu 18.4
gcc : 7.5.0
cmake : 3.22.1
python 3.6.9

My TensorFlow build command:

bazel build --config=cuda --cxxopt="-D_GLIBCXX_USE_CXX11_ABI=0" //tensorflow:libtensorflow_cc.so //tensorflow/tools/pip_package:build_pip_package

I also built a custom op of my own, but while trying to load the .so file I run into:

_fast_oll = tf.load_op_library('obj_loader/cmake-build/libobj_loader.so')

File "/home/shyan/testtensorflow/azmoon/lib/python3.6/site-packages/tensorflow/python/framework/load_library.py", line 61, in load_op_library
lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: obj_loader/cmake-build/libobj_loader.so: undefined symbol: _ZTIN10tensorflow8OpKernelE

Can anyone help me?
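
For CMake-based builds like this one, it may help to compare the project's settings with the flags TensorFlow itself reports. Later 1.x releases expose `tf.sysconfig.get_compile_flags()` and `tf.sysconfig.get_link_flags()`. The sketch below assumes those are available and simply sorts their output into the categories a CMake target needs. The helper name and the sample flag values are illustrative, and nothing here is confirmed as the fix for this particular report.

```python
def split_tf_flags(compile_flags, link_flags):
    """Group compiler and linker flags by kind (hypothetical helper)."""
    groups = {
        "include_dirs": [],
        "definitions": [],
        "library_dirs": [],
        "libraries": [],
        "other": [],
    }
    for flag in list(compile_flags) + list(link_flags):
        if flag.startswith("-I"):
            groups["include_dirs"].append(flag[2:])
        elif flag.startswith("-D"):
            groups["definitions"].append(flag[2:])
        elif flag.startswith("-L"):
            groups["library_dirs"].append(flag[2:])
        elif flag.startswith("-l"):
            groups["libraries"].append(flag[2:])
        else:
            groups["other"].append(flag)
    return groups


# Sample values shaped like typical output; real values come from tf.sysconfig.
groups = split_tf_flags(
    ["-I/path/to/tensorflow/include", "-D_GLIBCXX_USE_CXX11_ABI=0"],
    ["-L/path/to/tensorflow", "-ltensorflow_framework"],
)
for kind, values in groups.items():
    print(kind, values)
```

If `tensorflow_framework` does not appear among the libraries the custom op is linked against, that would be consistent with the undefined OpKernel symbol discussed at the top of this thread.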

mohantym self-assigned this May 11, 2022
@mohantym
Contributor

mohantym commented May 13, 2022

Hi @yaroslavvb! Have you checked the latest documentation on registering a custom op, from version 2.8, yet? Thanks!

mohantym added the stat:awaiting response label May 13, 2022
@google-ml-butler

This issue has been automatically marked as stale because it has no recent activity. It will be closed if no further activity occurs. Thank you.

google-ml-butler bot added the stale label May 20, 2022
@google-ml-butler

Closing as stale. Please reopen if you'd like to work on this further.

@MarkDaoust
Member

@anirudh161 has been working on this with some of our engineers.

@mohantym
Contributor

Ok @MarkDaoust ! Reopening this issue then. Thank you!

mohantym reopened this May 30, 2022
mohantym added the type:build/install label and removed the stale and stat:awaiting response labels May 30, 2022
mohantym removed their assignment May 30, 2022
@anirudh161
Contributor

Marking this issue as resolved as there has been no activity on it. Please open a new issue if you're facing issues with creating custom ops. Thanks!

