
no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool' #318

Closed · jorgemf opened this issue Feb 10, 2017 · 16 comments
Labels: type:performance Performance Issue

@jorgemf commented Feb 10, 2017

I am trying to compile tensorflow_model_server from master.

Error:

ERROR: no such target '@org_tensorflow//third_party/gpus/crosstool:crosstool': target 'crosstool' not declared in package 'third_party/gpus/crosstool' defined by /home/username/.cache/bazel/_bazel_jorge/2fd988219920b10e9ede8d3b5720f3d2/external/org_tensorflow/third_party/gpus/crosstool/BUILD.

Steps to reproduce:

  1. Configure tensorflow with CUDA support:
cd tensorflow
./configure
cd ..
  2. Build //tensorflow_serving/model_servers:tensorflow_model_server:
bazel build -c opt --config=cuda --genrule_strategy=standalone --spawn_strategy=standalone --verbose_failures //tensorflow_serving/model_servers:tensorflow_model_server

As a side note, when I compile tensorflow_model_server from an external project, it works but lacks GPU support.

The solutions from #225 don't work.

EDITED: I finally wrote this script to compile it with CUDA support: https://gist.github.com/jorgemf/0f2025a45e1568663f4c20551a5881f1

@mutasem-mattar

I am facing the same problem.

@jorgemf (Author) commented Mar 1, 2017

I was able to make it compile. Here is a script that does it: https://gist.github.com/jorgemf/0f2025a45e1568663f4c20551a5881f1

You only need to set the variables and exports to the values you want, and everything works.

It works because:

  1. TensorFlow Serving needs to see the values of the configuration variables; ./configure doesn't export them, so they are not visible when compiling TensorFlow Serving
  2. in serving/tools/bazel.rc you have to replace @org_tensorflow//third_party/gpus/crosstool with @local_config_cuda//crosstool:toolchain
  3. If your gcc version is >5 it won't work, so I set gcc-5 in serving/tensorflow/third_party/gpus/cuda_configure.bzl when it is available
  4. Finally I used -c opt --config=cuda --spawn_strategy=standalone as options to compile //tensorflow_serving/model_servers:tensorflow_model_server, but it should work for other targets
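Step 2 above can be sketched as a one-line sed rewrite. The snippet below demonstrates it on a throwaway sample file rather than the real serving/tools/bazel.rc; the single `build:cuda` line is an assumed stand-in for the actual file content.

```shell
# Demonstrate the crosstool swap on a sample file; the real edit targets
# serving/tools/bazel.rc. The line below is an assumed stand-in.
sample=$(mktemp)
printf 'build:cuda --crosstool_top=@org_tensorflow//third_party/gpus/crosstool\n' > "$sample"

# Swap the old crosstool label for the CUDA-aware toolchain label.
# '|' is used as the sed delimiter so the '/'s in the labels need no escaping.
sed -i 's|@org_tensorflow//third_party/gpus/crosstool|@local_config_cuda//crosstool:toolchain|g' "$sample"

cat "$sample"
# prints: build:cuda --crosstool_top=@local_config_cuda//crosstool:toolchain
rm -f "$sample"
```

The same substitution appears later in this thread as a one-liner against tools/bazel.rc directly.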

@tianyapiaozi (Contributor)

@jorgemf's script works for me.

@tsaiian commented Mar 15, 2017

@jorgemf I compiled successfully with your script, but it doesn't seem to have GPU support.

After adding with tf.device("/gpu") to mnist_saved_model.py, I got the following error message:

tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device to node 'Variable_1': Could not satisfy explicit device specification '/device:GPU:*' because no devices matching that specification are registered in this process; available devices: /job:localhost/replica:0/task:0/cpu:0

Steps to reproduce:

git clone --recurse-submodules https://github.com/tensorflow/serving
cd serving

export TF_NEED_CUDA=1
export TF_NEED_GCP=1
export TF_NEED_JEMALLOC=1
export TF_NEED_HDFS=0
export TF_NEED_OPENCL=0
export TF_ENABLE_XLA=0
export TF_CUDA_VERSION=8.0
export TF_CUDNN_VERSION=5
export TF_CUDA_COMPUTE_CAPABILITIES="3.5,5.2,6.1"
export CUDA_TOOLKIT_PATH="/usr/local/cuda"
export CUDNN_INSTALL_PATH="/usr/local/cuda"
export GCC_HOST_COMPILER_PATH="/usr/bin/gcc"
export PYTHON_BIN_PATH="/home/opt/anaconda/envs/py2/bin/python"
export CC_OPT_FLAGS="-march=native"
export PYTHON_LIB_PATH="/home/opt/anaconda/envs/py2/lib/python2.7/site-packages"

cd tensorflow
./configure
cd ..

# Ref: https://github.com/tensorflow/serving/issues/318#issuecomment-283498443
sed -i.bak 's/@org_tensorflow\/\/third_party\/gpus\/crosstool/@local_config_cuda\/\/crosstool:toolchain/g' tools/bazel.rc

bazel build -c opt --config=cuda --spawn_strategy=standalone //tensorflow_serving/model_servers:tensorflow_model_server

# add `with tf.device("/gpu")` to `mnist_saved_model.py`
sed -i '138s/.*/with tf.device("\/gpu"):/' tensorflow_serving/example/mnist_saved_model.py
sed -i '139s/.*/  if __name__ == "__main__":/' tensorflow_serving/example/mnist_saved_model.py
sed -i '140s/.*/    tf.app.run()/' tensorflow_serving/example/mnist_saved_model.py

bazel build //tensorflow_serving/example:mnist_saved_model
bazel-bin/tensorflow_serving/example/mnist_saved_model /tmp/mnist_model

@jorgemf (Author) commented Mar 15, 2017

with tf.device("/gpu") is not a device; it should be with tf.device("/gpu:0"). To check whether the server has been compiled with GPU support, just run nvidia-smi once the program is launched: if it has GPU support, it will allocate all the GPU memory even before creating any graph.

@vtablan commented Mar 16, 2017

Thanks @jorgemf for the compile script. It worked for me too, and it was a lot simpler than my solution at #349 :). However, it doesn't seem to be using the GPU for me either. tensorflow_model_server does not appear as a process listed by nvidia-smi.

My saved model does not explicitly request GPU allocation, but it should use the GPU by default, if available. And, as you say, tf-serving should allocate most GPU RAM on launch, and it clearly doesn't.
This used to work with an older version of serving, so I guess it's related to recent changes.

@jorgemf (Author) commented Mar 16, 2017

@vtablan have you set $TENSORFLOW_SERVING_REPO_PATH? Otherwise it might not work.

I have just tested and it doesn't compile; I am not sure whether it is my script's fault or due to some internal change. Anyway, I cannot review the script for every commit. Here is the error:

ERROR: 417f6219aa9e6aa8dd92c15ce8c78038/external/tf_serving/tensorflow_serving/batching/BUILD:141:1: C++ compilation of rule '@tf_serving//tensorflow_serving/batching:batching_session' failed: crosstool_wrapper_driver_is_not_gcc failed: error executing command external/local_config_cuda/crosstool/clang/bin/crosstool_wrapper_driver_is_not_gcc -U_FORTIFY_SOURCE '-D_FORTIFY_SOURCE=1' -fstack-protector -fPIE -Wall -Wunused-but-set-parameter ... (remaining 174 argument(s) skipped): com.google.devtools.build.lib.shell.BadExitStatusException: Process exited with status 1.

Try a version from 15 days ago, the same one I used to compile it. It should work. In my experience, TensorFlow Serving is under active development and frequently broken.

@vtablan commented Mar 17, 2017

Interesting... I had cloned my repository just before posting my previous comment, and it compiled fine for me with your script. I had edited your script to hardcode the location of the repository, the use of python3, and the associated python path. Other than that, I made no changes to your script.

What I was saying above is that the tensorflow_model_server binary produced by the compilation does not seem to be using the GPU. I can work with that (for my model, CPU is fast enough at application time), so I don't think I'll spend more time on this. If I find a few minutes, I might try compiling the r0.5.1 version, which is meant to be a 'release', and see if that fares any better.

I'll post an update if that's successful.

@vtablan commented Mar 17, 2017

@jorgemf Success - I now have a tensorflow_model_server that does indeed use the GPU. To get there I used:

  • tag 0.5.1 for tf-serving,
  • tag v1.0.1 for the tensorflow sourcetree contained therein,
  • your compile script (slightly modified to use my local paths)

Thanks again for providing the script!

@jorgemf (Author) commented Mar 30, 2017

Master now compiles for me with GPU support. Closing the issue.

@sailor88128

@jorgemf I wonder whether the script works with the new version; below is the error:

compile_tensorflow_serving.sh: 35: compile_tensorflow_serving.sh: function: not found
/usr/local/lib/python2.7/dist-packages,/usr/lib/python2.7/dist-packages
compile_tensorflow_serving.sh: 63: compile_tensorflow_serving.sh: Syntax error: "}" unexpected

@jorgemf (Author) commented Dec 11, 2017

@sailor88128 It might be that you are using a different shell. It works on Linux only.

@sailor88128

I just used it in nvidia-docker Ubuntu 16. I use cuDNN 6.0.21, and I changed 7.0 to 6.0 in the script; is that the problem?

@jorgemf (Author) commented Dec 12, 2017

@sailor88128 Yes, it is. The script is very specific because otherwise it doesn't work. You have to use the versions in the script and the correct bazel version, which I don't remember now. Otherwise it won't compile.

@sailor88128

Oh, got it. Thanks a lot.
Is there any image (Dockerfile) you used, with Ubuntu 16 + tensorflow-gpu + CUDA 8.0 + TF Serving? @jorgemf

@jorgemf (Author) commented Dec 12, 2017

@sailor88128, no. I used my local machine. You can try the official images of tensorflow: https://hub.docker.com/r/tensorflow/tensorflow/tags/

@peddybeats peddybeats added the type:performance Performance Issue label Nov 18, 2019
7 participants