
Building TF serving from source on Jetson Xavier #1277

Closed

ewirbel opened this issue Mar 14, 2019 · 9 comments

@ewirbel

ewirbel commented Mar 14, 2019

Feature Request

Describe the problem the feature is intended to solve

I am trying to get TF Serving 1.13 with GPU support (the server-side API) running on a Jetson AGX Xavier board. I have managed to use the TensorFlow pip wheel provided by NVIDIA, and then to install the client-side Python package, but I still need the model server (to run remote inferences on the board).
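For reference, the client-side setup described above looks roughly like this (a sketch; the NVIDIA extra-index URL varies by JetPack release, so treat that part as an assumption and check NVIDIA's Jetson docs):

# NVIDIA's TensorFlow wheel for Jetson (index URL depends on your JetPack version)
pip3 install --extra-index-url https://developer.download.nvidia.com/compute/redist/jp/v42 tensorflow-gpu
# Client-side API only -- this does NOT ship the tensorflow_model_server binary
pip3 install tensorflow-serving-api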

Describe the solution

Provide Docker images for aarch64 with GPU support, or provide a toolchain for aarch64.

Describe alternatives you've considered

I have unsuccessfully tried to build TensorFlow Serving from source:

  • the docker-ce client is not available for aarch64, so I cannot run the Docker installation (and I cannot find official Docker images for the board from NVIDIA)

  • I tried to replicate what is in the Dockerfile by installing Bazel, cloning the serving GitHub repository, and running the same bazel command as the Dockerfile (a condensed sketch of these steps follows the build log below). My Bazel version is the following:

bazel version
WARNING: The following rc files are no longer being read, please transfer their contents or import their path into one of the standard rc files:
.bazelrc
WARNING: --batch mode is deprecated. Please instead explicitly shut down your Bazel server using the command "bazel shutdown".
INFO: Invocation ID: 7aaba226-9820-41a1-90d8-685da07742f5
Build label: 0.20.0- (@non-git)
Build target: bazel-out/aarch64-opt/bin/src/main/java/com/google/devtools/build/lib/bazel/BazelServer_deploy.jar
Build time: Wed Mar 13 14:49:38 2019 (1552488578)
Build timestamp: 1552488578
Build timestamp as int: 1552488578

When running the bazel build command, I get the following error:

bazel build --verbose_failures -c opt --config=cuda --config=nativeopt --copt="-fPIC" tensorflow_serving/model_servers:tensorflow_model_server
INFO: Invocation ID: 584c76e9-26c5-4440-9927-338e2424fbf8
ERROR: No toolchain found for cpu 'aarch64'. Valid toolchains are:
[local_linux: --cpu='local' --compiler='compiler',
local_darwin: --cpu='darwin' --compiler='compiler',
local_windows: --cpu='x64_windows' --compiler='msvc-cl',]
INFO: Elapsed time: 0.322s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
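A condensed sketch of the steps above (the r1.13 branch name is an assumption about the serving repo's release-branch naming; everything else matches this report):

# Bazel 0.20.0 built for aarch64, as shown above
git clone -b r1.13 https://github.com/tensorflow/serving.git
cd serving
bazel build --verbose_failures -c opt --config=cuda --config=nativeopt \
    --copt="-fPIC" tensorflow_serving/model_servers:tensorflow_model_server
# Fails as above: the bundled crosstool only defines local_linux, local_darwin
# and local_windows, so there is no toolchain entry for cpu 'aarch64'.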

Additional context

I managed to build TF Serving 1.12 with GPU support using Bazel 0.15.2.

@netfs
Collaborator

netfs commented Apr 30, 2019

TF Serving does not have support for the aarch64 architecture (in the BUILD system, and I suspect the code might need changes too).

TF (core), AFAIK, has aarch64 support for the Lite ecosystem:
https://www.tensorflow.org/lite/guide/build_arm64

Happy to accept patches to add aarch64 support to the TF code base.

@netfs
Collaborator

netfs commented Apr 30, 2019

There is also TF SIG Build; to see what others are doing regarding aarch64 builds, try asking on their mailing list.

@helmut-hoffer-von-ankershoffen

TensorFlow Serving builds quite nicely on Jetson devices nowadays - have a look at https://github.com/helmuthva/jetson/tree/master/workflow/deploy/tensorflow-serving-base/src or https://github.com/helmuthva/jetson for the bigger picture of this project.

@helmut-hoffer-von-ankershoffen

helmut-hoffer-von-ankershoffen commented Sep 10, 2019

Docker images to get TensorFlow Serving up and running on Jetson Nano and Jetson AGX Xavier devices are now published on DockerHub - see https://hub.docker.com/u/helmuthva

To allow GPU access from inside the container, the following devices have to be mounted when running the container (a sample docker run invocation follows this list):

  • /dev/nvhost-ctrl
  • /dev/nvhost-ctrl-gpu
  • /dev/nvhost-prof-gpu
  • /dev/nvmap
  • /dev/nvhost-gpu
  • /dev/nvhost-as-gpu

@deaffella

> Docker images to get TensorFlow Serving up and running on Jetson Nano and Jetson AGX Xavier devices are now published on DockerHub - see https://hub.docker.com/u/helmuthva […]
(quoting @helmut-hoffer-von-ankershoffen above)

Hi! I want to use TensorFlow Serving on my Jetson TX2.
I've successfully pulled the Docker image and tried to create a new container with all of these devices mounted. When the container starts, the RAM fills to more than 90% and I get messages in the container logs about a lack of RAM.
The query execution time fluctuates around 5 seconds, which is a lot for me. When using TensorFlow Serving on a weaker computer without a GPU, I get a runtime of about 0.5-1 second.
What am I doing wrong? Please help me.
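One common mitigation on Jetson boards, offered as a guess rather than a confirmed fix for this report: CPU and GPU share the same physical RAM on Jetson, and by default TF grabs most of the GPU memory at startup; tensorflow_model_server exposes a flag to cap this. The values below are illustrative:

tensorflow_model_server --port=8500 --rest_api_port=8501 \
  --model_name=mymodel --model_base_path=/models/mymodel \
  --per_process_gpu_memory_fraction=0.4  # cap TF's GPU memory arena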

@omartin2010

omartin2010 commented Mar 8, 2020

Did you figure this out, @deaffella? I'm planning to do that if I can, to make it simpler to serve a model on my TX2. Basically, the issue I have is that the images Helmut refers to above are too large for my device (which already has some things on it, and the images use 6+ GB). Trying to build with bazel, I get this output:

RUN bazel build     --color=yes     --curses=yes     --jobs="${JOBS}"     --verbose_failures     --output_filter=DONT_MATCH_ANYTHING     --config=cuda     --config=nativeopt     --config=jetson     --copt="-fPIC"     tensorflow_serving/model_servers:tensorflow_model_server &&     cp /tensorflow-serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/tensorflow_model_server
 ---> Running in c271cf5b58c4
Extracting Bazel installation...
Starting local Bazel server and connecting to it...
ERROR: error loading package '': Encountered error while reading extension file 'third_party/toolchains/preconfig/generate/archives.bzl': no such package '@org_tensorflow//third_party/toolchains/preconfig/generate': type 'repository_ctx' has no method patch()
ERROR: error loading package '': Encountered error while reading extension file 'third_party/toolchains/preconfig/generate/archives.bzl': no such package '@org_tensorflow//third_party/toolchains/preconfig/generate': type 'repository_ctx' has no method patch()
INFO: Elapsed time: 33.719s
INFO: 0 processes.
FAILED: Build did NOT complete successfully (0 packages loaded)
The command '/bin/sh -c bazel build     --color=yes     --curses=yes     --jobs="${JOBS}"     --verbose_failures     --output_filter=DONT_MATCH_ANYTHING     --config=cuda     --config=nativeopt     --config=jetson     --copt="-fPIC"     tensorflow_serving/model_servers:tensorflow_model_server &&     cp /tensorflow-serving/bazel-bin/tensorflow_serving/model_servers/tensorflow_model_server /usr/local/bin/tensorflow_model_server' returned a non-zero code: 1

Should I open a new issue? Not sure what to look at here.
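One likely cause, offered as a hypothesis rather than a confirmed diagnosis: "type 'repository_ctx' has no method patch()" is the signature of a Bazel older than the repository_ctx.patch() API used by this TF revision's workspace rules, i.e. a Bazel/TF version mismatch. The devel Dockerfiles in the serving tree pin the Bazel version each release was tested with, so a quick check looks like:

# From the tensorflow/serving source tree
grep -R "BAZEL_VERSION" tensorflow_serving/tools/docker/
# Then install/bootstrap exactly that Bazel version (on aarch64 at the time,
# this typically meant building Bazel itself from source).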

@littlepai

> Did you figure this out, @deaffella? I'm planning to do that if I can, to make it simpler to serve a model on my TX2. […]
(quoting @omartin2010 above)

Should I open a new issue ? Not sure what to take a look at, here.

We are seeing the same thing. Any new developments?

sanatmpa1 self-assigned this Dec 17, 2021
@sanatmpa1

@ewirbel,

Can you take a look at this link, which contains a Docker image of TF Serving for Jetson Xavier, and let us know if it helps? Thanks!

@sanatmpa1

@ewirbel,

Closing this issue due to lack of recent activity. Please feel free to reopen the issue with more details if the problem still persists. Thanks!
