Issues with backends #822

Open
Spartee opened this issue Jul 23, 2021 · 1 comment

Spartee commented Jul 23, 2021

Hello, as the RedisAI team knows, CrayLabs uses RedisAI within SmartSim. We currently have a few issues that we figured would best be grouped together so that they can be tracked as an "epic" of sorts.

For RedisAI 1.2.3

General

TensorFlow CMake

  • Currently, if TensorFlow is not downloaded by the get_deps.sh script, the findTensorFlow.cmake file is used to determine its location. This file is out of date: newer TensorFlow versions should be fine for RAI 1.2.x, but using it throws errors.

ONNX

Backends

  • In general, RedisAI seems to rely on Docker images to build the backends. Because of the environments these containers have (OS, system libraries, etc.), the backends are built against newer versions of GLIBC and other dependencies, which makes them unusable on older OSes even though the backend libraries themselves (libtorch, TensorFlow, etc.) are readily supported there. Others seem to be running into these issues as well; see "GLIBC_2.27 not found on Xenial GPU Docker image" #724.
  • For example, we currently do not use the Torch backend for this reason. Instead, we pip install the torch library and use the shared libraries that come with it. This works fine: we set Torch_DIR, bypass torch in the get_deps script, and still invoke torch in the build.
  • I've documented this for ONNX in the ticket "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826.
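
The Torch workaround described above can be sketched roughly as follows. This is a minimal sketch, not RedisAI's or SmartSim's actual build code: it assumes a pip-installed torch wheel ships its CMake package config under `share/cmake/Torch` inside the package directory, and the helper name `find_torch_dir` is hypothetical.

```python
import importlib.util
from pathlib import Path
from typing import Optional


def find_torch_dir(package: str = "torch") -> Optional[Path]:
    """Return the directory holding TorchConfig.cmake for a pip-installed
    torch, or None if the package or the config file cannot be found."""
    # find_spec locates the package on sys.path without importing it.
    try:
        spec = importlib.util.find_spec(package)
    except (ImportError, ValueError):
        return None
    if spec is None or not spec.submodule_search_locations:
        return None
    for location in spec.submodule_search_locations:
        # Assumed layout: <package>/share/cmake/Torch/TorchConfig.cmake
        candidate = Path(location) / "share" / "cmake" / "Torch"
        if (candidate / "TorchConfig.cmake").is_file():
            return candidate
    return None


if __name__ == "__main__":
    torch_dir = find_torch_dir()
    if torch_dir is None:
        print("torch not found; install it with pip first")
    else:
        # This is the value to hand to the build as Torch_DIR.
        print(f"Torch_DIR={torch_dir}")
```

The printed path is what would be passed as Torch_DIR. How the get_deps script is told to skip torch is not shown here.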

Key Point: In general, I think it would be best if the CMake and build setup let the user pass environment variables that specify the locations of backends they have already built. Currently:

  • For PyTorch this works (pass Torch_DIR).
  • For TensorFlow this does not (see above).
  • For ONNX this does not, because of the vendored ONNX version.
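
One way to picture the requested behavior is the sketch below. It is hypothetical throughout: only `Torch_DIR` comes from this issue, while `TensorFlow_DIR`, `ONNXRuntime_DIR`, and the `deps/` fallback layout are made-up names for illustration, not options RedisAI reads.

```python
import os
from pathlib import Path
from typing import Dict, Mapping

# Torch_DIR is mentioned in this issue; the other two variable names are
# hypothetical placeholders for the requested behavior.
BACKEND_ENV_VARS = {
    "torch": "Torch_DIR",
    "tensorflow": "TensorFlow_DIR",    # hypothetical
    "onnxruntime": "ONNXRuntime_DIR",  # hypothetical
}


def resolve_backends(env: Mapping[str, str], deps_root: Path) -> Dict[str, Path]:
    """Prefer a user-supplied location for each backend; otherwise fall
    back to a (hypothetical) directory of downloaded dependencies."""
    resolved = {}
    for backend, var in BACKEND_ENV_VARS.items():
        user_path = env.get(var)
        if user_path:
            # The user already built or installed this backend.
            resolved[backend] = Path(user_path)
        else:
            resolved[backend] = deps_root / backend
    return resolved


if __name__ == "__main__":
    for name, path in resolve_backends(os.environ, Path("deps")).items():
        print(f"{name}: {path}")
```

The point of the design is that a user-provided path always wins, so a backend built on the target system is never silently replaced by a downloaded binary.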

This would also help users who would like to try to get RedisAI built for AMD GPUs.

More clarity about the roadmap, in terms of expected dates and versions, would be much appreciated as well (e.g. #591).

Spartee commented Jan 11, 2022

An update on this issue for anyone following:

  1. macOS support has been completely removed in 1.2.5.
  2. The outdated TensorFlow.cmake was removed in 1.2.5 (yay!).
  3. GCC 10 work is still in progress; see "remove duplicate symbols for real gcc10 compatibility" #825.
  4. RAI 1.2.4 is broken on all Clang-based compilers.
  5. Older Ubuntu builds seem to have been dropped, so there seems to be no hope for "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826.
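
For anyone checking whether a host is affected by the GLIBC issues referenced in this thread, a rough check might look like the sketch below. It only compares the host's reported glibc version against a required one. The value 2.27 is taken from the title of #724, and the helper names are hypothetical.

```python
import platform
from typing import Tuple


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '2.27' into (2, 27)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def glibc_satisfies(host: str, required: str) -> bool:
    """True if the host glibc version is at least the required one.
    Tuples compare numerically, so 2.9 sorts before 2.27."""
    return parse_version(host) >= parse_version(required)


if __name__ == "__main__":
    # libc_ver() returns ('', '') when it cannot identify the C library.
    lib, version = platform.libc_ver()
    if lib != "glibc" or not version:
        print("could not detect glibc on this host")
    else:
        ok = glibc_satisfies(version, "2.27")
        print(f"glibc {version}: {'ok' if ok else 'too old'} for GLIBC_2.27")
```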
