Issues with backends #822

Spartee · 2021-07-23T22:53:25Z

Hello, as the RedisAI team knows, CrayLabs uses RedisAI within SmartSim. We currently have a few issues that we figured would best be grouped together so that they can be tracked as an "epic" of sorts.

For RedisAI 1.2.3

General

RAI doesn't build with GCC 10: "Build error with gcc 10.3.0" #777
A workaround from @DvirDukhan is in place here: "remove duplicate symbols for real gcc10 compatibility" #825

TensorFlow Cmake

Currently, if TensorFlow is not downloaded by the get_deps.sh script, the findTensorFlow.cmake file is used to determine the location of TensorFlow. This file is out of date. For RAI 1.2.x, newer TensorFlow versions should be OK, but the file throws errors.

ONNX

Because RedisAI vendors ONNX, it is impossible to build RedisAI with the standard ONNX libraries. This brings a few headaches:
- Anyone looking to compile on anything other than Ubuntu latest is forced to download the vendored version and its dependencies, build it, and then manually build RedisAI. There are also no instructions for this that we are aware of.
- The vendored version, to our knowledge, doesn't compile on OSX (I see this is being worked on now in "onnxruntime 1.7.2 build and documentation" #785).
- At the moment, the get_deps script for ONNX on OSX points to a dead link: "Unable to build RedisAI from Source - bash get_deps.sh" #743
- The release notes state that ONNX 1.6 is supported, but the get_deps script for OSX seems to point to 1.7.1 (which doesn't exist). Which version is actually supported?
- When are the standard ONNX shared libraries going to work with RedisAI again? Do y'all plan on maintaining a fork of ONNX forever?
  - Answer from RAI: Not forever. In the coming weeks RedisAI will revert to upstream ONNX, because "Support plugging in custom user-defined allocators for sharing between sessions" microsoft/onnxruntime#8059 has been merged.
- GLIBC issues when compiling for GPU on SUSE Linux 15.2: "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826
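For anyone trying to confirm a GLIBC mismatch like the one above before loading a backend, here is a rough diagnostic sketch. It is not part of RedisAI; the function names are made up for illustration. It scans the raw bytes of a shared library for `GLIBC_x.y` version tags, which is only a heuristic (a tool such as `objdump -T` reads the real symbol version table), and compares the newest tag against a glibc version you supply.

```python
import re

# Matches versioned glibc symbol tags such as b"GLIBC_2.27" embedded in a binary.
# Tags without digits (for example GLIBC_PRIVATE) are deliberately not matched.
_GLIBC_TAG = re.compile(rb"GLIBC_(\d+)\.(\d+)(?:\.(\d+))?")


def max_glibc_required(blob: bytes):
    """Return the highest GLIBC_x.y[.z] tag found in the bytes, or None."""
    versions = [
        tuple(int(part) for part in match.groups() if part is not None)
        for match in _GLIBC_TAG.finditer(blob)
    ]
    return max(versions, default=None)


def is_loadable(blob: bytes, system_glibc: tuple) -> bool:
    """True if system_glibc is at least as new as the newest tag in the blob."""
    required = max_glibc_required(blob)
    return required is None or required <= system_glibc
```

In practice the bytes would come from something like `Path(lib_path).read_bytes()` on the backend's `.so` file, and the system version can be read with `platform.libc_ver()` and converted to a tuple of integers.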

Backends

In general, RedisAI seems to rely on Docker images to build the backends. Because of the environments (OS, system libraries, etc.) in these Docker containers, the backends are built against newer versions of GLIBC and other dependencies, which makes them unusable on older OSes even though the backend libraries themselves (libtorch, tensorflow, etc.) are readily supported there. Others seem to be running into these issues as well: "GLIBC_2.27 not found on Xenial GPU Docker image" #724
For example, we currently do not use the prebuilt Torch backend for this reason. Instead, we pip install the torch library and use the shared libraries that come with it. This works fine: we set Torch_DIR, bypass torch in the get_deps script, and still enable torch in the build.
I've documented this for ONNX in "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826.
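As a rough illustration of the Torch workaround, the sketch below finds the directory to use as Torch_DIR inside a pip-installed torch package. It assumes the usual wheel layout, where `TorchConfig.cmake` sits under `share/cmake/Torch` in the package directory. The helper names are hypothetical and this is not how SmartSim or RedisAI necessarily does it.

```python
import importlib.util
from pathlib import Path


def torch_cmake_dir(package_root: Path) -> Path:
    """Return the directory holding TorchConfig.cmake inside a torch package."""
    candidate = package_root / "share" / "cmake" / "Torch"
    if not (candidate / "TorchConfig.cmake").is_file():
        raise FileNotFoundError(f"TorchConfig.cmake not found under {candidate}")
    return candidate


def installed_torch_root() -> Path:
    """Locate the installed torch package directory without importing torch."""
    spec = importlib.util.find_spec("torch")
    if spec is None or spec.origin is None:
        raise ModuleNotFoundError("torch is not installed in this environment")
    # spec.origin points at torch/__init__.py, so its parent is the package root.
    return Path(spec.origin).parent
```

With torch installed, `torch_cmake_dir(installed_torch_root())` gives a path that can be exported as Torch_DIR before running the RedisAI build.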

Key Point: In general, I think it would be best if the cmake and build setup allowed the user to pass environment variables that specify the locations of the backends they have already built.

- For PyTorch this works (pass Torch_DIR)
- For TensorFlow this does not (see above)
- For ONNX this does not because of the vendored ONNX version.

This would also help users who would like to try to get RedisAI built for AMD GPUs
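To make the request concrete, here is a minimal sketch of what a build wrapper could do: read backend locations from the environment and turn them into CMake cache arguments. Only Torch_DIR comes from this issue. `TENSORFLOW_ROOT` and `ONNXRUNTIME_ROOT` are hypothetical placeholder names, and RedisAI's CMake is not known to honor them today.

```python
# Environment variables a user might set to point at prebuilt backends.
# Torch_DIR is the one that works today; the other two names are
# hypothetical placeholders used only for illustration.
BACKEND_VARS = ("Torch_DIR", "TENSORFLOW_ROOT", "ONNXRUNTIME_ROOT")


def backend_cmake_args(env):
    """Build -DNAME=VALUE arguments for every backend variable set in env."""
    args = []
    for name in BACKEND_VARS:
        value = env.get(name)
        if value:  # skip variables that are unset or empty
            args.append(f"-D{name}={value}")
    return args
```

Calling `backend_cmake_args(os.environ)` would then give the extra arguments to append to the `cmake` command line, leaving any backend without a user-supplied location to the default get_deps path.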

More clarity about the roadmap in terms of expected dates and versions would be much appreciated as well, e.g. #591


Spartee · 2022-01-11T22:54:31Z

An update on this issue for anyone following:

- MacOS support has been completely removed in 1.2.5
- Outdated TensorFlow.cmake removed in 1.2.5 (yay!)
- GCC 10 work is still in progress: "remove duplicate symbols for real gcc10 compatibility" #825
- RAI 1.2.4 is broken on all clang-based compilers
- Older Ubuntu builds seem to have been ditched, so it seems there's no hope for "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826

Spartee mentioned this issue on Jul 27, 2021 in "lib64/libm.so GLIBC issue with ONNX GPU backend on Linux" #826 (Open)
