lib64/libm.so GLIBC issue with ONNX GPU backend on Linux #826

Spartee · 2021-07-27T01:04:32Z

Describe the bug
The pre-built onnx backend provided by RedisAI expects that GLIBC_2.27 is available on the system. Many systems, especially in High Performance Computing (HPC), do not have this.

To Reproduce
Steps to reproduce the behavior:

GIT_LFS_SKIP_SMUDGE=1 git clone --recursive https://github.com/RedisAI/RedisAI.git --branch v1.2.3 --depth=1
CC=gcc CXX=g++ WITH_PT=0 WITH_TF=0 WITH_TFLITE=0 WITH_ORT=1 bash get_deps.sh gpu
CC=gcc CXX=g++ GPU=1 WITH_PT=0 WITH_TF=0 WITH_TFLITE=0 WITH_ORT=1 WITH_UNIT_TESTS=0 make -C opt clean build
Start RedisAI and set/run any ONNX model.

Or just run ldd on redisai_onnxruntime.so

and you get:

(tf-test) [spartee@horizon 17:44:07 redisai_onnxruntime]$ ldd redisai_onnxruntime.so
./redisai_onnxruntime.so: /lib64/libm.so.6: version `GLIBC_2.27' not found (required by /lus/cls01029/spartee/poseidon/backend-test/smartsim/lib/backends/redisai_onnxruntime/./lib/libonnxruntime.so.1.7.1)

Looking at libm on our systems, it seems like we are laughably close (1 minor version away):

(tf-test) [spartee@horizon 19:49:38 on_wlm]$ strings /lib64/libm.so.6 | grep GLIBC
GLIBC_2.2.5
GLIBC_2.4
GLIBC_2.15
GLIBC_2.18
GLIBC_2.23
GLIBC_2.24
GLIBC_2.25
GLIBC_2.26
GLIBC_PRIVATE
GLIBC_2.15
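The comparison above can be made less eyeball-driven with a small helper. The following is a minimal sketch, not something RedisAI ships: `max_glibc` is a hypothetical function name, and it only looks at the major.minor part of each `GLIBC_x.y` tag it finds on stdin.

```shell
# Hypothetical helper (not part of RedisAI): print the highest GLIBC_x.y
# version tag found on stdin, comparing major and minor numerically so
# that 2.15 sorts above 2.4. A third component (as in 2.2.5) is ignored.
max_glibc() {
    awk '
        {
            line = $0
            # Walk through every GLIBC_<major>.<minor> tag on the line.
            while (match(line, /GLIBC_[0-9]+\.[0-9]+/)) {
                tag = substr(line, RSTART + 6, RLENGTH - 6)
                split(tag, parts, ".")
                major = parts[1] + 0
                minor = parts[2] + 0
                if (!found || major > best_major ||
                    (major == best_major && minor > best_minor)) {
                    best_major = major
                    best_minor = minor
                    found = 1
                }
                line = substr(line, RSTART + RLENGTH)
            }
        }
        END { if (found) printf "%d.%d\n", best_major, best_minor }
    '
}

# Example usage (paths taken from the report above; adjust for your system):
#   strings /lib64/libm.so.6 | max_glibc
#   objdump -T lib/libonnxruntime.so.1.7.1 | max_glibc
```

Feeding it the `strings` output of the system libm gives the newest version the system provides, while feeding it the `objdump -T` dynamic symbol table of a backend library gives the newest version that library asks for.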

But the odd thing is... the tensorflow shared library, when compiled for GPU, does not have the same problem...

# ldd on tensorflow
libm.so.6 => /lib64/libm.so.6 (0x00007fccb629a000)

I'm guessing this is because tensorflow is the one y'all are directly downloading from vendor? (i.e. Google)

Expected (wanted?) behavior
Ideally RedisAI could build and audit the shared libraries the backends depend on to ensure that they will work on systems without such requirements. My guess is that the GPU builds for the backends use some specific Docker container that has extra goodies for ease of use that are not actually needed. @chayim is this the case?
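To make the audit idea concrete, here is a rough sketch of what such a check might look like. It is an assumption about one possible approach, not RedisAI's build process: `version_le` and `audit_backends` are hypothetical names, the baseline version is whatever the oldest target system provides, and it relies on `objdump -T` from binutils plus a `grep` that supports `-o`.

```shell
# Hypothetical helper: succeed if version $1 (e.g. 2.27) is <= version $2.
# Only the major and minor components are compared.
version_le() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        split(a, x, ".")
        split(b, y, ".")
        if (x[1] + 0 != y[1] + 0) {
            exit !(x[1] + 0 < y[1] + 0)
        }
        exit !(x[2] + 0 <= y[2] + 0)
    }'
}

# Hypothetical audit: report every shared object directly under directory $1
# that references a GLIBC version newer than the baseline $2 (e.g. 2.26).
# Returns non-zero if any library exceeds the baseline.
audit_backends() {
    dir=$1
    baseline=$2
    status=0
    for lib in "$dir"/*.so*; do
        [ -f "$lib" ] || continue
        # Distinct GLIBC version tags in the dynamic symbol table.
        for v in $(objdump -T "$lib" 2>/dev/null |
                   grep -o 'GLIBC_[0-9][0-9.]*' | sort -u); do
            if ! version_le "${v#GLIBC_}" "$baseline"; then
                echo "$lib needs $v (baseline is GLIBC_$baseline)"
                status=1
            fi
        done
    done
    return $status
}

# Example usage (directory name is illustrative):
#   audit_backends redisai_onnxruntime/lib 2.26
```

Run as a post-build step with the baseline set to the oldest glibc that should be supported, a check like this would have flagged libonnxruntime.so.1.7.1 before release.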

I realize that #785 is currently being worked on, but this particular problem is a big issue for us, and we have also seen a similar problem with PyTorch, which is why we switched to compiling our own PyTorch (see #822).

Environment (please complete the following information):

OS: Suse Linux
Version [e.g. 1.2.2]: 15.2
Platform [e.g. x86, Jetson, ARM]: Intel x86
Runtime [e.g. CPU, CUDA]: CUDA 11.2 (tested 11.3 as well)

Spartee mentioned this issue Jul 27, 2021

Issues with backends #822

Open
