ImportError: .../jaxlib/xla_extension.so: symbol cudnnSetCTCLossDescriptorEx version libcudnn.so.7 not defined in file libcudnn.so.7 with link time reference #2494

jacobjinkelly · 2020-03-24T02:29:32Z

I followed instructions on the README for installing with the following versions

PYTHON_VERSION=cp36  # alternatives: cp36, cp37, cp38
CUDA_VERSION=cuda101  # alternatives: cuda92, cuda100, cuda101, cuda102
PLATFORM=linux_x86_64  # alternatives: linux_x86_64
BASE_URL='https://storage.googleapis.com/jax-releases'
pip install --upgrade $BASE_URL/$CUDA_VERSION/jaxlib-0.1.42-$PYTHON_VERSION-none-$PLATFORM.whl

pip install --upgrade jax

I set the following environment variables

export LD_LIBRARY_PATH=/pkgs/cuda-10.1/lib64:/pkgs/cudnn-10.0-v7.4.2/lib64:$LD_LIBRARY_PATH
export XLA_FLAGS=--xla_gpu_cuda_data_dir=/pkgs/cuda-10.1/

I get the error message listed in the title as soon as I import jax.
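Since the error names libcudnn.so.7, it can help to see which cuDNN files the directories on LD_LIBRARY_PATH actually contain. The following is only a minimal sketch: `find_cudnn_libs` is a hypothetical helper (not part of jax or jaxlib), and it only lists matching files in search order; it does not reproduce the full behaviour of the dynamic loader.

```python
import glob
import os


def find_cudnn_libs(ld_library_path=None):
    """List libcudnn.so* files in each LD_LIBRARY_PATH directory, in search order.

    Hypothetical helper for illustration; it ignores ldconfig caches and rpaths.
    """
    if ld_library_path is None:
        ld_library_path = os.environ.get("LD_LIBRARY_PATH", "")
    hits = []
    for directory in ld_library_path.split(os.pathsep):
        if not directory:
            continue  # skip empty entries such as a trailing ':'
        hits.extend(sorted(glob.glob(os.path.join(directory, "libcudnn.so*"))))
    return hits


if __name__ == "__main__":
    for path in find_cudnn_libs():
        print(path)
```

If the first directory listed holds an older cuDNN than the one you meant to use, that is a likely candidate for the mismatch.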

Related issues include #989


mattjj · 2020-03-24T02:57:39Z

Is this a fresh install of CUDA / cuDNN?

jacobjinkelly · 2020-03-24T04:52:05Z

Ah yes that appears to have been the issue. I changed to cudnn-10.2-v7.6.5 and this solved it.

mattjj · 2020-03-24T05:13:03Z

Woo! We did it without having to ask Peter for help!

py4 · 2020-03-27T04:05:30Z

I have CUDA 10 and cuDNN 7.4.1 and have the same issue (not a fresh CUDA/cuDNN installation). Do I necessarily need to install another CUDA/cuDNN version? @mattjj

jacobjinkelly · 2020-03-27T15:46:43Z

@py4 So the original reason for the error was actually me not correctly understanding the compatibility between versions of CUDA and cuDNN (as described in this table). I got it to work with CUDA 10.1.243, driver 430.50, and cudnn-10.1-v7.6.3.30 (I think the 10.1 means it's built to work with CUDA 10.1, and the version is 7.6.3.30). To my understanding, according to this table, you'd need a driver version of at least 410.48 for CUDA 10.0. Note that you may still get some warning messages. Even after getting it to work, I still got the following warnings:

2020-03-26 15:12:11.901833: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer.so.6'; dlerror: libnvinfer.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pkgs/cuda-10.1/lib64:/pkgs/cudnn-10.1-v7.6.3.30/lib64:
2020-03-26 15:12:11.902503: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libnvinfer_plugin.so.6'; dlerror: libnvinfer_plugin.so.6: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /pkgs/cuda-10.1/lib64:/pkgs/cudnn-10.1-v7.6.3.30/lib64:
2020-03-26 15:12:11.902514: W tensorflow/compiler/tf2tensorrt/utils/py_utils.cc:30] Cannot dlopen some TensorRT libraries. If you would like to use Nvidia GPU with TensorRT, please make sure the missing libraries mentioned above are installed properly.
2020-03-26 15:12:31.352429: W external/org_tensorflow/tensorflow/compiler/xla/service/hlo_pass_fix.h:49] Unexpectedly high number of iterations in HLO passes, exiting fixed point loop.
2020-03-26 15:12:45.680246: W external/org_tensorflow/tensorflow/compiler/xla/service/hlo_pass_fix.h:49] Unexpectedly high number of iterations in HLO passes, exiting fixed point loop.
The first three seem to be about the optional TensorRT libraries (libnvinfer) not being found on LD_LIBRARY_PATH.

P.S.

You can find the driver version via nvidia-smi, and you can find the specific version of CUDA (i.e. 10.1.243 rather than just 10.1) by checking /usr/local/cuda/version.txt (or wherever CUDA is installed on your machine).
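The two lookups above can be scripted. This is a rough sketch under a few assumptions: the helper names are made up for illustration, version.txt is assumed to contain a line of the form "CUDA Version 10.1.243", and the 410.48 minimum for CUDA 10.0 is simply the figure quoted earlier in this comment, not something the code verifies.

```python
import re
import subprocess


def parse_cuda_version_txt(text):
    """Extract e.g. '10.1.243' from the contents of CUDA's version.txt."""
    match = re.search(r"CUDA Version\s+(\d+(?:\.\d+)+)", text)
    return match.group(1) if match else None


def driver_version():
    """Ask nvidia-smi for the driver version; return None if it is unavailable."""
    try:
        out = subprocess.run(
            ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
            capture_output=True, text=True, check=True,
        ).stdout
    except (OSError, subprocess.CalledProcessError):
        return None
    lines = out.strip().splitlines()
    return lines[0].strip() if lines else None


def version_tuple(version):
    """Turn '430.50' into (430, 50) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def driver_at_least(driver, minimum):
    """True if the driver version is at least the given minimum."""
    return version_tuple(driver) >= version_tuple(minimum)


if __name__ == "__main__":
    # Path is an assumption; adjust to wherever CUDA is installed.
    try:
        with open("/usr/local/cuda/version.txt") as f:
            print("CUDA:", parse_cuda_version_txt(f.read()))
    except OSError:
        print("CUDA: version.txt not found")
    drv = driver_version()
    print("driver:", drv)
    if drv is not None:
        print("meets 410.48 minimum:", driver_at_least(drv, "410.48"))
```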

hawkinsp · 2020-03-27T17:17:24Z

@py4 yes, that means you need to install a newer cuDNN. Can you give that a go? Hope that helps!
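One way to check whether the cuDNN that gets loaded is new enough is to open it with ctypes and look for the symbol named in the error. This is a sketch, not an official diagnostic: `probe_cudnn` and `decode_cudnn_version` are hypothetical names, the decoding assumes the cuDNN 7 scheme of major*1000 + minor*100 + patch, and a plain symbol lookup does not check the symbol version that the error message mentions.

```python
import ctypes

SYMBOL = "cudnnSetCTCLossDescriptorEx"  # symbol named in the ImportError


def decode_cudnn_version(value):
    """Decode a cuDNN 7-style version integer, e.g. 7605 -> (7, 6, 5)."""
    major, rest = divmod(value, 1000)
    minor, patch = divmod(rest, 100)
    return major, minor, patch


def probe_cudnn(soname="libcudnn.so.7", symbol=SYMBOL):
    """Load libcudnn via the normal loader search path and report what it offers."""
    try:
        lib = ctypes.CDLL(soname)
    except OSError as exc:
        return {"loaded": False, "error": str(exc)}
    # cudnnGetVersion() returns a size_t in the cuDNN C API.
    lib.cudnnGetVersion.restype = ctypes.c_size_t
    return {
        "loaded": True,
        "version": decode_cudnn_version(lib.cudnnGetVersion()),
        # ctypes raises AttributeError for a missing symbol, so hasattr works here.
        "has_symbol": hasattr(lib, symbol),
    }


if __name__ == "__main__":
    print(probe_cudnn())
```

If `has_symbol` comes back False, the library being picked up is presumably older than what this jaxlib build was linked against.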

refraction-ray · 2020-05-15T03:08:09Z

Just for reference, I had the same issue, and the cause was also a version mismatch (or an out-of-date version) among the drivers, CUDA, cuDNN and jaxlib. It is hard to determine which combinations will fail, since this list doesn't cover the whole story.
Anyway, a combination that works for me:
GPU driver 418/430 + jaxlib 0.1.47 + CUDA 10.1.243 + cuDNN 7.6.5.32
A combination that fails for me:
GPU driver 418/430 + jaxlib 0.1.47 + CUDA 10.0 + cuDNN 7.5.1, though this combination works for GPU TensorFlow.

mattjj added the bug and build labels and removed the bug label on Mar 24, 2020

jacobjinkelly closed this as completed Mar 24, 2020
