cannot find libdevice #989

murphyk · 2019-07-07T23:52:35Z

Hi

Jax cannot find libdevice.
I'm running Python 3.7 with CUDA 10.0 on my personal laptop with a GeForce RTX 2080.
I installed jax using pip.

I made a little test script shown below

import os
os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"
os.environ["CUDA_HOME"]="/usr"


import jax
import jax.numpy as np
print("jax version {}".format(jax.__version__))
from jax.lib import xla_bridge
print("jax backend {}".format(xla_bridge.get_backend().platform))


from jax import random
key = random.PRNGKey(0)
x = random.normal(key, (5,5))
print(x)

The output is shown below.

jax version 0.1.39
jax backend gpu
2019-07-07 16:44:03.905071: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105] Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
Traceback (most recent call last):

  File "<ipython-input-15-e39e42274024>", line 1, in <module>
    runfile('/home/murphyk/github/pyprobml/scripts/jax_debug.py', wdir='/home/murphyk/github/pyprobml/scripts')

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 827, in runfile
    execfile(filename, namespace)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/spyder_kernels/customize/spydercustomize.py", line 110, in execfile
    exec(compile(f.read(), filename, 'exec'), namespace)

  File "/home/murphyk/github/pyprobml/scripts/jax_debug.py", line 18, in <module>
    x = random.normal(key, (5,5))

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/random.py", line 389, in normal
    return _normal(key, shape, dtype)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/api.py", line 123, in f_jitted
    out = xla.xla_call(flat_fun, *args_flat, device_values=device_values)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/core.py", line 663, in call_bind
    ans = primitive.impl(f, *args, **params)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 606, in xla_call_impl
    compiled_fun = xla_callable(fun, device_values, *map(abstractify, args))

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/linear_util.py", line 208, in memoized_fun
    ans = call(f, *args)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 621, in xla_callable
    compiled, result_shape = compile_jaxpr(jaxpr, consts, *abstract_args)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 207, in compile_jaxpr
    backend=xb.get_backend()), result_shape

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py", line 535, in Compile
    return backend.compile(self.computation, compile_options)

  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py", line 118, in compile
    compile_options.device_assignment)

RuntimeError: Not found: ./libdevice.compute_20.10.bc not found

murphyk · 2019-07-08T00:03:11Z

I think I want it to find this file

/home/murphyk/miniconda3/lib/libdevice.10.bc

I tried

 export XLA_FLAGS="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"

to no avail.

murphyk · 2019-07-08T00:05:16Z

or maybe this file?

/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc

hawkinsp · 2019-07-08T13:33:00Z

How did you install CUDA? What operating system and what release is this?

In fact, did you install CUDA at all?

murphyk · 2019-07-08T15:47:06Z

This is a Tensorbook running Ubuntu 18.04.
It is prebundled with CUDA 10.0 installed by the vendor:
https://lambdalabs.com/lambda-stack-deep-learning-software

I forgot to mention that both torch and tf2 can find the GPU.
Below are some diagnostics from TF. Is there a way for JAX to "piggy back" off the TF GPU support?

import tensorflow as tf
from tensorflow import keras
print("tf version {}".format(tf.__version__))
if tf.test.is_gpu_available():
    print(tf.test.gpu_device_name())
else:
    print("TF cannot find GPU")

tf version 2.0.0-beta1
/device:GPU:0
2019-07-08 08:37:58.187392: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.188242: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.095
pciBusID: 0000:01:00.0
2019-07-08 08:37:58.188295: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-08 08:37:58.188306: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-08 08:37:58.188314: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-08 08:37:58.188322: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-08 08:37:58.188365: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-08 08:37:58.188373: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-08 08:37:58.188383: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-08 08:37:58.188425: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.189732: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.190444: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-08 08:37:58.190461: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-08 08:37:58.190484: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-08 08:37:58.190488: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-08 08:37:58.190682: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.191461: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.192553: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 6694 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)
2019-07-08 08:37:58.193098: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.194157: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with properties:
name: GeForce RTX 2080 with Max-Q Design major: 7 minor: 5 memoryClockRate(GHz): 1.095
pciBusID: 0000:01:00.0
2019-07-08 08:37:58.194200: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudart.so.10.0
2019-07-08 08:37:58.194211: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcublas.so.10.0
2019-07-08 08:37:58.194219: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcufft.so.10.0
2019-07-08 08:37:58.194227: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcurand.so.10.0
2019-07-08 08:37:58.194241: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusolver.so.10.0
2019-07-08 08:37:58.194250: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcusparse.so.10.0
2019-07-08 08:37:58.194276: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcudnn.so.7
2019-07-08 08:37:58.194347: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.195272: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.196356: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu devices: 0
2019-07-08 08:37:58.196380: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-08 08:37:58.196388: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-08 08:37:58.196393: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-08 08:37:58.196735: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.197553: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.198262: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow device (/device:GPU:0 with 6694 MB memory) -> physical GPU (device: 0, name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute capability: 7.5)


murphyk · 2019-07-08T15:48:06Z

What's also weird is that JAX says it can find the GPU but when I try to run some actual code, I get the libdevice error:

import os
   ...: os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"
   ...: import jax
   ...: import jax.numpy as np
   ...: from jax import grad, jacfwd, jacrev, jit, vmap
   ...: from jax.experimental import optimizers
   ...: print("jax version {}".format(jax.__version__))
   ...: from jax.lib import xla_bridge
   ...: print("jax backend {}".format(xla_bridge.get_backend().platform))
jax version 0.1.39
jax backend gpu

In [4]: from jax import random
   ...: key = random.PRNGKey(0)
   ...: x = random.normal(key, (5,5))
2019-07-08 08:40:15.551685: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:129] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2019-07-08 08:40:15.551706: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:130] Searched for CUDA in the following directories:
2019-07-08 08:40:15.551710: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133] /home/murphyk/miniconda3/lib
2019-07-08 08:40:15.551713: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133] /usr/local/cuda
2019-07-08 08:40:15.551715: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133] .
2019-07-08 08:40:15.551718: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:135] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2019-07-08 08:40:16.203818: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105] Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
Traceback (most recent call last):
  File "<ipython-input-4-d8a87c178f8a>", line 3, in <module>
    x = random.normal(key, (5,5))
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/random.py", line 389, in normal
    return _normal(key, shape, dtype)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/api.py", line 123, in f_jitted
    out = xla.xla_call(flat_fun, *args_flat, device_values=device_values)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/core.py", line 663, in call_bind
    ans = primitive.impl(f, *args, **params)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 606, in xla_call_impl
    compiled_fun = xla_callable(fun, device_values, *map(abstractify, args))
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/linear_util.py", line 208, in memoized_fun
    ans = call(f, *args)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 621, in xla_callable
    compiled, result_shape = compile_jaxpr(jaxpr, consts, *abstract_args)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py", line 207, in compile_jaxpr
    backend=xb.get_backend()), result_shape
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py", line 535, in Compile
    return backend.compile(self.computation, compile_options)
  File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py", line 118, in compile
    compile_options.device_assignment)
RuntimeError: Not found: ./libdevice.compute_20.10.bc not found

In [5]:
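
The nvptx_compiler warning says XLA looks for a ${CUDA_DIR}/nvvm/libdevice directory under each search path. A minimal sketch of a check for that layout, using only the standard library, might look like the following. The helper name `find_libdevice_dirs` is made up for illustration, the candidate directories are the ones mentioned in this thread, and the assumption that XLA wants exactly this layout rests only on the warning text.

```python
import glob
import os


def find_libdevice_dirs(candidates):
    """Return {cuda_dir: [bitcode files]} for each candidate that
    contains an nvvm/libdevice subdirectory (hypothetical helper)."""
    found = {}
    for cuda_dir in candidates:
        libdevice_dir = os.path.join(cuda_dir, "nvvm", "libdevice")
        if os.path.isdir(libdevice_dir):
            pattern = os.path.join(libdevice_dir, "*.bc")
            found[cuda_dir] = sorted(glob.glob(pattern))
    return found


if __name__ == "__main__":
    # Directories taken from this thread; adjust for your machine.
    candidates = [
        "/home/murphyk/miniconda3/lib",
        "/usr/local/cuda",
        "/usr/lib/nvidia-cuda-toolkit",
        "/usr",
    ]
    hits = find_libdevice_dirs(candidates)
    if not hits:
        print("No candidate contains nvvm/libdevice")
    for cuda_dir, files in hits.items():
        print(cuda_dir, files)
```

A directory that holds libdevice.10.bc directly, without the nvvm/libdevice subdirectories, would not show up in the result.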


murphyk · 2019-07-08T15:52:34Z

One more thing. When I turn my GPU off (*), JAX falls back to CPU and then it works like a charm. It would be great if the user could choose which mode JAX uses from within Python, without having to turn the GPU off.

(*) I don't literally turn the GPU off - I don't even know how to do that! I simply open a browser before I open python. The browser seems to "lock up" the GPU and then all of torch, tf or jax say the GPU is unavailable. This is actually pretty annoying, and the only fix I have found so far seems to be to type 'shutdown -h now' and then start my python IDE before my browser. (Do you know a better way?) But this is not a JAX issue, of course :)
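
On choosing the backend from within Python: JAX reads a `JAX_PLATFORM_NAME` environment variable (newer releases also document `JAX_PLATFORMS`), so one option is to set it before JAX is imported. A minimal sketch, assuming the installed version honors that variable; the helper name `request_cpu_backend` is made up for illustration.

```python
import os


def request_cpu_backend(env=None):
    """Set JAX_PLATFORM_NAME=cpu in the given mapping (os.environ by default).

    Assumption: the installed JAX reads this variable when it first
    initializes its backend, so this must run before `import jax`.
    """
    env = os.environ if env is None else env
    env["JAX_PLATFORM_NAME"] = "cpu"
    return env


request_cpu_backend()

# Import JAX only after the variable is set, for example:
# import jax
# from jax.lib import xla_bridge
# print(xla_bridge.get_backend().platform)
```

Setting the variable after JAX has already picked a backend is unlikely to have any effect in the same process.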


murphyk · 2019-07-08T17:41:46Z

To summarize, this is the error JAX produces:

2019-07-08 10:24:47.184009: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105] Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
...
RuntimeError: Not found: ./libdevice.compute_20.10.bc not found

I do have libdevice.10.bc installed (in several places):

locate libdevice
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc
/home/murphyk/miniconda3/lib/libdevice.10.bc

I've tried each of the following settings:

#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr/lib"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr/lib/nvidia-cuda-toolkit"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr"

None of them work.

I decided to look at the file nvptx_backend_lib.cc (possibly installed by TF rather than JAX; I'm not sure):

locate nvptx_backend_lib.cc
/home/murphyk/.cache/bazel/_bazel_murphyk/87cf5205cc24305d7da6a4fb49af7044/external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc

I searched it for the string "Defaulting to libdevice..." but could not find it! Instead I found

"Defaulting to telling LLVM that we're compiling for sm_" << sm_version;

The source code actually lists {7,5} as a valid combination, so I don't know why JAX says "Unknown compute capability (7, 5)": it's in the lookup table. Maybe the JAX wheel is compiled with an older version of this file?


murphyk · 2019-07-16T01:51:29Z

The folks at Lambda (maker of my TensorBook laptop) looked at the source code and suggested this fix:

```
mkdir -p ~/xla/nvvm/libdevice
cp /usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc ~/xla/nvvm/libdevice
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/home/murphyk/xla"
```

This actually works :) Maybe worth updating the set of locations that JAX searches for libdevice.10.bc?
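For anyone who prefers to do the same thing from Python, here is a minimal sketch of the directory layout that fix creates. The helper name `make_cuda_data_dir` is hypothetical (it is not part of JAX or XLA); it only copies a libdevice file into `<root>/nvvm/libdevice` and returns the flag string used above.

```python
import shutil
from pathlib import Path


def make_cuda_data_dir(libdevice_bc, root):
    """Copy a libdevice .bc file into <root>/nvvm/libdevice and return the XLA flag.

    Hypothetical helper for illustration; it mirrors the mkdir/cp/export steps.
    """
    root = Path(root).expanduser()
    target_dir = root / "nvvm" / "libdevice"
    target_dir.mkdir(parents=True, exist_ok=True)  # equivalent of `mkdir -p`
    shutil.copy(libdevice_bc, target_dir)          # equivalent of `cp`
    return f"--xla_gpu_cuda_data_dir={root}"


# Example with the paths from this comment (adjust for your machine):
# import os
# os.environ["XLA_FLAGS"] = make_cuda_data_dir(
#     "/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc", "~/xla")
# The variable has to be set before jax is imported.
```

The source location of libdevice.10.bc differs between installs, so it is passed in rather than guessed.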


iamlemec · 2019-10-01T04:56:16Z

I'm getting this same error with python3.7 and CUDA 10.0. It seems like it doesn't actually check CUDA_DIR? Symlinking my CUDA_DIR to /usr/local/cuda solved the problem.

lhk · 2019-12-05T18:57:10Z

Same problem here. My CUDA installation is not in /usr/local/cuda but in /usr/lib/cuda. After symlinking, it works.

martinosorb · 2020-04-15T16:06:31Z

murphyk's fix worked for me, but it's rather ugly. I hope a better solution can be found soon :) thanks

KeAWang · 2020-04-24T00:54:25Z

I tried murphyk's fix but I still get RuntimeError: Internal: libdevice not found at ./libdevice.10.bc

skye · 2020-04-24T19:15:54Z

Here's my understanding of this issue:

jax depends on XLA, which is built as part of TF and bundled up into the jaxlib package. By default, TF is compiled to look for cuda and cudnn in /usr/local/cuda: https://github.com/tensorflow/tensorflow/blob/master/third_party/gpus/cuda_configure.bzl#L14

So symlinking your cuda install to /usr/local/cuda should work. Make sure libdevice actually exists... I always have a hard time figuring out which Nvidia downloads contain libraries, but I think libdevice is shipped as part of https://developer.nvidia.com/cuda-toolkit.

Alternatively, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda should work. I recommend exporting this outside the Python interpreter to be sure it's being picked up when jaxlib is loaded (there's probably a more targeted way to do it, but this will limit mistakes).
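If exporting the variable in the shell is inconvenient, a rough sketch of doing it at the very top of a script might look like the following. It assumes the `nvvm/libdevice` layout named in the XLA warning; the helper `find_cuda_data_dir` and the candidate list are illustrative assumptions, not part of JAX.

```python
import os
from pathlib import Path


def find_cuda_data_dir(candidates):
    """Return the first candidate containing nvvm/libdevice/libdevice*.bc, else None."""
    for candidate in candidates:
        libdevice_dir = Path(candidate) / "nvvm" / "libdevice"
        if any(libdevice_dir.glob("libdevice*.bc")):
            return str(candidate)
    return None


# Candidate roots are assumptions; both paths are ones mentioned in this thread.
cuda_dir = find_cuda_data_dir(["/usr/local/cuda", "/usr/lib/cuda"])
if cuda_dir is not None:
    # Must run before jax/jaxlib is imported; does not override an existing value.
    os.environ.setdefault("XLA_FLAGS", f"--xla_gpu_cuda_data_dir={cuda_dir}")
```

Exporting in the shell remains the safer route, since it cannot be undone by an earlier import of jaxlib elsewhere in the process.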

Is anyone still having problems after trying these methods?

We should also potentially make a jax-specific environment variable to set a custom cuda install path, or at least document the XLA_FLAGS one more clearly... I can do that once we verify this actually works.

KeAWang · 2020-04-24T19:35:28Z

Maybe I have a different issue then... I tried both symlinking and setting XLA_FLAGS, and my libdevice.10.bc is located at /usr/local/cuda/nvvm/libdevice/libdevice.10.bc, but I'm still getting the same RuntimeError: Internal: libdevice not found at ./libdevice.10.bc

Full stack trace:

>>> import jax.numpy as np
>>> np.sin(3)
2020-04-24 15:42:25.181013: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:70] Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result in compilation or runtime failures, if the program we try to run uses routines from libdevice.
2020-04-24 15:42:25.181030: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:71] Searched for CUDA in the following directories:
2020-04-24 15:42:25.181035: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   ./cuda_sdk_lib
2020-04-24 15:42:25.181038: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   /usr/local/cuda-10.2
2020-04-24 15:42:25.181041: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:74]   .
2020-04-24 15:42:25.181043: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions.  For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-04-24 15:42:25.181949: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/gpu_backend_lib.cc:311] libdevice is required by this HLO module but was not found at ./libdevice.10.bc
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jax/numpy/lax_numpy.py", line 413, in fn
    return lax_fn(x)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jax/lax/lax.py", line 161, in sin
    return sin_p.bind(x)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jax/core.py", line 199, in bind
    return self.impl(*args, **kwargs)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jax/interpreters/xla.py", line 166, in apply_primitive
    compiled_fun = xla_primitive_callable(prim, *map(arg_spec, args), **params)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jax/interpreters/xla.py", line 197, in xla_primitive_callable
    compiled = built_c.Compile(compile_options=options, backend=backend)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jaxlib/xla_client.py", line 576, in Compile
    return backend.compile(self.computation, compile_options)
  File "/home/alex/miniconda3/envs/dev/lib/python3.8/site-packages/jaxlib/xla_client.py", line 152, in compile
    return _xla.LocalExecutable.Compile(c_computation,
RuntimeError: Internal: libdevice not found at ./libdevice.10.bc

KeAWang · 2020-04-24T19:45:48Z

Ok upon looking at the stack trace again, it looks like XLA is searching in /usr/local/cuda-10.2 instead of /usr/local/cuda. Making another symlink fixed this issue for me.

Anyone know why it's searching for cuda-10.2? I installed using the automatic method in the README:

pip install --upgrade https://storage.googleapis.com/jax-releases/`nvidia-smi | sed -En "s/.* CUDA Version: ([0-9]*)\.([0-9]*).*/cuda\1\2/p"`/jaxlib-0.1.45-`python3 -V | sed -En "s/Python ([0-9]*)\.([0-9]*).*/cp\1\2/p"`-none-linux_x86_64.whl jax
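
For anyone debugging the same thing, the directory search described in the warning can be approximated in a few lines of Python. This is only a sketch, not XLA's actual lookup logic: it assumes libdevice sits at nvvm/libdevice/libdevice.10.bc under each candidate directory (as the warning text implies), and the helper name find_libdevice is made up for illustration.

```python
from pathlib import Path

# Relative location the XLA warning refers to: ${CUDA_DIR}/nvvm/libdevice
LIBDEVICE_RELPATH = Path("nvvm") / "libdevice" / "libdevice.10.bc"


def find_libdevice(candidate_dirs):
    """Return (cuda_dir, libdevice_path) for the first candidate directory
    that contains libdevice in the expected layout, or None if none does."""
    for cuda_dir in candidate_dirs:
        path = Path(cuda_dir) / LIBDEVICE_RELPATH
        if path.is_file():
            return Path(cuda_dir), path
    return None


if __name__ == "__main__":
    # Directories copied from the warning above, plus /usr/local/cuda,
    # where libdevice was reported to live in this comment.
    candidates = ["./cuda_sdk_lib", "/usr/local/cuda-10.2", ".", "/usr/local/cuda"]
    print(find_libdevice(candidates))
```

If the only hit is a directory that XLA did not list as searched, that points to either a symlink or XLA_FLAGS as the fix.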

iamlemec · 2020-04-24T20:10:55Z

@skye Both those methods are working for me. As @KeAWang documented, I'm also seeing that in the absence of XLA_FLAGS info it will look in /usr/local/cuda-XXX, depending on the CUDA version. Would be great if the XLA folks could either actually check CUDA_DIR or simply not have an error message claiming to do so.

skye · 2020-04-24T21:07:10Z

Thanks @KeAWang and @iamlemec. Agreed this could be much clearer. I've filed an internal bug against XLA with your suggestions and some of my own :) These are the suggestions:

The actionable information from the WARNING log could be included directly in the error message (e.g. which path(s) to symlink, XLA_FLAGS=...).
The warning message mentions ${CUDA_DIR}/nvvm/libdevice, but it appears $CUDA_DIR isn't actually used. I'm not sure if this is a standard-ish env var to use, so we could either actually use it, or not mention it at all.
We could provide installation instructions for where CUDA Toolkit should be installed. I can do this for the JAX instructions, but maybe https://www.tensorflow.org/install/gpu should be updated as well.
Should we add more default paths to check? e.g. /usr/lib/nvidia-cuda-toolkit?

Please comment if I should correct anything or you have other suggestions!

martinosorb · 2020-04-26T12:58:27Z

I personally have cuda in /opt/cuda, not sure why. Also, $CUDA_DIR does not seem to be defined by default, I don't know if it's defined while the script runs but that seems unlikely (I have very little understanding of these issues though).

KeAWang · 2020-04-26T15:28:33Z

I also have it in /opt/cuda because my installation is through the Arch User Repository package.

stevensslee · 2020-05-01T20:37:59Z

Hi All,
I've tried murphyk's fix and symlinking the cuda directory to /usr/local/cuda, but have received this possible error:

2020-04-29 11:18:32.823934: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] Searched for CUDA in the following directories:
2020-04-29 11:18:32.823950: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:80] /home/steven/xla
2020-04-29 11:18:32.823961: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:80] /usr/local/cuda
2020-04-29 11:18:32.823973: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:80] .
2020-04-29 11:18:32.823985: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:82] You can choose the search directory by setting xla_gpu_cuda_data_dir in HloModule's DebugOptions. For most apps, setting the environment variable XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2020-04-29 11:18:32.872919: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:76] Can't find ptxas binary in ${CUDA_DIR}/bin. Will back to the GPU driver for PTX -> sass compilation. This is OK so long as you don't see a warning below about an out-of-date driver version. Custom ptxas location can be specified using $PATH.

I was wondering if anyone has any more information about this output? Running jax code seems to "work" in that it does not throw an error, but I'm not sure if the GPU is actually being used.
For further information, I have used conda to create an environment for both Cuda and Jax.

Thanks for reading!

skye · 2020-05-01T21:58:26Z

Does /usr/local/cuda/bin/ptxas exist? You may need to install the CUDA toolkit if not.
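
A quick way to answer that from Python is sketched below. It assumes the CUDA directory is /usr/local/cuda (the symlink mentioned above) and also looks on $PATH, since the warning says a custom ptxas location can be specified there. The helper name find_ptxas is hypothetical.

```python
import os
import shutil


def find_ptxas(cuda_dir="/usr/local/cuda"):
    """Return a path to the ptxas binary, or None if it cannot be found.

    Checks <cuda_dir>/bin/ptxas first, then falls back to searching $PATH.
    """
    candidate = os.path.join(cuda_dir, "bin", "ptxas")
    if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
        return candidate
    return shutil.which("ptxas")


if __name__ == "__main__":
    location = find_ptxas()
    print(location if location else "ptxas not found; the CUDA toolkit may be missing")
```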

tomweingarten · 2020-05-02T19:56:48Z

For people running into this problem after an install of Ubuntu 20.04 with Ubuntu's cuda toolkit package, KeAWang's suggestion works but you need cuda-10.1 instead:

sudo ln -s /usr/lib/cuda /usr/local/cuda-10.1

tigerneil · 2020-05-11T16:51:58Z

(jax3) ubuntu@ip-172-31-13-179:~$ python
Python 3.7.0 (default, Oct  9 2018, 10:31:47)
[GCC 7.3.0] :: Anaconda, Inc. on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import jax
>>> from jax.lib import xla_bridge
>>> print(xla_bridge.get_backend().platform)
gpu
>>>

Thanks @murphyk It worked for me.

I used the run editor to add the env variable, which makes it work remotely as well.

tigerneil · 2020-05-11T17:12:37Z

@stevensslee, regarding your question above about whether the GPU is actually being used: you can try my code above to check.

refraction-ray · 2020-05-15T03:19:38Z

@skye, setting XLA_FLAGS works for me. I believe this is a very important piece of information that should be in the installation section of the README as soon as possible, since in most setups the CUDA installation is not in the default path XLA searches, and the error is confusing unless users happen to find this issue :)
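
For reference, a sketch of setting the flag from inside a script without clobbering other XLA flags that may already be set. The helper name add_cuda_data_dir and the /usr/local/cuda path are assumptions for illustration; substitute your own CUDA directory. Splitting on whitespace means this will not handle paths that contain spaces.

```python
import os


def add_cuda_data_dir(cuda_dir, existing_flags=""):
    """Return an XLA_FLAGS string with --xla_gpu_cuda_data_dir set to cuda_dir.

    Any earlier --xla_gpu_cuda_data_dir entry is dropped; other flags are kept.
    """
    prefix = "--xla_gpu_cuda_data_dir="
    kept = [flag for flag in existing_flags.split() if not flag.startswith(prefix)]
    kept.append(prefix + str(cuda_dir))
    return " ".join(kept)


# Set the variable before importing jax, to be safe.
# "/usr/local/cuda" is an assumed location, not a universal default.
os.environ["XLA_FLAGS"] = add_cuda_data_dir(
    "/usr/local/cuda", os.environ.get("XLA_FLAGS", "")
)
```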

skye · 2020-05-22T21:59:23Z

Hi, sorry for the delay on this. I've created a PR with updated installation instructions: #3190. Please comment if you have any suggestions. We can do even more to address this situation (@hawkinsp suggested bundling libdevice with jaxlib), but hopefully this will help for now.

v-i-s-h · 2020-06-05T01:07:21Z

I have a conda environment for jax with cudatoolkit and cuDNN installed from the anaconda channel. I am on Manjaro Linux and use optirun to enable the GPU. I get the same error when executing (the first example in README)

XLA_FLAGS=--xla_gpu_cuda_data_dir=<conda-env-path>/lib/ optirun python gp.py

even after setting XLA_FLAGS.

My CUDA version is

cudatoolkit               10.1.243             h6bb024c_0    anaconda
cudnn                     7.6.5                cuda10.1_0    anaconda

Nvidia driver version is 418.113.
PS: optirun with tensorflow and pytorch runs fine.

skye · 2020-06-05T01:10:30Z

Do you get the same error without optirun? Also, can you try creating a symlink as described above and in the README? This will help narrow down where the problem is.

coded5282 · 2020-07-01T05:30:02Z

@v-i-s-h Were you ever able to resolve this issue by setting the anaconda path? That's what I've been trying to do but it hasn't been working.

v-i-s-h · 2020-07-04T15:54:47Z

@coded5282 Nope. I am still getting the same error.
I didn't try the symlink as it goes directly to the system settings and I am a bit afraid it may break some of my other configurations.

grantmcdermott · 2020-07-06T19:28:53Z

Just to add: Same error as everyone else. Tried setting the XLA_FLAGS environment variable; didn't work. However, adding the symlink did (for me: $ sudo ln -s /opt/cuda /usr/local/cuda-10.2)

Like @KeAWang and several others I'm on Arch and installed CUDA through the AUR.

kmario23 · 2021-03-09T20:14:05Z

First check where your CUDA installation resides.

$ whereis -b cuda
cuda: /usr/lib/cuda

For instance, if the above command spits out /usr/lib/cuda, then that's where your CUDA installation is.
So, we now need to get the version number.

$ cat /usr/lib/cuda/version.txt
CUDA Version 11.0.228

By default, however, JAX looks for the CUDA installation in /usr/local/cuda-<version>, so we need to create a symlink with that specific version in its name. The symlink redirects JAX to the actual CUDA installation location when it searches in /usr/local/cuda-<version>.

$ sudo ln -s /usr/lib/cuda /usr/local/cuda-11.0

Following the steps above, in that order, should solve the issue; at least it did for me.

Note: Please keep in mind that the JAX binary installed on your machine must have been built for the same CUDA version (e.g., 11.0) as the installation that /usr/local/cuda-<version> points to.
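
The version-to-symlink-name step can be sketched in Python as follows. It assumes version.txt has the "CUDA Version X.Y.Z" format shown above; the function name versioned_cuda_path is made up for illustration, and the script only prints the ln command rather than running it.

```python
import re


def versioned_cuda_path(version_text, prefix="/usr/local"):
    """Turn the contents of version.txt (e.g. 'CUDA Version 11.0.228')
    into the versioned directory name, e.g. '/usr/local/cuda-11.0'."""
    match = re.search(r"CUDA Version (\d+)\.(\d+)", version_text)
    if match is None:
        raise ValueError(f"unrecognized version string: {version_text!r}")
    major, minor = match.groups()
    return f"{prefix}/cuda-{major}.{minor}"


if __name__ == "__main__":
    link_name = versioned_cuda_path("CUDA Version 11.0.228")
    # Print the command instead of executing it, since it needs sudo.
    print(f"sudo ln -s /usr/lib/cuda {link_name}")
```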

kayhan-batmanghelich · 2021-04-16T02:00:06Z

Hi Everyone,

I still have an issue with this. I get this error:

[...]
RuntimeError: Internal: libdevice not found at ./libdevice.10.bc

cuda is installed and I made a soft link. Here is some more info:

(jax) kayhan@lambda-dual:~$ whereis -b cuda
cuda: /usr/include/cuda.h /usr/include/cuda

(jax) kayhan@lambda-dual:~$ ls -lt /usr/local/cuda-11.1
lrwxrwxrwx 1 root root 39 Mar 11 15:16 /usr/local/cuda-11.1 -> /usr/lib/nvidia-cuda-toolkit/libdevice/

(jax) kayhan@lambda-dual:~$ ls -lt /usr/lib/nvidia-cuda-toolkit/libdevice/
total 464
lrwxrwxrwx 1 root root     13 Apr 15 15:21 cuda -> /usr/lib/cuda
-rw-r--r-- 1 root root 471124 Oct 16 13:42 libdevice.10.bc

I tried setting the XLA flag, but I still have the same issue:

(jax) kayhan@lambda-dual:~$ export XLA_FLAGS="--xla_gpu_cuda_data_dir=/usr/lib/nvidia-cuda-toolkit/libdevice/"
(jax) kayhan@lambda-dual:~$ ipython
[...]
RuntimeError: Internal: libdevice not found at ./libdevice.10.bc

kayhan-batmanghelich · 2021-04-19T00:52:15Z

This problem is addressed here:

#6479 (comment)

The instructions need to clarify that one has to create an nvvm folder inside the CUDA directory for this to work.
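
As a rough illustration of that layout, the sketch below builds a private directory containing nvvm/libdevice/libdevice.10.bc as a symlink to an existing libdevice file, so no system paths need to change. The nvvm/libdevice subpath is taken from the XLA warning quoted earlier in the thread; the function name make_cuda_data_dir and the ~/cuda_data location are hypothetical.

```python
from pathlib import Path


def make_cuda_data_dir(libdevice_file, data_dir):
    """Create <data_dir>/nvvm/libdevice/libdevice.10.bc as a symlink to
    libdevice_file and return data_dir, for use with xla_gpu_cuda_data_dir."""
    libdevice_file = Path(libdevice_file).resolve()
    if not libdevice_file.is_file():
        raise FileNotFoundError(libdevice_file)
    target_dir = Path(data_dir) / "nvvm" / "libdevice"
    target_dir.mkdir(parents=True, exist_ok=True)
    link = target_dir / "libdevice.10.bc"
    # Replace any stale or broken link left over from a previous attempt.
    if link.is_symlink() or link.exists():
        link.unlink()
    link.symlink_to(libdevice_file)
    return Path(data_dir)


# Example usage (paths from the comment above; ~/cuda_data is an assumption):
# data_dir = make_cuda_data_dir(
#     "/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc",
#     Path.home() / "cuda_data",
# )
# then: export XLA_FLAGS="--xla_gpu_cuda_data_dir=<data_dir>"
```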

hawkinsp · 2021-05-12T13:28:40Z

Good news! As of jaxlib 0.1.66, which was just released yesterday, we now bundle libdevice inside the jaxlib CUDA wheels. JAX should now always find it successfully. Hope that helps!

hawkinsp closed this as completed May 12, 2021
