cannot find libdevice #989
I think I want it to find this file
I tried
to no avail. |
or maybe this file? /usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc |
How did you install CUDA? What operating system and what release is this? In fact, did you install CUDA at all? |
This is a Tensorbook running Ubuntu 18.04.
It comes with CUDA 10.0 preinstalled by the vendor:
https://lambdalabs.com/lambda-stack-deep-learning-software
I forgot to mention that both torch and tf2 can find the GPU.
Below are some diagnostics from TF. Is there a way for JAX to "piggyback"
off the TF GPU support?
import tensorflow as tf
from tensorflow import keras
print("tf version {}".format(tf.__version__))
if tf.test.is_gpu_available():
    print(tf.test.gpu_device_name())
else:
    print("TF cannot find GPU")
tf version 2.0.0-beta1
/device:GPU:0
2019-07-08 08:37:58.187392: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.188242: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with
properties:
name: GeForce RTX 2080 with Max-Q Design major: 7 minor: 5
memoryClockRate(GHz): 1.095
pciBusID: 0000:01:00.0
2019-07-08 08:37:58.188295: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcudart.so.10.0
2019-07-08 08:37:58.188306: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcublas.so.10.0
2019-07-08 08:37:58.188314: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcufft.so.10.0
2019-07-08 08:37:58.188322: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcurand.so.10.0
2019-07-08 08:37:58.188365: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcusolver.so.10.0
2019-07-08 08:37:58.188373: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcusparse.so.10.0
2019-07-08 08:37:58.188383: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcudnn.so.7
2019-07-08 08:37:58.188425: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.189732: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.190444: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu
devices: 0
2019-07-08 08:37:58.190461: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect
StreamExecutor with strength 1 edge matrix:
2019-07-08 08:37:58.190484: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-08 08:37:58.190488: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-08 08:37:58.190682: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.191461: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.192553: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow
device (/device:GPU:0 with 6694 MB memory) -> physical GPU (device: 0,
name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute
capability: 7.5)
2019-07-08 08:37:58.193098: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.194157: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1640] Found device 0 with
properties:
name: GeForce RTX 2080 with Max-Q Design major: 7 minor: 5
memoryClockRate(GHz): 1.095
pciBusID: 0000:01:00.0
2019-07-08 08:37:58.194200: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcudart.so.10.0
2019-07-08 08:37:58.194211: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcublas.so.10.0
2019-07-08 08:37:58.194219: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcufft.so.10.0
2019-07-08 08:37:58.194227: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcurand.so.10.0
2019-07-08 08:37:58.194241: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcusolver.so.10.0
2019-07-08 08:37:58.194250: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcusparse.so.10.0
2019-07-08 08:37:58.194276: I
tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully
opened dynamic library libcudnn.so.7
2019-07-08 08:37:58.194347: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.195272: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.196356: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1763] Adding visible gpu
devices: 0
2019-07-08 08:37:58.196380: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1181] Device interconnect
StreamExecutor with strength 1 edge matrix:
2019-07-08 08:37:58.196388: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1187] 0
2019-07-08 08:37:58.196393: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1200] 0: N
2019-07-08 08:37:58.196735: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.197553: I
tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA
node read from SysFS had negative value (-1), but there must be at least
one NUMA node, so returning NUMA node zero
2019-07-08 08:37:58.198262: I
tensorflow/core/common_runtime/gpu/gpu_device.cc:1326] Created TensorFlow
device (/device:GPU:0 with 6694 MB memory) -> physical GPU (device: 0,
name: GeForce RTX 2080 with Max-Q Design, pci bus id: 0000:01:00.0, compute
capability: 7.5)
|
What's also weird is that JAX says it can find the GPU but when I try to
run some actual code, I get the libdevice error:
import os
...: os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"
...: import jax
...: import jax.numpy as np
...: from jax import grad, jacfwd, jacrev, jit, vmap
...: from jax.experimental import optimizers
...: print("jax version {}".format(jax.__version__))
...: from jax.lib import xla_bridge
...: print("jax backend {}".format(xla_bridge.get_backend().platform))
jax version 0.1.39
jax backend gpu
In [4]: from jax import random
...: key = random.PRNGKey(0)
...: x = random.normal(key, (5,5))
2019-07-08 08:40:15.551685: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:129]
Can't find libdevice directory ${CUDA_DIR}/nvvm/libdevice. This may result
in compilation or runtime failures, if the program we try to run uses
routines from libdevice.
2019-07-08 08:40:15.551706: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:130]
Searched for CUDA in the following directories:
2019-07-08 08:40:15.551710: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133]
/home/murphyk/miniconda3/lib
2019-07-08 08:40:15.551713: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133]
/usr/local/cuda
2019-07-08 08:40:15.551715: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:133]
.
2019-07-08 08:40:15.551718: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:135]
You can choose the search directory by setting xla_gpu_cuda_data_dir in
HloModule's DebugOptions. For most apps, setting the environment variable
XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
2019-07-08 08:40:16.203818: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105]
Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
Traceback (most recent call last):
File "<ipython-input-4-d8a87c178f8a>", line 3, in <module>
x = random.normal(key, (5,5))
File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/random.py",
line 389, in normal
return _normal(key, shape, dtype)
File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/api.py",
line 123, in f_jitted
out = xla.xla_call(flat_fun, *args_flat, device_values=device_values)
File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/core.py",
line 663, in call_bind
ans = primitive.impl(f, *args, **params)
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
line 606, in xla_call_impl
compiled_fun = xla_callable(fun, device_values, *map(abstractify, args))
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/linear_util.py",
line 208, in memoized_fun
ans = call(f, *args)
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
line 621, in xla_callable
compiled, result_shape = compile_jaxpr(jaxpr, consts, *abstract_args)
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
line 207, in compile_jaxpr
backend=xb.get_backend()), result_shape
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py",
line 535, in Compile
return backend.compile(self.computation, compile_options)
File
"/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py",
line 118, in compile
compile_options.device_assignment)
RuntimeError: Not found: ./libdevice.compute_20.10.bc not found
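The warning above says XLA looks for `${CUDA_DIR}/nvvm/libdevice` in each searched directory. A minimal sketch for checking which candidate directories have that layout follows. The helper names `libdevice_dir` and `find_usable_cuda_dirs` are made up for illustration, and the check assumes XLA only needs a `libdevice*.bc` file inside that subdirectory.

```python
import os


def libdevice_dir(cuda_dir):
    """Path where, per the warning text, XLA expects libdevice to live."""
    return os.path.join(cuda_dir, "nvvm", "libdevice")


def find_usable_cuda_dirs(candidates):
    """Return the candidates that contain nvvm/libdevice/libdevice*.bc."""
    usable = []
    for cuda_dir in candidates:
        target = libdevice_dir(cuda_dir)
        if not os.path.isdir(target):
            continue
        # Accept any bitcode file whose name starts with "libdevice".
        names = os.listdir(target)
        if any(n.startswith("libdevice") and n.endswith(".bc") for n in names):
            usable.append(cuda_dir)
    return usable


if __name__ == "__main__":
    # The three directories listed in the warning output above.
    searched = ["/home/murphyk/miniconda3/lib", "/usr/local/cuda", "."]
    print(find_usable_cuda_dirs(searched))
```

If the list comes back empty, none of the searched directories has the layout the warning describes, even when a `libdevice.10.bc` file exists elsewhere on disk.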
|
One more thing. When I turn my GPU off (*), JAX falls back to CPU and then
it works like a charm.
It would be great if the user could choose which mode JAX uses from within
Python, without having to turn the GPU off.
(*) I don't literally turn the GPU off - I don't even know how to do that!
I simply open a browser before I open python.
The browser seems to "lock up" the GPU and then all of torch, tf or jax say
the GPU is unavailable.
This is actually pretty annoying, and the only fix I have found so far
seems to be to type 'shutdown -h now'
and then start my python IDE before my browser. (Do you know a better way?)
But this is not a JAX issue, of course :)
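For choosing the backend from within Python, here is a sketch of one possible approach. It assumes the installed JAX honors the `JAX_PLATFORM_NAME` environment variable (worth verifying for your version). `CUDA_VISIBLE_DEVICES` is the standard CUDA variable for hiding GPUs from a process. The helper name `request_cpu_backend` is hypothetical.

```python
import os


def request_cpu_backend(env=os.environ):
    """Ask for the CPU backend. Call this before `import jax`."""
    # Assumption: this JAX version reads JAX_PLATFORM_NAME at import time.
    env["JAX_PLATFORM_NAME"] = "cpu"
    # An empty value hides all CUDA devices from this process.
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


if __name__ == "__main__":
    request_cpu_backend()
    # `import jax` would go here, after the environment is set.
    print(os.environ["JAX_PLATFORM_NAME"])
```

Both variables only take effect if they are set before the library is first imported in the process.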
|
To summarize, this is the error JAX produces:
2019-07-08 10:24:47.184009: W
external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105]
Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
...
RuntimeError: Not found: ./libdevice.compute_20.10.bc not found
I do have libdevice.10.bc installed (in several places):
locate libdevice
/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc
/home/murphyk/miniconda3/lib/libdevice.10.bc
I've tried setting
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/home/murphyk/miniconda3/lib"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr/lib"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr/lib/nvidia-cuda-toolkit"
#os.environ["XLA_FLAGS"]="--xla_gpu_cuda_data_dir=/usr"
None of these work. I decided to look at the file nvptx_backend_lib.cc (possibly
installed by TF rather than JAX; I'm not sure).
locate nvptx_backend_lib.cc
/home/murphyk/.cache/bazel/_bazel_murphyk/87cf5205cc24305d7da6a4fb49af7044/external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc
I searched it for the string "Defaulting to libdevice..."
but could not find it! Instead I found
"Defaulting to telling LLVM that we're compiling for sm_"
<< sm_version;
The source code actually lists {7,5} as a valid combination,
so I don't know why JAX says "Unknown compute capability (7, 5)": it's in
the lookup table.
Maybe the JAX wheel was compiled with an older version of this file?
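Since the earlier warning mentions `${CUDA_DIR}/nvvm/libdevice`, and none of the directories I tried contains an `nvvm/libdevice` subdirectory, one thing that might be worth trying is a staging directory with that layout, symlinked to an existing libdevice folder. This is only a sketch: the helper names `stage_cuda_data_dir` and `xla_flags_for` are made up, and I have not confirmed that this resolves the compute_20 fallback.

```python
import os


def stage_cuda_data_dir(libdevice_src, staging_root):
    """Create staging_root/nvvm/libdevice as a symlink to libdevice_src."""
    nvvm = os.path.join(staging_root, "nvvm")
    os.makedirs(nvvm, exist_ok=True)
    link = os.path.join(nvvm, "libdevice")
    # Leave any existing file, directory, or link untouched.
    if not os.path.lexists(link):
        os.symlink(os.path.abspath(libdevice_src), link)
    return staging_root


def xla_flags_for(cuda_dir):
    """Build the XLA_FLAGS value named in the warning message."""
    return "--xla_gpu_cuda_data_dir=" + cuda_dir


if __name__ == "__main__":
    src = "/usr/lib/nvidia-cuda-toolkit/libdevice"  # from `locate libdevice`
    root = os.path.expanduser("~/xla_cuda")  # hypothetical staging location
    if os.path.isdir(src):
        os.environ["XLA_FLAGS"] = xla_flags_for(stage_cuda_data_dir(src, root))
        print(os.environ["XLA_FLAGS"])
```

The flag has to be set before `import jax`, as in the session above.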
> .
>
> 2019-07-08 08:40:15.551718: W
> external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:135]
> You can choose the search directory by setting xla_gpu_cuda_data_dir in
> HloModule's DebugOptions. For most apps, setting the environment variable
> XLA_FLAGS=--xla_gpu_cuda_data_dir=/path/to/cuda will work.
>
> 2019-07-08 08:40:16.203818: W
> external/org_tensorflow/tensorflow/compiler/xla/service/gpu/llvm_gpu_backend/nvptx_backend_lib.cc:105]
> Unknown compute capability (7, 5) .Defaulting to libdevice for compute_20
>
> Traceback (most recent call last):
>
>
> File "<ipython-input-4-d8a87c178f8a>", line 3, in <module>
>
> x = random.normal(key, (5,5))
>
>
> File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/random.py",
> line 389, in normal
>
> return _normal(key, shape, dtype)
>
>
> File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/api.py",
> line 123, in f_jitted
>
> out = xla.xla_call(flat_fun, *args_flat, device_values=device_values)
>
>
> File "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/core.py",
> line 663, in call_bind
>
> ans = primitive.impl(f, *args, **params)
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
> line 606, in xla_call_impl
>
> compiled_fun = xla_callable(fun, device_values, *map(abstractify, args))
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/linear_util.py",
> line 208, in memoized_fun
>
> ans = call(f, *args)
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
> line 621, in xla_callable
>
> compiled, result_shape = compile_jaxpr(jaxpr, consts, *abstract_args)
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jax/interpreters/xla.py",
> line 207, in compile_jaxpr
>
> backend=xb.get_backend()), result_shape
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py",
> line 535, in Compile
>
> return backend.compile(self.computation, compile_options)
>
>
> File
> "/home/murphyk/miniconda3/lib/python3.7/site-packages/jaxlib/xla_client.py",
> line 118, in compile
>
> compile_options.device_assignment)
>
>
> RuntimeError: Not found: ./libdevice.compute_20.10.bc not found
|
The folks at Lambda (maker of my TensorBook laptop) looked at the source
code and suggested this fix:
```
mkdir -p ~/xla/nvvm/libdevice
cp /usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc ~/xla/nvvm/libdevice
export XLA_FLAGS="--xla_gpu_cuda_data_dir=/home/murphyk/xla"
```
This actually works :)
Maybe worth updating the set of locations that JAX searches for
libdevice.10.bc?
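The same staging step can be sketched from Python, for anyone who prefers to do it inside a script. This is only an illustration of the shell commands above. `stage_libdevice` is a made-up helper name, and the paths in the usage comment are the ones from this comment.

```python
import shutil
from pathlib import Path


def stage_libdevice(bitcode_file, cuda_root):
    """Copy bitcode_file into cuda_root/nvvm/libdevice and return the XLA flag.

    Hypothetical helper that reproduces the mkdir/cp/export steps above.
    """
    target_dir = Path(cuda_root) / "nvvm" / "libdevice"
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as `mkdir -p`
    shutil.copy(bitcode_file, target_dir)
    return f"--xla_gpu_cuda_data_dir={Path(cuda_root)}"


# Usage, before `import jax`:
#   import os
#   os.environ["XLA_FLAGS"] = stage_libdevice(
#       "/usr/lib/nvidia-cuda-toolkit/libdevice/libdevice.10.bc",
#       Path.home() / "xla",
#   )
```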
|
I'm getting this same error with python3.7 and CUDA 10.0. It seems like it doesn't actually check CUDA_DIR? Symlinking my CUDA_DIR to /usr/local/cuda solved the problem. |
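A minimal sketch of the symlink workaround in Python, assuming the link target `/usr/local/cuda` mentioned above. `link_cuda_root` is a made-up helper name, and creating a link under `/usr/local` normally needs root privileges.

```python
import os


def link_cuda_root(cuda_root, link_path="/usr/local/cuda"):
    """Create link_path as a symlink to cuda_root unless link_path exists.

    Returns True if a new symlink was created, False if link_path was
    already present (an existing install or link is left untouched).
    """
    if os.path.lexists(link_path):
        return False
    os.symlink(cuda_root, link_path)
    return True
```

Refusing to overwrite an existing path is a deliberate choice here, so that a working CUDA install at the link location is never replaced by accident.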
Same problem here. My cuda installation is not in |
murphyk's fix worked for me, but it's rather ugly. I hope a better solution can be found soon :) thanks |
I tried murphyk's fix but I still get |
Here's my understanding of this issue: jax depends on XLA, which is built as part of TF and bundled up into the So symlinking your cuda install to Alternatively, setting the environment variable Is anyone still having problems after trying these methods? We should also potentially make a jax-specific environment variable to set a custom cuda install path, or at least document the XLA_FLAGS one more clearly... I can do that once we verify this actually works. |
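For the environment-variable route, here is a minimal sketch that sets `--xla_gpu_cuda_data_dir` in `XLA_FLAGS` without discarding other flags that may already be there. `add_cuda_data_dir` is a made-up helper name. The sketch assumes flags are separated by whitespace and that the path contains no spaces.

```python
import os


def add_cuda_data_dir(cuda_dir, env=None):
    """Set --xla_gpu_cuda_data_dir in XLA_FLAGS, keeping any other flags.

    Hypothetical helper; `env` defaults to os.environ and can be any
    dict-like mapping.
    """
    env = os.environ if env is None else env
    prefix = "--xla_gpu_cuda_data_dir="
    # Drop a previous value of this flag, keep everything else.
    kept = [f for f in env.get("XLA_FLAGS", "").split() if not f.startswith(prefix)]
    env["XLA_FLAGS"] = " ".join(kept + [prefix + str(cuda_dir)])
    return env["XLA_FLAGS"]
```

As in the snippets earlier in this thread, this would be called before `import jax`.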
Maybe I have a different issue then... I tried both symlinking and setting Full stack trace:
|
Ok upon looking at the stack trace again, it looks like XLA is searching in Anyone know why it's searching for
|
@skye Both those methods are working for me. As @KeAWang documented, I'm also seeing that in the absence of XLA_FLAGS info it will look in /usr/local/cuda-XXX, depending on the CUDA version. Would be great if the XLA folks could either actually check CUDA_DIR or simply not have an error message claiming to do so. |
Thanks @KeAWang and @iamlemec. Agreed this could be much clearer. I've filed an internal bug against XLA with your suggestions and some of my own :) These are the suggestions:
Please comment if I should correct anything or you have other suggestions! |
I personally have cuda in |
I also have it in |
Hi All, 2020-04-29 11:18:32.823934: W external/org_tensorflow/tensorflow/compiler/xla/service/gpu/nvptx_compiler.cc:77] Searched for CUDA in the following directories: I was wondering if anyone has any more information about this output? Running jax code seems to "work" as it does not throw an error, but I'm not sure if the GPU is actually being used. Thanks for reading! |
Does |
For people running into this problem after an install of Ubuntu 20.04 with Ubuntu's cuda toolkit package, KeAWang's suggestion works but you need cuda-10.1 instead:
|
Thanks @murphyk It worked for me. |
you may try my code above to check whether gpu is used. |
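For anyone unsure whether the GPU is being picked up, here is a minimal sketch. It assumes a recent JAX release where `jax.default_backend()` and `jax.devices()` are available (the 0.1.x snippets earlier in this thread used `xla_bridge.get_backend().platform` instead). `backend_summary` is a made-up helper name.

```python
def backend_summary():
    """Return JAX's default backend and device list, or None if JAX is missing."""
    try:
        import jax
    except ImportError:
        return None
    return {
        "backend": jax.default_backend(),  # platform name as a string
        "devices": [str(d) for d in jax.devices()],
    }


if __name__ == "__main__":
    print(backend_summary())
```

If the reported backend is `cpu` on a machine with a GPU, JAX has most likely fallen back to the CPU.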
@skye, setting XLA_FLAGS works for me. I believe this is a very important piece of information that should be added to the installation section of the README as soon as possible, since in most setups the CUDA installation is not in the default path XLA searches. And the error is confusing unless users can find this issue :) |
I have a conda environment for
even after setting My CUDA version is
Nvidia driver version is |
Do you get the same error without optirun? Also, can you try creating a symlink as described above and in the README? This will help narrow down where the problem is. |
@v-i-s-h Were you ever able to resolve this issue by setting the anaconda path? That's what I've been trying to do but it hasn't been working. |
@coded5282 Nope. I am still getting the same error. |
Just to add: Same error as everyone else. Tried setting the XLA_FLAGS environment variable; didn't work. However, adding the symlink did (for me: Like @KeAWang and several others I'm on Arch and installed CUDA through the AUR. |
First check where your CUDA installation resides.
For instance, if the above command spits out
But JAX, by default, looks for CUDA installation in
The above steps, taken together in that order, should solve the issue; at least they did for me. Note: Please keep in mind that the CUDA version (e.g., 11.0) of JAX binary that is installed on your machine should also be compiled for the specific CUDA installation in |
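The "find where CUDA lives" step can also be sketched in Python. This is a rough illustration, assuming a usable CUDA root is one that contains `nvvm/libdevice/libdevice*.bc`. `find_cuda_root` and the candidate paths in the usage comment are made up for the example.

```python
from pathlib import Path


def find_cuda_root(candidates):
    """Return the first candidate containing nvvm/libdevice/libdevice*.bc.

    Hypothetical helper; returns None if no candidate has that layout.
    """
    for root in map(Path, candidates):
        libdevice_dir = root / "nvvm" / "libdevice"
        if libdevice_dir.is_dir() and any(libdevice_dir.glob("libdevice*.bc")):
            return root
    return None


# Usage (example paths only, adjust to your system):
#   find_cuda_root(["/usr/local/cuda", "/opt/cuda", "/usr/lib/cuda"])
```

The returned directory is the one to symlink or to pass via `--xla_gpu_cuda_data_dir`.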
Hi Everyone, I still have an issue with this. I get this error:
I tried setting the XLA flag but I still have the same issue:
|
This problem is addressed here: the instruction needs to clarify that one should make |
Good news! As of jaxlib 0.1.66, which was just released yesterday, we now bundle |
Hi
Jax cannot find libdevice.
I'm running Python 3.7 with cuda 10.0 on my personal laptop with a GeForce RTX 2080.
I installed jax using pip.
I made a little test script shown below
The output is shown below.