pip install of tensorflow-text disables tensorflow-gpu #195

eduardofv opened this issue Dec 6, 2019 · 5 comments

It seems that a pip install of tensorflow-text>=2.0.0rc0 also installs tensorflow 2.x. If you previously had tensorflow-gpu installed, the newly pulled-in CPU-only tensorflow package disables GPU access.
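
A quick way to confirm what happened outside of Docker is to list the installed TensorFlow distributions and re-run the GPU check. This is a minimal sketch of my own (not part of the original report), using pkg_resources and the same tf.test.is_gpu_available() call as the reproduction below:

import pkg_resources
import tensorflow as tf

# List every installed distribution whose name starts with "tensorflow";
# after installing tensorflow-text you will typically see both "tensorflow"
# and "tensorflow-gpu" side by side.
for dist in pkg_resources.working_set:
    if dist.project_name.startswith("tensorflow"):
        print(dist.project_name, dist.version)

# Same check as used in the reproduction steps below.
print("GPU available:", tf.test.is_gpu_available())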

Steps to reproduce:

1. Build a new Docker image with tf-gpu. Dockerfile:
FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
WORKDIR /root

Build with docker build -t prueba .

Test correct GPU access:

$ docker run --runtime=nvidia --rm -it prueba:latest  python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
2019-12-06 18:55:03.225780: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-06 18:55:03.252825: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2592000000 Hz
2019-12-06 18:55:03.253637: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x43890e0 executing computations on platform Host. Devices:
2019-12-06 18:55:03.253666: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-12-06 18:55:03.256129: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-06 18:55:03.361065: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.362691: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4460890 executing computations on platform CUDA. Devices:
2019-12-06 18:55:03.362794: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 960M, Compute Capability 5.0
2019-12-06 18:55:03.363437: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.364914: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:02:00.0
2019-12-06 18:55:03.365825: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 18:55:03.370683: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-06 18:55:03.373270: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-06 18:55:03.373944: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-06 18:55:03.376933: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-06 18:55:03.378858: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-06 18:55:03.384248: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-06 18:55:03.384400: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.384853: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.385166: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-06 18:55:03.385213: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 18:55:03.385816: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-06 18:55:03.385834: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2019-12-06 18:55:03.385842: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2019-12-06 18:55:03.385961: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.386314: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 18:55:03.386633: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 3330 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0, compute capability: 5.0)
True
2. New Dockerfile with tensorflow-text:
FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
WORKDIR /root

RUN pip install "tensorflow-text>=2.0.0rc0"

Build and test... no GPU:

$ docker run --runtime=nvidia --rm -it prueba:latest  python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
2019-12-06 19:02:11.488695: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-06 19:02:11.512879: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2592000000 Hz
2019-12-06 19:02:11.513972: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x3726f30 executing computations on platform Host. Devices:
2019-12-06 19:02:11.514007: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
False
3. Workaround: install tensorflow-text with --user, then uninstall the CPU-only tensorflow package. Dockerfile:
FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
WORKDIR /root

RUN pip install --user "tensorflow-text>=2.0.0rc0"
RUN pip uninstall -y tensorflow

Build and test; tensorflow-gpu works:

$ docker build -t prueba -f Dockerfile .
Sending build context to Docker daemon  4.096kB
Step 1/4 : FROM tensorflow/tensorflow:latest-gpu-py3-jupyter
 ---> 88178d65d12c
Step 2/4 : WORKDIR /root
 ---> Using cache
 ---> 39616c78086e
Step 3/4 : RUN pip install --user tensorflow-text>=2.0.0rc0
 ---> Running in a08e37ee49da
  WARNING: The scripts saved_model_cli, tensorboard, tf_upgrade_v2, tflite_convert, toco and toco_from_protos are installed in '/root/.local/bin' which is not on PATH.
  Consider adding this directory to PATH or, if you prefer to suppress this warning, use --no-warn-script-location.
WARNING: You are using pip version 19.2.3, however version 19.3.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Removing intermediate container a08e37ee49da
 ---> d5c415130f01
Step 4/4 : RUN pip uninstall -y tensorflow
 ---> Running in 23e3d7e9a6a8
Uninstalling tensorflow-2.0.0:
  Successfully uninstalled tensorflow-2.0.0
Removing intermediate container 23e3d7e9a6a8
 ---> 7b91c3f6eeed
Successfully built 7b91c3f6eeed
Successfully tagged prueba:latest
$ docker run --runtime=nvidia --rm -it prueba:latest  python -c "import tensorflow as tf; print(tf.test.is_gpu_available())"
2019-12-06 19:03:36.662969: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-12-06 19:03:36.688813: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2592000000 Hz
2019-12-06 19:03:36.689431: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4c00060 executing computations on platform Host. Devices:
2019-12-06 19:03:36.689461: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): Host, Default Version
2019-12-06 19:03:36.691759: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-12-06 19:03:36.733931: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.734501: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4cd7810 executing computations on platform CUDA. Devices:
2019-12-06 19:03:36.734534: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce GTX 960M, Compute Capability 5.0
2019-12-06 19:03:36.734750: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.735122: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties: 
name: GeForce GTX 960M major: 5 minor: 0 memoryClockRate(GHz): 1.176
pciBusID: 0000:02:00.0
2019-12-06 19:03:36.735374: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 19:03:36.737036: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcublas.so.10.0
2019-12-06 19:03:36.738244: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcufft.so.10.0
2019-12-06 19:03:36.738574: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcurand.so.10.0
2019-12-06 19:03:36.740186: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusolver.so.10.0
2019-12-06 19:03:36.741472: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcusparse.so.10.0
2019-12-06 19:03:36.745057: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudnn.so.7
2019-12-06 19:03:36.745242: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.745721: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.746093: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1746] Adding visible gpu devices: 0
2019-12-06 19:03:36.746178: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcudart.so.10.0
2019-12-06 19:03:36.746822: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-12-06 19:03:36.746837: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]      0 
2019-12-06 19:03:36.746847: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1178] 0:   N 
2019-12-06 19:03:36.747053: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.747601: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-12-06 19:03:36.748010: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1304] Created TensorFlow device (/device:GPU:0 with 3330 MB memory) -> physical GPU (device: 0, name: GeForce GTX 960M, pci bus id: 0000:02:00.0, compute capability: 5.0)
True

broken (Member) commented Dec 9, 2019

Yeah; the pip install mechanism has its shortcomings and doesn't appear to be designed for relationships like the one between tf.text and tf. pip also forces the version of tf based on the tf.text version. We would like pip installs to check the installed version of TF and install the appropriate version of TF Text.

I'm not intimately familiar with the pip installation mechanism, but I think this may be impossible, since each setup.py is unique per version: by the time it runs at installation, pip has already decided on a version. I think the best we could do is hard-fail the installation with instructions on what command to actually run.

So if you have tensorflow-gpu 1.15 installed and run pip install tensorflow_text, pip will still blindly select the tf.text 2.0 version, but we have checks in there that will fail and request that you run pip install --gpu tensorflow_text==1.15 instead. We'd welcome advice if there is a better way.
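
For illustration, here is a rough sketch of the kind of hard failure described above. This is not the actual tensorflow_text check; the constant, message text, and package names probed are assumptions made for the sketch:

import pkg_resources

# Hypothetical constant for this sketch: the TF release this tf.text build targets.
EXPECTED_TF_PREFIX = "2.0"

def check_tf_version():
    """Fail loudly if the installed TensorFlow does not match this build."""
    tf_version = None
    for name in ("tensorflow", "tensorflow-gpu"):
        try:
            tf_version = pkg_resources.get_distribution(name).version
            break
        except pkg_resources.DistributionNotFound:
            continue
    if tf_version is None:
        raise ImportError("No TensorFlow installation found; install TensorFlow first.")
    if not tf_version.startswith(EXPECTED_TF_PREFIX):
        raise ImportError(
            "This tensorflow_text build expects TensorFlow %s.x but found %s; "
            "install the tensorflow_text release that matches your TensorFlow."
            % (EXPECTED_TF_PREFIX, tf_version)
        )

check_tf_version()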

eduardofv (Author) commented

I understand... maybe the solution would be to clarify this in the documentation. If I get some time I'll take a look at the docs and propose how it might work. Thanks!

VinuraD commented Jun 23, 2021

Hi, as I have mentioned in #644, when I install with the correct version number the GPU version doesn't get disabled, but I get an error as described in that issue.

marlon-shiftone commented Dec 13, 2022

Most of the problems come from the following (to make things easier, use conda and create a new environment):

First: running TensorFlow on your GPU requires the correct cudatoolkit and cuDNN versions. To avoid any problems, use conda install tensorflow-gpu and Anaconda will choose the right versions for you.

Second: print and check the TensorFlow version in your new environment. Once that's done, go to the terminal and type: pip install tensorflow-text==(your current TF version). Most of the problems come from this mismatch.

If you stop there, you will get a "Symbol not found" error. You then have to install the tf-models-official package by typing pip install tf-models-official==(your current TF version).

Finally, type conda list; if tensorflow, tensorflow-gpu, tensorflow-text, and tf-models-official all have the same version, it should work.
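
As a quick check of that version alignment, a small Python sketch of my own (the package names are the ones mentioned above) that prints the installed versions side by side, mirroring the conda list step:

import pkg_resources

# Print the installed version of each related package so you can confirm they line up.
for name in ("tensorflow", "tensorflow-gpu", "tensorflow-text", "tf-models-official"):
    try:
        print(name, pkg_resources.get_distribution(name).version)
    except pkg_resources.DistributionNotFound:
        print(name, "not installed")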

