Have I written custom code (as opposed to using a stock example script provided in TensorFlow):
OS: Ubuntu 18.04.4
TensorFlow installed from: pip3 install tensorflow-rocm
TensorFlow version: 2.2.0
Python version: 3.7
GPU model and memory: Radeon RX 580, 8 GB
ROCm version: 3.5.1
Describe the current behavior
When I import a pretrained model, or create any model that requires GPU operations, the process crashes with `Segmentation fault (core dumped)`.
Describe the expected behavior
The program should run without error.
Standalone code to reproduce the issue

    import tensorflow as tf
    from tensorflow.keras.applications.resnet50 import ResNet50

    model = ResNet50(weights='imagenet')

or

    import tensorflow as tf

    model = tf.keras.models.Sequential([
        tf.keras.layers.Flatten(input_shape=(28, 28)),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dropout(0.2),
        tf.keras.layers.Dense(10)
    ])

Both code snippets above crash with the same error when the model is created.
Here is the full error output:

    2020-06-25 02:57:19.930061: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libhip_hcc.so
    2020-06-25 02:57:19.985372: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1579] Found device 0 with properties: pciBusID: 0000:01:00.0 name: Ellesmere [Radeon RX 470/480/570/570X/580/580X] ROCm AMD GPU ISA: gfx803 coreClock: 1.34GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: -1B/s
    2020-06-25 02:57:20.033347: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
    2020-06-25 02:57:20.034196: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
    2020-06-25 02:57:20.038766: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
    2020-06-25 02:57:20.038977: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
    2020-06-25 02:57:20.039037: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
    2020-06-25 02:57:20.039253: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE3 SSE4.1 SSE4.2 AVX AVX2 FMA
    2020-06-25 02:57:20.043716: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 3600000000 Hz
    2020-06-25 02:57:20.043918: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556aef8ca0e0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2020-06-25 02:57:20.043929: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
    2020-06-25 02:57:20.045297: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x556aef8cbba0 initialized for platform ROCM (this does not guarantee that XLA will be used). Devices:
    2020-06-25 02:57:20.045326: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Ellesmere [Radeon RX 470/480/570/570X/580/580X], AMDGPU ISA version: gfx803
    2020-06-25 02:57:20.045435: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1579] Found device 0 with properties: pciBusID: 0000:01:00.0 name: Ellesmere [Radeon RX 470/480/570/570X/580/580X] ROCm AMD GPU ISA: gfx803 coreClock: 1.34GHz coreCount: 36 deviceMemorySize: 8.00GiB deviceMemoryBandwidth: -1B/s
    2020-06-25 02:57:20.045464: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocblas.so
    2020-06-25 02:57:20.045475: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libMIOpen.so
    2020-06-25 02:57:20.045484: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocfft.so
    2020-06-25 02:57:20.045493: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library librocrand.so
    2020-06-25 02:57:20.045523: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1703] Adding visible gpu devices: 0
    2020-06-25 02:57:20.777733: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1102] Device interconnect StreamExecutor with strength 1 edge matrix:
    2020-06-25 02:57:20.777775: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1108] 0
    2020-06-25 02:57:20.777780: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1121] 0: N
    2020-06-25 02:57:20.777909: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1247] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7399 MB memory) -> physical GPU (device: 0, name: Ellesmere [Radeon RX 470/480/570/570X/580/580X], pci bus id: 0000:01:00.0)
    Segmentation fault (core dumped)
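The log ends at the segfault with no Python traceback, so it does not show which call crashed. One possible way to narrow that down, offered as a sketch and not a confirmed diagnosis, is to run the reproduction in a child interpreter with Python's standard `faulthandler` enabled, which prints the Python stack when the process receives SIGSEGV. The helper names `run_isolated` and `died_from_segfault` are hypothetical, and the check on the return code assumes a POSIX system such as the Ubuntu install above.

```python
import signal
import subprocess
import sys


def run_isolated(snippet, timeout=300.0):
    """Run `snippet` in a fresh interpreter with faulthandler enabled.

    Returns (returncode, stderr). On POSIX, a negative return code -N
    means the child process was terminated by signal N.
    """
    proc = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,  # requires Python 3.7 or newer
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stderr


def died_from_segfault(returncode):
    """True if the child was killed by SIGSEGV (POSIX convention)."""
    return returncode == -signal.SIGSEGV


# Example use with the second snippet from this report (not executed here):
#
#   code = (
#       "import tensorflow as tf\n"
#       "tf.keras.models.Sequential([tf.keras.layers.Dense(10)])\n"
#   )
#   rc, err = run_isolated(code)
#   if died_from_segfault(rc):
#       print(err)  # faulthandler's traceback, if any, is on stderr
```

Running the crash in a child process keeps the parent alive, so the captured stderr can be attached to the issue alongside the log above.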
Instead of setting LD_LIBRARY_PATH, could you try setting ROCM_PATH, make sure you have the latest hip-rocclr installed, and see if that fixes the issue?
$ sudo apt-get install hip-rocclr
$ export ROCM_PATH=/opt/rocm
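Before re-running the reproduction, it may help to confirm what the Python process actually sees, since an `export` in one shell does not carry over to another. The following is a minimal sketch under that assumption; `check_rocm_env` is a hypothetical helper, and it only checks that `ROCM_PATH` is set and points at an existing directory, not that the ROCm install itself is healthy.

```python
import os


def check_rocm_env(env=None):
    """Return a list of problems found with ROCM_PATH (empty if none).

    `env` defaults to the current process environment; a plain dict can
    be passed in for testing.
    """
    env = os.environ if env is None else env
    rocm_path = env.get("ROCM_PATH")
    if not rocm_path:
        return ["ROCM_PATH is not set"]
    if not os.path.isdir(rocm_path):
        return ["ROCM_PATH={!r} is not a directory".format(rocm_path)]
    return []


if __name__ == "__main__":
    problems = check_rocm_env()
    for problem in problems:
        print("warning:", problem)
    if not problems:
        print("ROCM_PATH looks usable:", os.environ["ROCM_PATH"])
```

Run it from the same shell session that will launch TensorFlow, so the check reflects the environment of the failing process.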