
Commit c0f3d4b

Fix GPU CUDA out of memory error when workers_per_replica > 1 (#853)
1 parent 47caeeb commit c0f3d4b

File tree

2 files changed: +28 −0 lines


docs/deployments/gpus.md

Lines changed: 23 additions & 0 deletions
@@ -8,3 +8,26 @@ To use GPUs:
2. You may need to [file an AWS support ticket](https://console.aws.amazon.com/support/cases#/create?issueType=service-limit-increase&limitType=ec2-instances) to increase the limit for your desired instance type.
3. Set the instance type to an AWS GPU instance (e.g. g4dn.xlarge) when installing Cortex.
4. Set the `gpu` field in the `compute` configuration for your API. One unit of GPU corresponds to one virtual GPU; fractional requests are not allowed (a config sketch follows below).
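For illustration, here is a minimal sketch of what such an API configuration might look like. The API name and predictor path are hypothetical, and the exact schema may differ between Cortex versions:

```yaml
# cortex.yaml -- a sketch; "my-api" and "predictor.py" are hypothetical
- name: my-api
  predictor:
    type: python
    path: predictor.py
  compute:
    gpu: 1  # one virtual GPU; fractional requests are not allowed
```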

## Pitfalls

### If using `workers_per_replica` > 1, TensorFlow-based models, and the Python Predictor

When using `workers_per_replica` > 1 with TensorFlow-based models (including Keras) in the Python Predictor, loading the model in several processes at the same time will throw a `CUDA_ERROR_OUT_OF_MEMORY: out of memory` error. This happens because the first process to load the model allocates all of the GPU's memory, leaving none for the other processes. To prevent this, limit the per-process GPU memory usage. There are two methods:

1\) Configure the model to allocate only as much memory as it requires, via [tf.config.experimental.set_memory_growth()](https://www.tensorflow.org/api_docs/python/tf/config/experimental/set_memory_growth):

```python
import tensorflow as tf

for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.experimental.set_memory_growth(gpu, True)
```

2\) Impose a hard limit on how much memory the model can use, via [tf.config.set_logical_device_configuration()](https://www.tensorflow.org/api_docs/python/tf/config/set_logical_device_configuration):

```python
import tensorflow as tf

mem_limit_mb = 1024
for gpu in tf.config.list_physical_devices("GPU"):
    tf.config.set_logical_device_configuration(gpu, [tf.config.LogicalDeviceConfiguration(memory_limit=mem_limit_mb)])
```

See the [TensorFlow GPU guide](https://www.tensorflow.org/guide/gpu) and this [blog post](https://medium.com/@starriet87/tensorflow-2-0-wanna-limit-gpu-memory-10ad474e2528) for additional information.

examples/tensorflow/license-plate-reader/predictor_crnn.py

Lines changed: 5 additions & 0 deletions
@@ -5,10 +5,15 @@
```python
import keras_ocr
import base64
import pickle
import tensorflow as tf


class PythonPredictor:
    def __init__(self, config):
        # limit memory usage on each process
        for gpu in tf.config.list_physical_devices("GPU"):
            tf.config.experimental.set_memory_growth(gpu, True)

        # keras-ocr will automatically download pretrained
        # weights for the detector and recognizer.
        self.pipeline = keras_ocr.pipeline.Pipeline()
```
