How to deal with an OOM raised as tensorflow.python.framework.errors_impl.ResourceExhaustedError

Posted at 2020-05-27

What to do when the GPU runs out of memory while running Keras-Yolov3

When I tried to train a model with YOLOv3 (TensorFlow backend) on a GPU, training failed with an out-of-memory (OOM) error.
https://github.com/qqwweee/keras-yolo3

tensorflow.python.framework.errors_impl.ResourceExhaustedError: OOM when allocating tensor with shape[32,104,104,128] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc

Reducing the batch size from 32 to 8 in both training stages let training run without the error.
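
As a rough illustration of why a smaller batch helps, the sketch below estimates the size of the single tensor named in the error message. It assumes that "type float" means 32-bit floats (4 bytes per element); `tensor_bytes` is a hypothetical helper, not part of keras-yolo3 or TensorFlow. Total GPU usage is made up of many such tensors, so this is only a per-tensor estimate.

```python
from math import prod

# Assumption: "type float" in the error message is float32 (4 bytes per element).
BYTES_PER_FLOAT32 = 4


def tensor_bytes(shape, bytes_per_element=BYTES_PER_FLOAT32):
    """Return the memory needed for one dense tensor of the given shape."""
    return prod(shape) * bytes_per_element


# Shape reported in the error message; the first dimension is the batch size.
shape_from_error = (32, 104, 104, 128)
# The same tensor with the batch size reduced to 8.
shape_with_batch_8 = (8,) + shape_from_error[1:]

for shape in (shape_from_error, shape_with_batch_8):
    mib = tensor_bytes(shape) / 2**20
    print(f"{shape}: {mib:.2f} MiB")
```

Because the batch size is the leading dimension, this tensor shrinks in direct proportion to it, so a batch of 8 needs a quarter of the memory of a batch of 32.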

train.py

@@ -54,7 +54,7 @@ def _main():
             # use custom yolo_loss Lambda layer.
             'yolo_loss': lambda y_true, y_pred: y_pred})

-        batch_size = 32
+        batch_size = 8
         print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
         model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, anchors, num_classes),
                 steps_per_epoch=max(1, num_train//batch_size),
@@ -73,7 +73,7 @@ def _main():
         model.compile(optimizer=Adam(lr=1e-4), loss={'yolo_loss': lambda y_true, y_pred: y_pred}) # recompile to apply the change
         print('Unfreeze all of the layers.')

-        batch_size = 32 # note that more GPU memory is required after unfreezing the body
+        batch_size = 8 # note that more GPU memory is required after unfreezing the body
         print('Train on {} samples, val on {} samples, with batch size {}.'.format(num_train, num_val, batch_size))
         model.fit_generator(data_generator_wrapper(lines[:num_train], batch_size, input_shape, anchors, num_classes),
             steps_per_epoch=max(1, num_train//batch_size),
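
One side effect of this change is that `steps_per_epoch` is derived from the batch size, so each epoch now runs more, smaller steps. The sketch below mirrors the `max(1, num_train//batch_size)` expression from the diff; the sample count of 1000 is a made-up value for illustration, not a figure from the original run.

```python
def steps_per_epoch(num_train, batch_size):
    """Same expression as in train.py: at least one step per epoch."""
    return max(1, num_train // batch_size)


# Hypothetical number of training samples, chosen only for illustration.
num_train = 1000

for batch_size in (32, 8):
    steps = steps_per_epoch(num_train, batch_size)
    print(f"batch_size={batch_size}: {steps} steps per epoch")
```

The number of samples seen per epoch stays roughly the same; only the number of weight updates per epoch goes up.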
