

Stumbling with TensorFlow (what does "Out of GPU memory" mean?)

Posted at 2015-12-03







$ python test.py 
I tensorflow/core/common_runtime/local_device.cc:25] Local device intra op parallelism threads: 12
I tensorflow/core/common_runtime/gpu/gpu_init.cc:88] Found device 0 with properties: 
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:05:00.0
Total memory: 11.99GiB
Free memory: 11.47GiB
I tensorflow/core/common_runtime/gpu/gpu_init.cc:112] DMA: 0 
I tensorflow/core/common_runtime/gpu/gpu_init.cc:122] 0:   Y 
I tensorflow/core/common_runtime/gpu/gpu_device.cc:643] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:05:00.0)
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:47] Setting region size to 11701021287
I tensorflow/core/common_runtime/local_session.cc:45] Local session inter op parallelism threads: 12
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 256 (256B) Pool: chunks: 64 free: 24 cumulative malloc: 134728 cumulative freed: 134688
Number of chunks: 64, in_use chunks: 40
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 4096 (4.0KiB) Pool: chunks: 8 free: 2 cumulative malloc: 2812 cumulative freed: 2806
Number of chunks: 8, in_use chunks: 6
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 8192 (8.0KiB) Pool: chunks: 8 free: 3 cumulative malloc: 2814 cumulative freed: 2809
Number of chunks: 8, in_use chunks: 5
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 16384 (16.0KiB) Pool: chunks: 8 free: 3 cumulative malloc: 11233 cumulative freed: 11228
Number of chunks: 8, in_use chunks: 5
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 65536 (64.0KiB) Pool: chunks: 16 free: 16 cumulative malloc: 44896 cumulative freed: 44896
Number of chunks: 16, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 98304 (96.0KiB) Pool: chunks: 8 free: 8 cumulative malloc: 11224 cumulative freed: 11224
Number of chunks: 8, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 131072 (128.0KiB) Pool: chunks: 4 free: 4 cumulative malloc: 14030 cumulative freed: 14030
Number of chunks: 4, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 212992 (208.0KiB) Pool: chunks: 8 free: 3 cumulative malloc: 11232 cumulative freed: 11227
Number of chunks: 8, in_use chunks: 5
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 229376 (224.0KiB) Pool: chunks: 2 free: 1 cumulative malloc: 2 cumulative freed: 1
Number of chunks: 2, in_use chunks: 1
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 262144 (256.0KiB) Pool: chunks: 8 free: 8 cumulative malloc: 16836 cumulative freed: 16836
Number of chunks: 8, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 425984 (416.0KiB) Pool: chunks: 1 free: 1 cumulative malloc: 2806 cumulative freed: 2806
Number of chunks: 1, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 524288 (512.0KiB) Pool: chunks: 8 free: 8 cumulative malloc: 25254 cumulative freed: 25254
Number of chunks: 8, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 1048576 (1.00MiB) Pool: chunks: 8 free: 8 cumulative malloc: 25254 cumulative freed: 25254
Number of chunks: 8, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 13631488 (13.00MiB) Pool: chunks: 8 free: 3 cumulative malloc: 2814 cumulative freed: 2809
Number of chunks: 8, in_use chunks: 5
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 268435456 (256.00MiB) Pool: chunks: 1 free: 1 cumulative malloc: 1 cumulative freed: 1
Number of chunks: 1, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 369098752 (352.00MiB) Pool: chunks: 1 free: 1 cumulative malloc: 1 cumulative freed: 1
Number of chunks: 1, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 738197504 (704.00MiB) Pool: chunks: 1 free: 0 cumulative malloc: 1 cumulative freed: 0
Number of chunks: 1, in_use chunks: 1
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 1476395008 (1.38GiB) Pool: chunks: 0 free: 0 cumulative malloc: 0 cumulative freed: 0
Number of chunks: 0, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:339] Chunk size: 2952790016 (2.75GiB) Pool: chunks: 3 free: 3 cumulative malloc: 3 cumulative freed: 3
Number of chunks: 3, in_use chunks: 0
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:345] Aggregate Region Memory: 11701021287 (10.90GiB)
I tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:347] Aggregate Chunk Memory: 10363027456 (9.65GiB)
W tensorflow/core/common_runtime/gpu/gpu_region_allocator.cc:89] Out of GPU memory, see memory state dump above
W tensorflow/core/kernels/conv_ops.cc:162] Resource exhausted: OOM when allocating tensor with shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }
W tensorflow/core/common_runtime/executor.cc:1027] 0x10426540 Compute status: Resource exhausted: OOM when allocating tensor with shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }
     [[Node: conv2/Conv2D = Conv2D[T=DT_FLOAT, padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pool1/MaxPool, conv2/Variable)]]
W tensorflow/core/common_runtime/executor.cc:1027] 0x127a7090 Compute status: Resource exhausted: OOM when allocating tensor with shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }
     [[Node: conv2/Conv2D = Conv2D[T=DT_FLOAT, padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pool1/MaxPool, conv2/Variable)]]
     [[Node: range_1/_15 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_394_range_1", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
W tensorflow/core/common_runtime/executor.cc:1027] 0x127a7090 Compute status: Resource exhausted: OOM when allocating tensor with shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }
     [[Node: conv2/Conv2D = Conv2D[T=DT_FLOAT, padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pool1/MaxPool, conv2/Variable)]]
     [[Node: Cast/_13 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_393_Cast", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Traceback (most recent call last):
  File "img_ditect_train.py", line 229, in <module>
    keep_prob: 1.0})
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 345, in run
    results = self._do_run(target_list, unique_fetch_targets, feed_dict_string)
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 419, in _do_run
tensorflow.python.framework.errors.ResourceExhaustedError: OOM when allocating tensor with shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }
     [[Node: conv2/Conv2D = Conv2D[T=DT_FLOAT, padding="SAME", strides=[1, 1, 1, 1], use_cudnn_on_gpu=true, _device="/job:localhost/replica:0/task:0/gpu:0"](pool1/MaxPool, conv2/Variable)]]
     [[Node: range_1/_15 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_394_range_1", tensor_type=DT_INT32, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
Caused by op u'conv2/Conv2D', defined at:
  File "test.py", line 196, in <module>
    logits = inference(images_placeholder, keep_prob)
  File "test.py", line 70, in inference
    h_conv2 = tf.nn.relu(conv2d(h_pool1, W_conv2) + b_conv2)
  File "test.py", line 46, in conv2d
    return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/ops/gen_nn_ops.py", line 207, in conv2d
    use_cudnn_on_gpu=use_cudnn_on_gpu, name=name)
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/ops/op_def_library.py", line 633, in apply_op
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1710, in create_op
    original_op=self._default_original_op, op_def=op_def)
  File "/home/tensorflow-GPU/local/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 988, in __init__
    self._traceback = _extract_stack()
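
To get a feel for how large the tensor named in the OOM message is, here is a minimal sketch of the arithmetic. It assumes 4 bytes per element, since the node is reported as `T=DT_FLOAT`. The helper name `tensor_bytes` is made up for illustration. The two memory figures are copied from the "Aggregate Region Memory" and "Aggregate Chunk Memory" lines of the log above.

```python
# Rough size of the tensor that failed to allocate, using the shape from the log.
# Assumption: DT_FLOAT is a 32-bit float, i.e. 4 bytes per element.

def tensor_bytes(shape, bytes_per_element=4):
    """Return the number of bytes needed for a dense tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count * bytes_per_element

shape = (28060, 14, 14, 64)   # from "OOM when allocating tensor with shape..."
needed = tensor_bytes(shape)
region = 11701021287          # "Aggregate Region Memory" in the log
chunks = 10363027456          # "Aggregate Chunk Memory" in the log
remaining = region - chunks

print("tensor: %d bytes (%.2f GiB)" % (needed, needed / 2.0**30))
print("region minus chunk memory: %d bytes (%.2f GiB)" % (remaining, remaining / 2.0**30))
```

Under that assumption, this single output tensor of `conv2/Conv2D` comes to about 1.31 GiB, which is more than the roughly 1.25 GiB difference between the aggregate region and chunk figures. How the allocator actually uses the remaining region cannot be read from the log alone, so treat this as a plausibility check rather than a diagnosis.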




MATS kindly suggested: "Why not try reducing the size in the `shapedim { size: 28060 } dim { size: 14 } dim { size: 14 } dim { size: 64 }` part?" Following that advice, I focused on the `{ size: 28060 }` part.
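
The first dimension, 28060, looks like the number of images fed in a single `sess.run` call. This is an assumption: the traceback ends in a `feed_dict` with `keep_prob: 1.0`, which suggests an evaluation over the whole dataset at once. One way to shrink that dimension is to feed the data in smaller slices and combine the results. The following is only a sketch of that idea. `batched_accuracy` and `accuracy_fn` are hypothetical names that do not appear in the original script, and `accuracy_fn` stands in for something like a wrapper around `sess.run(accuracy, feed_dict={...})`.

```python
def batched_accuracy(accuracy_fn, images, labels, batch_size=100):
    """Evaluate accuracy over (images, labels) in slices of at most batch_size.

    accuracy_fn(batch_images, batch_labels) is a hypothetical callable that
    returns the mean accuracy for one batch, for example a wrapper around
    sess.run(accuracy, feed_dict={...}).
    """
    total = len(images)
    if total == 0:
        raise ValueError("no examples to evaluate")
    correct = 0.0
    for start in range(0, total, batch_size):
        end = min(start + batch_size, total)
        # Weight each batch by its size so a short last batch is handled correctly.
        correct += accuracy_fn(images[start:end], labels[start:end]) * (end - start)
    return correct / total


# Toy stand-in so the sketch runs without TensorFlow: a "prediction" counts
# as correct when the image value equals the label.
def toy_accuracy(batch_images, batch_labels):
    hits = sum(1 for x, y in zip(batch_images, batch_labels) if x == y)
    return hits / float(len(batch_images))


images = list(range(10))
labels = [0, 1, 2, 3, 4, 0, 0, 0, 0, 0]
print(batched_accuracy(toy_accuracy, images, labels, batch_size=4))
```

With this approach the leading dimension of every intermediate tensor is `batch_size` instead of the full dataset size, so the per-call memory requirement shrinks proportionally. A suitable `batch_size` depends on the model and the GPU and has to be found by trial.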





