Rubyで物体検出(ONNXRuntime + YOLOv3)

Last updated at 2019-09-01Posted at 2019-09-01

RubyでDeepLearningネタがめずらしくない気もする今日この頃ですが、気にせずにyoloをやりました。

https://commons.wikimedia.org/wiki/File:Arimatsushibori.JPG

ONNXRuntime

推論エンジン

マイクロソフト公式
- https://github.com/microsoft/onnxruntime
ankane氏によるRubyバインディング
- https://github.com/ankane/onnxruntime

yolov3

ONNXのモデル
- https://github.com/onnx/models/tree/master/vision/object_detection_segmentation/yolov3
公式
- https://pjreddie.com/darknet/yolo/

事前準備、ダウンロード

# model
wget https://onnxzoo.blob.core.windows.net/models/opset_10/yolov3/yolov3.onnx
# coco-labels
wget https://raw.githubusercontent.com/amikelive/coco-labels/master/coco-labels-2014_2017.txt
# gem
gem instal numo-narray mini_magick

yolov3.rb

require 'mini_magick'
require 'numo/narray'
require 'onnxruntime'

SFloat = Numo::SFloat

input_path = ARGV[0]
output_path = ARGV[1]

model = OnnxRuntime::Model.new('yolov3.onnx')
labels = File.readlines('coco-labels-2014_2017.txt')

# preprocessing
img = MiniMagick::Image.open(input_path)
image_size = [[img.height, img.width]]
img.combine_options do |b|
  b.resize '416x416'
  b.gravity 'center'
  b.background 'transparent'
  b.extent '416x416'
end
img_data = SFloat.cast(img.get_pixels)
img_data /= 255.0
image_data = img_data.transpose(2, 0, 1)
                     .expand_dims(0)
                     .to_a # NArray -> Array

# inference
output = model.predict(input_1: image_data, image_shape: image_size)

# postprocessing
boxes, scores, indices = output.values
results = indices.map do |idx|
  { class: idx[1],
    score: scores[idx[0]][idx[1]][idx[2]],
    box: boxes[idx[0]][idx[2]] }
end

# visualization
img = MiniMagick::Image.open(input_path)
img.colorspace 'gray'
results.each do |r|
  hue = r[:class] * 100 / 80.0
  label = labels[r[:class]]
  score = r[:score]
  # draw box
  y1, x1, y2, x2 = r[:box].map(&:round)
  img.combine_options do |c|
    c.draw        "rectangle #{x1}, #{y1}, #{x2}, #{y2}"
    c.fill        "hsla(#{hue}%, 20%, 80%, 0.25)"
    c.stroke      "hsla(#{hue}%, 70%, 60%, 1.0)"
    c.strokewidth (score * 3).to_s
  end
  # draw text
  img.combine_options do |c|
    c.draw "text #{x1}, #{y1 - 5} \"#{label}\""
    c.fill 'white'
    c.pointsize 18
  end
end

img.write output_path