Elixir Advent Calendar 2024

株式会社オーイーシー

Running object detection with YOLO Elixir in Livebook on Google Colaboratory

Posted at 2024-12-26

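Introduction

YOLO Elixir is a library that makes it easy to run YOLOv8 object detection from Elixir.

It defines the YOLO pre-processing and post-processing for you, so you can run object detection without thinking about the underlying tensor operations.

In this article, we run object detection with YOLO Elixir in a Livebook instance launched on Google Colaboratory.

The notebook implemented for this article is available here.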

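Environment

We use the T4 GPU that is available free of charge on Google Colaboratory.

For details on using a GPU from Livebook, see the following article.

Launching Livebook

Select T4 as the runtime type, then run the following cells in order.

Checking the OS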

!cat /etc/os-release

Checking the CUDA version

!nvcc --version

Checking the cuDNN version

!cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2

Updating cuDNN to the version that EXLA expects

!apt-get -y install cudnn9-cuda-12

!cat /usr/include/cudnn_version.h | grep CUDNN_MAJOR -A 2
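
The same check can also be scripted from a Python cell, for example to stop early if the update did not take effect. The following is a minimal sketch: the helper name `parse_cudnn_version` is hypothetical, and it assumes the header uses the `#define CUDNN_MAJOR <n>` layout that the grep above prints. The sample numbers are illustrative only, not output from a real run.

```python
import re


def parse_cudnn_version(header_text: str) -> tuple[int, int, int]:
    """Extract (major, minor, patch) from the text of cudnn_version.h."""
    parts = []
    for name in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        # Match lines such as "#define CUDNN_MAJOR 9" (assumed layout).
        match = re.search(rf"^#define\s+{name}\s+(\d+)\s*$", header_text, re.MULTILINE)
        if match is None:
            raise ValueError(f"{name} not found in header")
        parts.append(int(match.group(1)))
    return tuple(parts)


# Sample text in the assumed layout; the version numbers are made up.
sample = """
#define CUDNN_MAJOR 9
#define CUDNN_MINOR 1
#define CUDNN_PATCHLEVEL 0
"""
print(parse_cudnn_version(sample))  # (9, 1, 0)
```

In Colab, this could be called as `parse_cudnn_version(open("/usr/include/cudnn_version.h").read())` and the major version compared against 9.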

Cloning the YOLO Elixir repository

!git clone https://github.com/poeticoding/yolo_elixir.git

Converting the YOLOv8 models to ONNX format

%cd yolo_elixir
!pip install -r python/requirements.txt
!python python/yolov8_to_onnx.py n
!python python/yolov8_to_onnx.py x
%cd ..
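
Before moving on, it may help to confirm that the conversion produced the files that are loaded later from Livebook. The sketch below assumes the file naming used later in this article (`yolov8n.onnx`, `yolov8n_classes.json`, and the `x` equivalents under `/content/yolo_elixir/models`). The helper name `missing_model_files` is hypothetical.

```python
from pathlib import Path


def missing_model_files(models_dir: str, sizes: list[str]) -> list[str]:
    """Return the expected model file names that do not exist in models_dir."""
    expected = []
    for size in sizes:
        expected.append(f"yolov8{size}.onnx")          # converted model
        expected.append(f"yolov8{size}_classes.json")  # class names
    return [name for name in expected if not (Path(models_dir) / name).is_file()]


# An empty list means every expected file is present.
print(missing_model_files("/content/yolo_elixir/models", ["n", "x"]))
```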

Installing mise

!sudo apt update -y && sudo apt install -y gpg sudo wget curl
!sudo install -dm 755 /etc/apt/keyrings
!wget -qO - https://mise.jdx.dev/gpg-key.pub | gpg --dearmor | sudo tee /etc/apt/keyrings/mise-archive-keyring.gpg 1> /dev/null
!echo "deb [signed-by=/etc/apt/keyrings/mise-archive-keyring.gpg arch=amd64] https://mise.jdx.dev/deb stable main" | sudo tee /etc/apt/sources.list.d/mise.list
!sudo apt update
!sudo apt install -y mise

Setting the PATH

import os

os.environ['PATH'] = "/root/.local/share/mise/shims:/root/.mix/escripts:" + os.environ['PATH']
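
Re-running the cell above prepends the same directories again each time. That is harmless, but if you want the cell to be idempotent, a small helper along these lines could be used instead. This is only a sketch and `prepend_to_path` is a hypothetical name.

```python
import os


def prepend_to_path(current: str, new_dirs: list[str], sep: str = os.pathsep) -> str:
    """Return `current` with `new_dirs` placed first and earlier copies of them removed."""
    existing = [d for d in current.split(sep) if d and d not in new_dirs]
    return sep.join(new_dirs + existing)


dirs = ["/root/.local/share/mise/shims", "/root/.mix/escripts"]
base = os.pathsep.join(["/usr/bin", "/bin"])
once = prepend_to_path(base, dirs)
twice = prepend_to_path(once, dirs)
print(once == twice)  # True: applying it twice changes nothing
```

In the notebook it would be applied as `os.environ["PATH"] = prepend_to_path(os.environ["PATH"], dirs)`.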

Installing the language runtimes (Rust is required because ortex is used)

!mise use -g erlang@27.2
!mise use -g elixir@1.18.1-otp-27
!mise use -g rust@1.83.0

Installing Livebook

!mix local.hex --force
!mix local.rebar --force
!mix escript.install hex livebook --force
!livebook -v

Installing the ngrok CLI

!curl -s https://ngrok-agent.s3.amazonaws.com/ngrok.asc \
	| sudo tee /etc/apt/trusted.gpg.d/ngrok.asc >/dev/null \
	&& echo "deb https://ngrok-agent.s3.amazonaws.com buster main" \
	| sudo tee /etc/apt/sources.list.d/ngrok.list \
	&& sudo apt update \
	&& sudo apt install ngrok

Entering the ngrok auth token

from getpass import getpass

token = getpass()

Configuring ngrok authentication

!ngrok config add-authtoken "$token"

Creating a tunnel with ngrok

get_ipython().system_raw('ngrok http 8888 &')
!sleep 5s

Getting the public URL

!curl -s http://localhost:4040/api/tunnels | python3 -c "import sys, json; print(json.load(sys.stdin)['tunnels'][0]['public_url'])"
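
The fixed 5-second sleep is usually enough, but the tunnel can take longer to come up. As an alternative, the local ngrok API can be polled until a URL appears. The following is a minimal sketch: the function names are hypothetical, the example URL is made up, and the only assumption about the response is the `tunnels[].public_url` shape already used by the one-liner above.

```python
import json
import time
from typing import Callable, Optional


def extract_public_url(payload: str) -> Optional[str]:
    """Return the first https public_url in an /api/tunnels response, if any."""
    for tunnel in json.loads(payload).get("tunnels", []):
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url
    return None


def wait_for_public_url(fetch: Callable[[], str], attempts: int = 10, delay: float = 1.0) -> str:
    """Call `fetch` until a tunnel URL appears or `attempts` is exhausted."""
    for _ in range(attempts):
        try:
            url = extract_public_url(fetch())
        except (OSError, ValueError):
            url = None  # API not reachable yet, or the response was not valid JSON
        if url:
            return url
        time.sleep(delay)
    raise TimeoutError("ngrok tunnel did not come up in time")


# Demo with canned responses instead of a real ngrok process.
responses = iter(['{"tunnels": []}', '{"tunnels": [{"public_url": "https://example.ngrok-free.app"}]}'])
print(wait_for_public_url(lambda: next(responses), delay=0))  # https://example.ngrok-free.app
```

In Colab, `fetch` could be `lambda: urllib.request.urlopen("http://localhost:4040/api/tunnels").read().decode()` after `import urllib.request`. Passing the fetcher in as an argument keeps the polling logic testable without a running tunnel.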

Starting Livebook

!livebook server --port 8888

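Running object detection in Livebook

Open the launched Livebook at the public URL and create a new notebook.

Setup

Install the required packages.

The config and system_env options set EXLA as the default Nx backend and target CUDA, so that the GPU is used.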

Mix.install(
  [
    {:yolo, ">= 0.0.0"},
    {:yolo_fast_nms, "~> 0.1"},
    {:exla, "~> 0.9.2"},
    {:evision, "~> 0.2.0"},
    {:kino, "~> 0.14.2"}
  ],
  config: [
    nx: [default_backend: EXLA.Backend]
  ],
  system_env: [
    {"XLA_TARGET", "cuda12"},
    {"EXLA_TARGET", "cuda"},
    {"EVISION_ENABLE_CUDA", "true"},
    {"EVISION_ENABLE_CONTRIB", "true"},
    {"EVISION_CUDA_VERSION", "12"},
    {"EVISION_CUDNN_VERSION", "9"}
  ]
)

Loading the YOLOv8 model

Load the model that was converted to ONNX format.

model = YOLO.load([
  model_path: "/content/yolo_elixir/models/yolov8n.onnx", 
  classes_path: "/content/yolo_elixir/models/yolov8n_classes.json"
])

Loading an image

Create an image selection form.

image_input = Kino.Input.image("IMAGE", format: :png)

This time we use the image that appears in the YOLO Elixir guide.

Convert the image into a matrix.

image =
  image_input
  |> Kino.Input.read()
  |> Map.get(:file_ref)
  |> Kino.Input.file_path()
  |> File.read!()

mat = Evision.imdecode(image, Evision.Constant.cv_IMREAD_COLOR())

Result

Running object detection

Run object detection and convert the output into a list of detected objects.

Passing nms_fun: &YoloFastNMS.run/3 speeds up NMS (Non-Maximum Suppression).

objects =
  model
  |> YOLO.detect(mat, nms_fun: &YoloFastNMS.run/3)
  |> YOLO.to_detected_objects(model.classes)

Result

[
  %{
    class: "person",
    prob: 0.7682794332504272,
    bbox: %{h: 147, w: 69, cx: 699, cy: 579},
    class_idx: 0
  },
  %{
    class: "person",
    prob: 0.7557899951934814,
    bbox: %{h: 213, w: 81, cx: 609, cy: 777},
    class_idx: 0
  },
  %{
    class: "person",
    prob: 0.7420753836631775,
    bbox: %{h: 225, w: 90, cx: 468, cy: 849},
    class_idx: 0
  },
  ...
]
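
For intuition about what the NMS step does, here is a language-neutral sketch written in Python. It is not the YoloFastNMS implementation. It is a plain greedy NMS over boxes in the same center-based format as the result above (cx, cy, w, h). The 0.45 IoU threshold and the per-class suppression are assumptions chosen for illustration, and the sample boxes are made up.

```python
def iou(a: dict, b: dict) -> float:
    """Intersection over union of two boxes given as dicts with cx, cy, w, h."""
    ax1, ay1 = a["cx"] - a["w"] / 2, a["cy"] - a["h"] / 2
    bx1, by1 = b["cx"] - b["w"] / 2, b["cy"] - b["h"] / 2
    ax2, ay2 = ax1 + a["w"], ay1 + a["h"]
    bx2, by2 = bx1 + b["w"], by1 + b["h"]
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = a["w"] * a["h"] + b["w"] * b["h"] - inter
    return inter / union if union > 0 else 0.0


def nms(detections: list[dict], iou_threshold: float = 0.45) -> list[dict]:
    """Greedy NMS: visit boxes by descending score, drop same-class boxes that overlap a kept one."""
    kept = []
    for det in sorted(detections, key=lambda d: d["prob"], reverse=True):
        overlaps_kept = any(
            det["class_idx"] == k["class_idx"] and iou(det["bbox"], k["bbox"]) > iou_threshold
            for k in kept
        )
        if not overlaps_kept:
            kept.append(det)
    return kept


detections = [
    {"class_idx": 0, "prob": 0.9, "bbox": {"cx": 100, "cy": 100, "w": 50, "h": 100}},
    {"class_idx": 0, "prob": 0.8, "bbox": {"cx": 105, "cy": 100, "w": 50, "h": 100}},  # near-duplicate
    {"class_idx": 0, "prob": 0.7, "bbox": {"cx": 300, "cy": 100, "w": 50, "h": 100}},
]
print([d["prob"] for d in nms(detections)])  # [0.9, 0.7]
```

The second box overlaps the first with an IoU of about 0.82, so it is treated as a duplicate of the same person and dropped, while the distant third box is kept.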

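Drawing the detection results

Draw the detection results on the image.

The rectangle color is derived from the class index in an ad hoc way, so that different classes are drawn in different colors.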

draw_objects = fn mat, objects ->
  objects
  |> Enum.reduce(mat, fn %{class: class, prob: prob, bbox: bbox, class_idx: class_idx}, drawed_mat ->
    %{w: w, h: h, cx: cx, cy: cy} = bbox
    left = cx - div(w, 2)
    top = cy - div(h, 2)
    right = left + w
    bottom = top + h
  
    score = round(prob * 100) |> Integer.to_string()
  
    color = {
      case rem(class_idx, 3) do
        0 -> 0
        1 -> 128
        2 -> 255
      end,
      case rem(80 - class_idx, 4) do
        0 -> 0
        1 -> 30
        2 -> 60
        3 -> 90
      end,
      case rem(40 + class_idx, 5) do
        0 -> 255
        1 -> 196
        2 -> 128
        3 -> 64
        4 -> 0
      end
    }
  
    text = class <> ":" <> score
    font = Evision.Constant.cv_FONT_HERSHEY_SIMPLEX()
    font_scale = 1
    font_thickness = 2
    {{tw, th}, _} = Evision.getTextSize(text, font, font_scale, font_thickness)
  
    drawed_mat
    |> Evision.rectangle(
      {left, top},
      {right, bottom},
      color,
      thickness: 10
    )
    |> Evision.rectangle(
      {left - 5, top - th - 10},
      {left + tw + 5, top},
      color,
      thickness: -1
    )
    |> Evision.putText(
      text,
      {left, top - 5},
      font,
      font_scale,
      {255, 255, 255},
      thickness: font_thickness
    )
  end)
end

draw_objects.(mat, objects)

Result

The objects are detected reasonably well.

Running YOLOv8x

Next, try the more accurate YOLOv8x model.

model = YOLO.load([
  model_path: "/content/yolo_elixir/models/yolov8x.onnx", 
  classes_path: "/content/yolo_elixir/models/yolov8x_classes.json"
])

objects =
  model
  |> YOLO.detect(mat, nms_fun: &YoloFastNMS.run/3)
  |> YOLO.to_detected_objects(model.classes)

draw_objects.(mat, objects)

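Result

With YOLOv8x, object detection is nearly perfect.

Summary

With YOLO Elixir, we were able to run object detection without writing the YOLO pre-processing and post-processing ourselves.

AI processing in Elixir is becoming more and more practical.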
