
How to Train YOLOv8 on a Custom Dataset (Also Runs Locally)

Posted at 2024-05-04

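Introduction

In this post, we do image recognition with YOLOv8 and a custom dataset.
If you want to train locally, set up your environment by following my previous post on building a CUDA environment (PyTorch included). In this article, training is done on Google Colab. labelImg was used for annotation.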

How to use labelImg

There are many annotation tools available; this time I used labelImg.

how to install

Install with pip

pip install labelImg

Launch it by typing labelImg on the command line

labelImg

how to use

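1. Use "Open Dir" to select the folder that contains your images.
2. Use "Change Save Dir" to select where the annotation (label) files are saved.
3. If the save format is set to "PascalVOC" or "CreateML", switch it to "YOLO".
4. Right-click and choose "Create RectBox", then drag on the image to select the region of the object you want to label.
5. Once you have labeled everything you need in the image, press "Save".
6. Press "Next Image" and repeat for the next image.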
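To make the "YOLO" save format from step 3 more concrete: each line of a YOLO label file holds a class index followed by the box center and size, all normalized to the range 0 to 1. The following is a minimal sketch that converts one such line back to pixel coordinates; the function name and the sample values are made up for illustration.

```python
def yolo_line_to_pixels(line, img_w, img_h):
    """Convert one YOLO label line to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    # A YOLO line looks like: "<class_id> <x_center> <y_center> <width> <height>"
    class_id, xc, yc, w, h = line.split()
    # Values are normalized, so scale them by the image size.
    xc, w = float(xc) * img_w, float(w) * img_w
    yc, h = float(yc) * img_h, float(h) * img_h
    return int(class_id), xc - w / 2, yc - h / 2, xc + w / 2, yc + h / 2


# Hypothetical label line for a 640x480 image.
print(yolo_line_to_pixels("0 0.5 0.5 0.25 0.5", 640, 480))
```

Checking a few label files this way is a quick sanity test that the boxes you drew ended up where you expect.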

Workaround for labelImg crashing

When you press "Create RectBox" and start labeling, the application may crash with the following error.

Traceback (most recent call last):
File "c:\Users\……\labelImg.py", line 965, in scroll_request
bar.setValue(bar.value() + bar.singleStep() * units)
TypeError: setValue(self, a0: int): argument 1 has unexpected type 'float'

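A fix for this is discussed on GitHub.
The cause is a type mismatch: a float is passed where an int is expected. To fix it, add int() in four places in total in canvas.py and labelImg.py, two files that are part of the labelImg installation.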

canvas.py (line 526), from:
p.drawRect(left_top.x(), left_top.y(), rect_width, rect_height)
to:
p.drawRect(int(left_top.x()), int(left_top.y()), int(rect_width), int(rect_height))

canvas.py (line 530), from:
p.drawLine(self.prev_point.x(), 0, self.prev_point.x(), self.pixmap.height())
to:
p.drawLine(int(self.prev_point.x()), 0, int(self.prev_point.x()), int(self.pixmap.height()))

canvas.py (line 531), from:
p.drawLine(0, self.prev_point.y(), self.pixmap.width(), self.prev_point.y())
to:
p.drawLine(0, int(self.prev_point.y()), int(self.pixmap.width()), int(self.prev_point.y()))

labelImg.py (line 965), from:
bar.setValue(bar.value() + bar.singleStep() * units)
to:
bar.setValue(int(bar.value() + bar.singleStep() * units))
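To apply these changes you first need to find where pip placed the two files. The sketch below uses the standard library to look up a module's file path. The module names `libs.canvas` and `labelImg.labelImg` are an assumption about how the pip package is laid out, so if either lookup prints None, fall back to the path shown in your own traceback.

```python
import importlib.util


def find_module_file(module_name):
    """Return the file path of an installed module, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        # Raised when the parent package is missing or has no usable spec.
        return None
    return spec.origin if spec is not None else None


# Assumed module names for a pip-installed labelImg; adjust if yours differ.
for name in ("libs.canvas", "labelImg.labelImg"):
    print(name, "->", find_module_file(name))
```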

How to build the dataset

To be updated.
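While this section is pending, here is one possible way to arrange the labelImg output for training. It is a minimal sketch, not the author's procedure. It assumes the images and their YOLO .txt labels sit in the same folder, and it copies them into an images/ and labels/ tree with train and val subfolders. The folder names, the 80/20 split, and the function name are all assumptions.

```python
import random
import shutil
from pathlib import Path


def split_dataset(src_dir, out_dir, val_ratio=0.2, seed=0):
    """Copy image/label pairs from src_dir into out_dir/{images,labels}/{train,val}."""
    src, out = Path(src_dir), Path(out_dir)
    # Keep only images that have a matching label file (same name, .txt extension).
    images = sorted(
        p for p in src.iterdir()
        if p.suffix.lower() in {".jpg", ".jpeg", ".png"}
        and p.with_suffix(".txt").exists()
    )
    random.Random(seed).shuffle(images)  # fixed seed keeps the split reproducible
    n_val = int(len(images) * val_ratio)
    counts = {}
    for split, items in (("val", images[:n_val]), ("train", images[n_val:])):
        for sub in ("images", "labels"):
            (out / sub / split).mkdir(parents=True, exist_ok=True)
        for img in items:
            label = img.with_suffix(".txt")
            shutil.copy2(img, out / "images" / split / img.name)
            shutil.copy2(label, out / "labels" / split / label.name)
        counts[split] = len(items)
    return counts


# Example call (hypothetical folder names):
# split_dataset("annotated", "dataset")
```

Because only files with a matching image are copied, the classes.txt that labelImg writes alongside the labels is left out of the output tree.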

Training

Open Colab.
Install ultralytics

!pip install ultralytics

Import YOLO

from ultralytics import YOLO

Choose a model

model = YOLO('yolov8n.pt')

Upload the dataset to Colab (as a zip file)

from google.colab import files
uploaded = files.upload()

Unzip it

!unzip dataset.zip

Upload the YAML file in the same way
Run training (epochs is the number of passes over the training data)

model.train(data='dataset.yaml', epochs=300)
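The article does not show the contents of dataset.yaml. As a rough guide, an Ultralytics dataset YAML lists the dataset root, the train and val image folders, and the class names by index. The sketch below writes such a file from Python. The root path /content/dataset, the folder layout, and the class names are placeholders, not the author's values.

```python
from pathlib import Path


def write_dataset_yaml(yaml_path, dataset_root, class_names):
    """Write a minimal Ultralytics-style dataset YAML and return its text."""
    lines = [
        f"path: {dataset_root}",  # dataset root directory
        "train: images/train",    # training images, relative to path
        "val: images/val",        # validation images, relative to path
        "names:",
    ]
    # Class indices must match the indices used in the label files.
    lines += [f"  {i}: {name}" for i, name in enumerate(class_names)]
    text = "\n".join(lines) + "\n"
    Path(yaml_path).write_text(text, encoding="utf-8")
    return text


# Placeholder root path and class names for illustration only.
print(write_dataset_yaml("dataset.yaml", "/content/dataset", ["ball", "robot"]))
```

With labelImg, the class order should follow the classes.txt file it writes next to the labels, since the label files refer to classes by that index.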

Evaluate the trained model

model.val()
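model.val() returns a metrics object as well as printing a table. For a detection model, the headline numbers are available under its box attribute. A small sketch for pulling them out follows; the helper name is hypothetical, and the usage lines assume the model variable from the steps above.

```python
def summarize_metrics(metrics):
    """Pick the headline box metrics out of the object returned by model.val()."""
    return {
        "mAP50-95": float(metrics.box.map),   # mean AP averaged over IoU 0.50 to 0.95
        "mAP50": float(metrics.box.map50),    # mean AP at IoU 0.50
        "mAP75": float(metrics.box.map75),    # mean AP at IoU 0.75
    }


# Usage in the notebook:
# metrics = model.val()
# print(summarize_metrics(metrics))
```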

Try detection on a photo.

results = model('path/to/photo')

Save the result

from PIL import Image

for r in results:
  im_array = r.plot()  # plot a BGR numpy array of predictions
  im = Image.fromarray(im_array[..., ::-1])  # RGB PIL image
  im.show()  # show image
  im.save('results.jpg')  # save image
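If you want the detections as data rather than as a drawn image, each result exposes its boxes along with class indices and confidence scores. The sketch below flattens one result into plain Python rows. The helper name is made up, and it relies only on the boxes.cls, boxes.conf, boxes.xyxy, and names attributes of an Ultralytics result.

```python
def detections_to_rows(result):
    """Flatten one result into (class_name, confidence, [x1, y1, x2, y2]) rows."""
    boxes = result.boxes
    rows = []
    for cls_id, conf, xyxy in zip(
        boxes.cls.tolist(), boxes.conf.tolist(), boxes.xyxy.tolist()
    ):
        # result.names maps a class index to its label string.
        rows.append((result.names[int(cls_id)], float(conf), [float(v) for v in xyxy]))
    return rows


# Usage with the results from the detection step:
# for r in results:
#     for name, conf, box in detections_to_rows(r):
#         print(name, round(conf, 2), box)
```

Printing these rows for a misdetected image shows which class and confidence the model assigned, which can help when a wrong detection is hard to explain from the picture alone.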

Running it for real

The code below runs inference on camera frames using the trained weights, best.pt.

import cv2
from ultralytics import YOLO

# Camera device index (0 is usually the built-in camera)
camera_device = 0

# Create the VideoCapture object
cap = cv2.VideoCapture(camera_device)
# Load the trained YOLO model
model = YOLO("../suitei/best.pt")

# Check that the camera opened correctly
if not cap.isOpened():
    print("Error: could not open the camera.")
    exit()

# Capture frames continuously, run inference, and display the result
while True:
    # Grab one frame
    ret, frame = cap.read()

    # Stop if the frame could not be read
    if not ret:
        print("Error: failed to grab a frame.")
        break

    # Run inference and draw the detections onto the frame
    result = model.predict(source=frame)
    img_annotated = result[0].plot()

    # Show the annotated frame
    cv2.imshow("Camera", img_annotated)

    # Exit the loop when 'q' is pressed
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break

# Release the camera and close the window
cap.release()
cv2.destroyAllWindows()

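Result image

Conclusion

This time I trained a model on a dataset of the objects I wanted to recognize and ran it for real. Compared with the HSV-based color recognition from my previous post, it was far easier to improve recognition accuracy. For reference, about 400 images were used for training. Accuracy improved considerably, but it was hard to tell why the remaining misdetections happened. I feel it is good enough to use in Robocon (robot contests).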

