Using YOLO11 on Windows

Last updated at 2024-10-28, posted at 2024-10-22

Notes on YOLO11, from a first test run through training.

1 Installing Python (Python 3.12.6, win64)

Choose the Windows installer build. This time I picked the following version.

https://www.python.org/downloads/windows/
Python 3.12.6 - Sept. 6, 2024
Note that Python 3.12.6 cannot be used on Windows 7 or earlier.

Download Windows installer (64-bit)

Choose the custom installation and change only the install location.

Specify the following.
C:\python\python312

2 Creating a virtual environment

Open a Command Prompt (terminal) in C:\python\python312.

Create a virtual environment. The environment name is yolo11.

python -m venv yolo11

Running the following activates the virtual environment. Confirm that (yolo11) is shown in the prompt.
C:\python\python312\yolo11\Scripts\activate
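Besides looking for (yolo11) in the prompt, the activation can be checked from Python itself. The following is a minimal sketch using only the standard library; the function name in_virtualenv is made up for this note.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # Inside a venv, sys.prefix points at the environment folder,
    # while sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print("virtual environment active:", in_virtualenv())
print("interpreter:", sys.executable)
```

With the environment above active, the interpreter path should sit under C:\python\python312\yolo11.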

Install the libraries inside the virtual environment.
Official: https://github.com/ultralytics/ultralytics

pip install ultralytics
This alone installs everything that is needed.
Checking that it works

yolo predict model=yolo11n.pt source='https://ultralytics.com/images/bus.jpg'

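Checking with pip list shows that the installed PyTorch is not the GPU-enabled build, so replace it if needed. It is best to keep the same torch version and pick the build that matches your own CUDA version. Uninstall torch together with torchvision; otherwise pip may treat the existing CPU build as already satisfying the requested version and skip the reinstall.

pip uninstall torch torchvision
Official: https://pytorch.org/get-started/previous-versions/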

pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 --index-url https://download.pytorch.org/whl/cu121

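Check again with pip list; if the torch version carries a suffix such as +cu121, the GPU build is installed.

In addition, to run the official sample programs:
pip install transformers sahi
and so on.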
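The same check can be done from Python. This is a small sketch, not an official procedure: cuda_tag and report_torch_build are names invented here, and the regular expression assumes the suffix has the form +cu followed by digits, as in the pip list output described above.

```python
import re


def cuda_tag(version):
    """Return the CUDA tag (e.g. 'cu121') of a torch version string, or None."""
    match = re.search(r"\+(cu\d+)$", version)
    return match.group(1) if match else None


def report_torch_build():
    """Print the installed torch build and whether a GPU is usable."""
    import torch  # imported lazily so cuda_tag() also works without torch

    print("torch version:", torch.__version__)
    print("CUDA build tag:", cuda_tag(torch.__version__))
    print("CUDA available:", torch.cuda.is_available())
```

Calling report_torch_build() inside the yolo11 environment should print a CUDA tag and CUDA available: True when the GPU build and the driver are both in place.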

3 Trying it in a project folder

mkdir c:\yolo11

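Open the project folder in VS Code.
Write the same thing that was run from the Command Prompt earlier, this time in Python.
Do not forget to switch the interpreter to the virtual environment.

Inference on a single still image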

test.py

from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")

# Perform object detection on an image
results = model("bus.jpg")
results[0].show()

Inference on webcam video

test_webcam.py

import cv2
from ultralytics import YOLO

# Load a model
model = YOLO("yolo11n.pt")

# Open the webcam
cap = cv2.VideoCapture(0)

while cap.isOpened():
    ret, frame = cap.read()
    if not ret:
        break

    # Run YOLO on the frame and show the annotated result
    results = model(frame, show=True)

    # Press q in the preview window to stop
    if cv2.waitKey(1) & 0xFF == ord("q"):
        break

cap.release()
cv2.destroyAllWindows()

Training on your own data

Model sizes: n < s < m < l < x

my_train_win.py


from ultralytics import YOLO
from multiprocessing import freeze_support

# The __main__ guard is required on Windows, where worker processes are spawned
if __name__ == '__main__':
    freeze_support()  # only takes effect in a frozen executable; harmless otherwise

    model = YOLO("yolo11n.pt")  # load a pretrained model (recommended for training)

    # device=0 selects the GPU; remove it when training on the CPU
    model.train(data="C:/yolo11/datasets/my_datasets.yaml", epochs=100, imgsz=640, device=0)

The dataset is configured in a YAML file.
Arrange the data as follows.

 C:/yolo11/datasets
     ├── my_datasets
     │      ├── images
     │      │      ├── train
     │      │      └── val
     │      └── labels
     │             ├── train
     │             └── val
     └── my_datasets.yaml

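images holds the annotated image files.
labels holds the annotations as text files.
Each label file has the same name as its image, with the extension txt.
Each line contains the class ID, the x and y center position, and the width and height, in that order, all expressed as ratios of the image size.
Label files may be present in both train and val; only the ones whose images exist in the corresponding image folder are used.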
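Before training it can be useful to list images that have no label file of the same name. The sketch below assumes the images/labels layout described above; the function name find_unlabeled_images and the set of image extensions are choices made for this note, not something taken from ultralytics.

```python
from pathlib import Path

# Extensions treated as images in this sketch (an assumption, extend as needed)
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}


def find_unlabeled_images(dataset_root, split):
    """Return names of images in images/<split> without labels/<split>/<stem>.txt."""
    root = Path(dataset_root)
    image_dir = root / "images" / split
    label_dir = root / "labels" / split
    missing = []
    for image_path in sorted(image_dir.iterdir()):
        if image_path.suffix.lower() not in IMAGE_SUFFIXES:
            continue
        if not (label_dir / (image_path.stem + ".txt")).exists():
            missing.append(image_path.name)
    return missing
```

For example, find_unlabeled_images("C:/yolo11/datasets/my_datasets", "train") returns an empty list when every training image has a matching txt file.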

image1.txt

1 0.765038 0.217826 0.173684 0.2
2 0.63609 0.236087 0.013534 0.026957
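To make the ratio format concrete, here is a small sketch that converts one label line back to pixel coordinates. The function name yolo_to_pixels is made up for this note, and the image size passed in is whatever the actual image measures.

```python
def yolo_to_pixels(line, image_width, image_height):
    """Convert 'class xc yc w h' (ratios) to (class_id, x_min, y_min, x_max, y_max) in pixels."""
    class_id, xc, yc, w, h = line.split()
    xc, yc, w, h = float(xc), float(yc), float(w), float(h)
    # The center and size are stored as fractions of the image width and height
    x_min = (xc - w / 2) * image_width
    y_min = (yc - h / 2) * image_height
    x_max = (xc + w / 2) * image_width
    y_max = (yc + h / 2) * image_height
    return int(class_id), x_min, y_min, x_max, y_max


# A box centered in a 100x200 image, covering half of each dimension
print(yolo_to_pixels("1 0.5 0.5 0.5 0.5", 100, 200))
# (1, 25.0, 50.0, 75.0, 150.0)
```

The first value on each line is the index used under names in the YAML file, so 1 in image1.txt refers to class 1 there.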

my_datasets.yaml


# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]
path: C:/yolo11/datasets/my_datasets  # dataset root dir
train: images/train  # train images (relative to 'path') 
val: images/val  # val images (relative to 'path')
test:  # test images (optional)

# Classes
names:
  0: person
  1: cat
  2: dog
  3: chair

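With this, training works for now.
The training results are saved automatically in the folder runs.
The trained weights are saved in the weights folder inside it, so replace yolo11n.pt in the inference program with the full path to best.pt.

There are many sample programs in the official repository.
https://github.com/ultralytics/ultralytics
Download and extract the full official code, or git clone it.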
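Since each training run gets its own subfolder, looking up the newest best.pt by hand gets tedious. The sketch below searches runs for any weights/best.pt and returns the most recently modified one. The helper name latest_best_weights is hypothetical, and the search pattern is an assumption based on the weights folder mentioned above.

```python
from pathlib import Path


def latest_best_weights(runs_dir="runs"):
    """Return the most recently modified weights/best.pt under runs_dir, or None."""
    root = Path(runs_dir)
    if not root.is_dir():
        return None
    candidates = list(root.glob("**/weights/best.pt"))
    if not candidates:
        return None
    # Pick the file with the newest modification time
    return max(candidates, key=lambda path: path.stat().st_mtime)
```

In the inference program, the result can then be passed on as YOLO(str(latest_best_weights("C:/yolo11/runs"))), assuming training was started from C:/yolo11.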

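4 Data augmentation and hyperparameter settings for training

During training, augmentations such as scaling (scale=, specified from 0 to 1, default 0.5) are applied automatically. To additionally apply rotation or mixup, or to change the learning rate, refer to https://github.com/ultralytics/ultralytics/blob/main/ultralytics/cfg/default.yaml.
Add arguments such as mixup=0.5, flipud=0.5, lr0=1E-5 to the model.train call.

Below are the settings that caught my eye.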
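One way to keep such overrides readable is to collect them in a dictionary and unpack it into model.train. This is only a sketch: build_train_args and train are names invented here, and the values are the ones quoted above rather than recommendations.

```python
def build_train_args(data_yaml):
    """Collect the training arguments, including the extra augmentation settings."""
    return {
        "data": data_yaml,
        "epochs": 100,
        "imgsz": 640,
        "mixup": 0.5,   # mixup augmentation probability
        "flipud": 0.5,  # vertical flip probability
        "lr0": 1e-5,    # initial learning rate
    }


def train(data_yaml="C:/yolo11/datasets/my_datasets.yaml"):
    """Train yolo11n with the overrides from build_train_args()."""
    from ultralytics import YOLO  # imported lazily; needs the ultralytics package

    model = YOLO("yolo11n.pt")
    return model.train(**build_train_args(data_yaml))
```

On Windows, call train() from inside an if __name__ == '__main__': block, as in my_train_win.py.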

Train settings

imgsz: 640 # (int | list) input images size as int for train and val modes, or list[h,w] for predict and export modes
batch: 16 # (int) number of images per batch (-1 for AutoBatch)
multi_scale: False # (bool) Whether to use multiscale during training

Increasing the image size increases memory usage. To avoid out-of-memory errors, reduce the batch size or set it to AutoBatch (batch=-1).
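As a rough starting point when raising imgsz by hand, the batch size can be scaled down by the ratio of pixel counts. This is a rule of thumb assumed for this note, not how AutoBatch works; scaled_batch is a made-up helper, and actual memory use depends on the model and GPU.

```python
def scaled_batch(base_batch, base_imgsz, new_imgsz):
    """Scale the batch size by the pixel-count ratio between two square image sizes."""
    # Assumption: memory per image grows roughly with the number of pixels
    ratio = (base_imgsz / new_imgsz) ** 2
    return max(1, int(base_batch * ratio))


# Going from 640 to 1280 quadruples the pixel count, so the batch drops to a quarter
print(scaled_batch(16, 640, 1280))  # 4
```

If the resulting batch still does not fit, batch=-1 lets ultralytics pick a size automatically.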

Val/Test settings

half: False # (bool) use half precision (FP16)

Predict settings

classes: # (int | list[int], optional) filter results by class, i.e. classes=0, or classes=[0,2,3]

Visualize settings

show: False # (bool) show predicted images and videos if environment allows
show_boxes: True # (bool) show prediction boxes
line_width: # (int, optional) line width of the bounding boxes. Scaled to image size if None.

Hyperparameters

scale: 0.5 # (float) image scale (+/- gain)

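Handling large input images

The default of 640 pixels is sometimes too small. If there is memory to spare during training, imgsz above can be set to a larger value.
As another approach, the official repository has an example that uses SAHI, a library that splits an image into tiles for inference.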
https://github.com/ultralytics/ultralytics/tree/main/examples/YOLOv8-SAHI-Inference-Video
https://docs.ultralytics.com/ja/guides/sahi-tiled-inference/
Where the sample code has model_type="yolov8", leaving it unchanged also works with YOLO11.
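To illustrate the idea behind tiled inference, the sketch below computes overlapping tile windows for a large image. It is not SAHI's implementation or API; tile_windows and its parameters are invented for this note, and the 640 tile size and 0.2 overlap are only example values.

```python
def tile_windows(image_width, image_height, tile_size=640, overlap=0.2):
    """Return (x_min, y_min, x_max, y_max) windows that cover the whole image."""
    step = max(1, int(tile_size * (1 - overlap)))

    def starts(length):
        # One tile is enough when the image is not larger than the tile
        if length <= tile_size:
            return [0]
        positions = list(range(0, length - tile_size, step))
        positions.append(length - tile_size)  # last tile sits flush with the edge
        return positions

    return [
        (x, y, min(x + tile_size, image_width), min(y + tile_size, image_height))
        for y in starts(image_height)
        for x in starts(image_width)
    ]


# A 1280x640 image without overlap splits into two side-by-side tiles
print(tile_windows(1280, 640, tile_size=640, overlap=0.0))
# [(0, 0, 640, 640), (640, 0, 1280, 640)]
```

Each window would be cropped, run through the model, and the detections shifted back by the window offset; SAHI additionally merges duplicate boxes in the overlapping regions.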
