
YOLOv8 notes

Last updated at 2023-07-14, posted at 2023-04-25

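Introduction

Installation notes for YOLOv8.

Requirements (as of April 2023)

CUDA==11.8
cuDNN==8.8

CUDA: confirm that the system environment variables CUDA_PATH and CUDA_PATH_V11_8 exist (the installer sets them automatically).
cuDNN: add CUDA\V11.8\bin and CUDA\V11.8\libnvvp to Path in the user environment variables (manual step).
To verify the CUDA installation, run nvcc --version at the command prompt.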
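
The checks above can also be scripted. The following is a minimal sketch, assuming Python is already installed. The helper name check_cuda_env is hypothetical, and it only looks for the two variable names mentioned above plus an nvcc executable on PATH.

```python
import os
import shutil
import subprocess


def check_cuda_env(env=None):
    """Report which expected CUDA variables are set and where nvcc is found."""
    env = os.environ if env is None else env
    expected = ("CUDA_PATH", "CUDA_PATH_V11_8")
    report = {name: env.get(name) for name in expected}
    # shutil.which returns None when nvcc cannot be found on the given PATH
    report["nvcc"] = shutil.which("nvcc", path=env.get("PATH"))
    return report


if __name__ == "__main__":
    info = check_cuda_env()
    for key, value in info.items():
        print(f"{key}: {value if value else 'NOT FOUND'}")
    if info["nvcc"]:
        # Same check as typing `nvcc --version` by hand
        done = subprocess.run([info["nvcc"], "--version"],
                              capture_output=True, text=True)
        print(done.stdout)
```

If any entry prints NOT FOUND, the corresponding environment variable or Path entry is probably missing.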

Installation steps

1. Install YOLOv8

pip install ultralytics

2. If the installed PyTorch does not support the GPU

2.1. Uninstall only PyTorch

pip uninstall torch torchvision

2.2. Reinstall PyTorch

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
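
To confirm that the reinstalled build can see the GPU, a check along the following lines may help. This is a sketch, not part of the original notes: the function name torch_cuda_status is hypothetical, while the torch calls it uses (torch.cuda.is_available, torch.version.cuda, torch.cuda.get_device_name) are standard PyTorch.

```python
def torch_cuda_status():
    """Return a small dict describing the installed PyTorch build."""
    try:
        import torch
    except ImportError:
        # PyTorch is not installed in this environment
        return {"installed": False}

    available = torch.cuda.is_available()
    return {
        "installed": True,
        # A wheel from the cu118 index typically has a version ending in "+cu118"
        "torch_version": torch.__version__,
        # None for CPU-only builds
        "built_for_cuda": torch.version.cuda,
        "cuda_available": available,
        "device_name": torch.cuda.get_device_name(0) if available else None,
    }


print(torch_cuda_status())
```

If cuda_available is False after the reinstall, the CPU-only wheel is probably still installed, or the driver and CUDA versions do not match.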

3. YOLO basics

object_detection.py

from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")
print(model.names)
print(len(model.names))

Output

Class names from training on the COCO dataset
{0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
80

object_detection.py

from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")

# Run prediction on an image
results = model.predict(source='../images/bus.jpg')

# Inspect the contents of results
print(len(results))
print(results[0])

Output

1
ultralytics.yolo.engine.results.Results object with attributes:

boxes: ultralytics.yolo.engine.results.Boxes object
keypoints: None
keys: ['boxes']
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
orig_img: array([[[122, 148, 172],
        [120, 146, 170],
        [125, 153, 177],
        ...,
        [157, 170, 184],
        [158, 171, 185],
        [158, 171, 185]],
         ...,
      
        [ 99,  89,  95],
        [ 96,  86,  92],
        [102,  92,  98]]], dtype=uint8)
orig_shape: (1080, 810)
path: 'C:\\Users\\.....\\01.Object_Detection\\..\\images\\bus.jpg'
probs: None
save_dir: None
speed: {'preprocess': 2.521991729736328, 'inference': 116.05429649353027, 'postprocess': 4.006147384643555}
image 1/1 C:\Users\.....\01.Object_Detection\..\images\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 116.1ms
Speed: 2.5ms preprocess, 116.1ms inference, 4.0ms postprocess per image at shape (1, 3, 640, 480)

boxes: Bounding box data for each detection (a Boxes object wrapping a 2D tensor).
keypoints: A list of detected keypoints for each object. Populated only for pose estimation.
masks: A 3D tensor of detection masks, where each mask is a binary image.
names: A dictionary of class names.
orig_img: The original image as a numpy array.
orig_shape: The original image shape in (height, width) format.
path: The path to the image file.
probs: A Probs object containing probabilities of each class for classification tasks.
speed: A dictionary of preprocess, inference and postprocess speeds in milliseconds per image.
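
As a small illustration of how names is used, the sketch below maps class indices to class names and counts them. It is plain Python and does not call ultralytics. The helper count_by_class is hypothetical, and the list of class indices is an assumption standing in for the values that the Boxes object reports for bus.jpg.

```python
from collections import Counter


def count_by_class(class_ids, names):
    """Map class indices to names and count how often each name occurs."""
    labels = [names[int(c)] for c in class_ids]
    return Counter(labels)


# Only the three entries needed here, copied from the names dictionary
names = {0: "person", 5: "bus", 11: "stop sign"}
# Assumed class indices for the six detections in bus.jpg
class_ids = [5.0, 0.0, 0.0, 0.0, 11.0, 0.0]

counts = count_by_class(class_ids, names)
print(dict(counts))
```

The counts should agree with the summary line in the log ("4 persons, 1 bus, 1 stop sign"). With a real result, the indices could presumably be obtained as result.boxes.cls.tolist() and the dictionary as result.names.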

Now let's look at the contents of boxes.

from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")

# Run prediction on an image
results = model.predict(source='../images/bus.jpg')

for result in results:
    print(result.boxes)

WARNING  'Boxes.boxes' is deprecated. Use 'Boxes.data' instead.
ultralytics.yolo.engine.results.Boxes object with attributes:

boxes: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02, 8.7038e-01, 5.0000e+00],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02, 8.6907e-01, 0.0000e+00],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02, 8.5361e-01, 0.0000e+00],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02, 8.1945e-01, 0.0000e+00],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02, 3.4639e-01, 1.1000e+01],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02, 3.0120e-01, 0.0000e+00]], device='cuda:0')
cls: tensor([ 5.,  0.,  0.,  0., 11.,  0.], device='cuda:0')
conf: tensor([0.8704, 0.8691, 0.8536, 0.8194, 0.3464, 0.3012], device='cuda:0')
data: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02, 8.7038e-01, 5.0000e+00],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02, 8.6907e-01, 0.0000e+00],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02, 8.5361e-01, 0.0000e+00],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02, 8.1945e-01, 0.0000e+00],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02, 3.4639e-01, 1.1000e+01],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02, 3.0120e-01, 0.0000e+00]], device='cuda:0')
id: None
is_track: False
orig_shape: (1080, 810)
shape: torch.Size([6, 6])
xywh: tensor([[409.3930, 499.5262, 784.2786, 537.8801],
        [146.6187, 650.8860, 195.7628, 503.2408],
        [740.0625, 627.9819, 139.5904, 495.4198],
        [283.0580, 631.5974, 123.3267, 451.6083],
        [ 16.1772, 289.8407,  32.2266,  70.3963],
        [ 33.5483, 712.4762,  67.0965, 322.9335]], device='cuda:0')
xywhn: tensor([[0.5054, 0.4625, 0.9682, 0.4980],
        [0.1810, 0.6027, 0.2417, 0.4660],
        [0.9137, 0.5815, 0.1723, 0.4587],
        [0.3495, 0.5848, 0.1523, 0.4182],
        [0.0200, 0.2684, 0.0398, 0.0652],
        [0.0414, 0.6597, 0.0828, 0.2990]], device='cuda:0')
xyxy: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02]], device='cuda:0')
xyxyn: tensor([[2.1301e-02, 2.1351e-01, 9.8955e-01, 7.1154e-01],
        [6.0169e-02, 3.6969e-01, 3.0185e-01, 8.3565e-01],
        [8.2749e-01, 3.5210e-01, 9.9982e-01, 8.1083e-01],
        [2.7333e-01, 3.7573e-01, 4.2558e-01, 7.9389e-01],
        [7.8869e-05, 2.3578e-01, 3.9865e-02, 3.0096e-01],
        [0.0000e+00, 5.1019e-01, 8.2835e-02, 8.0921e-01]], device='cuda:0')

boxes: the raw bounding box tensor (deprecated; use data instead).
cls: the class values of the boxes.
conf: the confidence values of the boxes.
id: the track IDs of the boxes (if available).
xywh: the boxes in xywh format (center x, center y, width, height).
xywhn: the boxes in xywh format normalized by original image size.
xyxy: the boxes in xyxy format (top-left and bottom-right corners).
xyxyn: the boxes in xyxy format normalized by original image size.
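
The relationship between these formats can be checked by hand. The sketch below is plain Python with two hypothetical helpers (xyxy_to_xywh and normalize). It uses the first xyxy row from the output (the bus) and orig_shape, and assumes that normalization divides x values by the image width and y values by the height.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center_x, center_y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, orig_shape):
    """Divide x-like values by the image width and y-like values by the height."""
    height, width = orig_shape  # orig_shape is (height, width)
    a, b, c, d = box
    return (a / width, b / height, c / width, d / height)


# First row of xyxy (the bus) and orig_shape from the printed output
bus_xyxy = (17.254, 230.59, 801.53, 768.47)
orig_shape = (1080, 810)

bus_xywh = xyxy_to_xywh(bus_xyxy)
print("xywh :", [round(v, 2) for v in bus_xywh])
print("xywhn:", [round(v, 4) for v in normalize(bus_xywh, orig_shape)])
print("xyxyn:", [round(v, 4) for v in normalize(bus_xyxy, orig_shape)])
```

The printed values should agree with the first rows of xywh, xywhn and xyxyn in the output, up to the rounding of the displayed tensors.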

Object Detection with webcam

object detection webcam.py

from ultralytics import YOLO
import cv2
import math
# https://docs.ultralytics.com/modes/predict/#plotting-results
# cap = cv2.VideoCapture('../videos/motorbikes.mp4')
cap = cv2.VideoCapture(0)

# Load a model
model = YOLO("yolov8l.pt")

while True:
    success, img = cap.read()

    if success:
        # results = model(img)  # stream true
        results = model.predict(img)
        # https://docs.ultralytics.com/modes/predict/#inference-arguments

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8 Inference", annotated_frame)

        # Press Q to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        # Stop if no frame could be read (camera unavailable or end of video)
        break

# Release the capture device and close the display window
cap.release()
cv2.destroyAllWindows()

References

Possibly the easiest-to-follow video on YOLOv8:
https://youtu.be/WgPbbWmnXJ8?t=4098
