
Notes on YOLOv8

Posted at 2023-04-25

Introduction

Notes on installing YOLOv8.

Requirements (as of April 2023)

CUDA==11.8
CuDNN==8.8

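  • CUDA: confirm that the system environment variables CUDA_PATH and CUDA_PATH_V11_8 exist (the installer sets them automatically)
  • CuDNN: add CUDA\V11.8\bin and CUDA\V11.8\libnvvp to the user environment variable Path (manually)
  • To verify the CUDA installation, run nvcc --version at the command prompt (c:>nvcc --version)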
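The checks in the list above can also be scripted. The following is a minimal sketch using only the Python standard library; the helper name check_cuda_env is made up for this example, and it only reports what it finds without changing any setting.

```python
import os
import shutil
import subprocess


def check_cuda_env():
    """Collect the CUDA-related settings mentioned above into a dict."""
    info = {
        # Set by the CUDA installer on Windows; None if the variable is absent.
        "CUDA_PATH": os.environ.get("CUDA_PATH"),
        "CUDA_PATH_V11_8": os.environ.get("CUDA_PATH_V11_8"),
        # Full path of nvcc if it can be found on Path, else None.
        "nvcc": shutil.which("nvcc"),
        "nvcc_version": None,
    }
    if info["nvcc"]:
        # Equivalent to typing `nvcc --version` at the prompt.
        proc = subprocess.run(
            [info["nvcc"], "--version"], capture_output=True, text=True
        )
        info["nvcc_version"] = proc.stdout.strip() or None
    return info


for key, value in check_cuda_env().items():
    print(f"{key}: {value}")
```

If nvcc is reported as None, the CUDA bin directory is probably missing from Path.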

Installation steps

  1. Install YOLOv8
pip install ultralytics

2. If the GPU is not being used

2.1. Uninstall only PyTorch

pip uninstall torch torchvision

2.2. Reinstall PyTorch

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
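To check whether the reinstalled PyTorch actually sees the GPU, something like the following sketch may help. The function name gpu_status is made up for this example; the torch calls used (torch.cuda.is_available, torch.version.cuda, torch.cuda.get_device_name) are standard PyTorch APIs.

```python
import importlib.util


def gpu_status():
    """Return a one-line description of PyTorch's CUDA support."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if not torch.cuda.is_available():
        return f"torch {torch.__version__}: CUDA not available (CPU only)"
    # torch.version.cuda is the CUDA version the wheel was built against.
    return (
        f"torch {torch.__version__}: CUDA {torch.version.cuda} "
        f"on {torch.cuda.get_device_name(0)}"
    )


print(gpu_status())
```

If this still reports CPU only after step 2.2, the installed wheel is most likely a CPU build, or the NVIDIA driver is not visible to PyTorch.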

3. YOLO basics

object_detection.py
from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")
print(model.names)
print(len(model.names))

Output
Class names of the model trained on the COCO dataset
{0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
80

object_detection.py
from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")

# Run prediction with the model
results = model.predict(source='../images/bus.jpg')

# Contents of results
print(len(results))
print(results[0])


Output

1
ultralytics.yolo.engine.results.Results object with attributes:

boxes: ultralytics.yolo.engine.results.Boxes object
keypoints: None
keys: ['boxes']
masks: None
names: {0: 'person', 1: 'bicycle', 2: 'car', 3: 'motorcycle', 4: 'airplane', 5: 'bus', 6: 'train', 7: 'truck', 8: 'boat', 9: 'traffic light', 10: 'fire hydrant', 11: 'stop sign', 12: 'parking meter', 13: 'bench', 14: 'bird', 15: 'cat', 16: 'dog', 17: 'horse', 18: 'sheep', 19: 'cow', 20: 'elephant', 21: 'bear', 22: 'zebra', 23: 'giraffe', 24: 'backpack', 25: 'umbrella', 26: 'handbag', 27: 'tie', 28: 'suitcase', 29: 'frisbee', 30: 'skis', 31: 'snowboard', 32: 'sports ball', 33: 'kite', 34: 'baseball bat', 35: 'baseball glove', 36: 'skateboard', 37: 'surfboard', 38: 'tennis racket', 39: 'bottle', 40: 'wine glass', 41: 'cup', 42: 'fork', 43: 'knife', 44: 'spoon', 45: 'bowl', 46: 'banana', 47: 'apple', 48: 'sandwich', 49: 'orange', 50: 'broccoli', 51: 'carrot', 52: 'hot dog', 53: 'pizza', 54: 'donut', 55: 'cake', 56: 'chair', 57: 'couch', 58: 'potted plant', 59: 'bed', 60: 'dining table', 61: 'toilet', 62: 'tv', 63: 'laptop', 64: 'mouse', 65: 'remote', 66: 'keyboard', 67: 'cell phone', 68: 'microwave', 69: 'oven', 70: 'toaster', 71: 'sink', 72: 'refrigerator', 73: 'book', 74: 'clock', 75: 'vase', 76: 'scissors', 77: 'teddy bear', 78: 'hair drier', 79: 'toothbrush'}
orig_img: array([[[122, 148, 172],
        [120, 146, 170],
        [125, 153, 177],
        ...,
        [157, 170, 184],
        [158, 171, 185],
        [158, 171, 185]],
         ...,
      
        [ 99,  89,  95],
        [ 96,  86,  92],
        [102,  92,  98]]], dtype=uint8)
orig_shape: (1080, 810)
path: 'C:\\Users\\.....\\01.Object_Detection\\..\\images\\bus.jpg'
probs: None
save_dir: None
speed: {'preprocess': 2.521991729736328, 'inference': 116.05429649353027, 'postprocess': 4.006147384643555}
image 1/1 C:\Users\.....\01.Object_Detection\..\images\bus.jpg: 640x480 4 persons, 1 bus, 1 stop sign, 116.1ms
Speed: 2.5ms preprocess, 116.1ms inference, 4.0ms postprocess per image at shape (1, 3, 640, 480)

  1. boxes: A 2D tensor of bounding box coordinates for each detection.
  2. keypoints: A list of detected keypoints for each object. (Only populated for pose estimation.)
  3. masks: A 3D tensor of detection masks, where each mask is a binary image.
  4. names: A dictionary of class names.
  5. orig_img: The original image as a numpy array.
  6. orig_shape: The original image shape in (height, width) format.
  7. path: The path to the image file.
  8. probs: A Probs object containing probabilities of each class for the classification task.
  9. speed: A dictionary of preprocess, inference and postprocess speeds in milliseconds per image.
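As a small illustration of how names relates to the detections, the summary line "4 persons, 1 bus, 1 stop sign" can be reproduced by counting class ids. This is only a sketch: the class ids are hardcoded from the boxes.cls values of this same run, names is cut down to the three classes involved, and summarize is a made-up helper, not part of ultralytics.

```python
from collections import Counter

# Subset of the names dictionary printed above (class id -> class name).
names = {0: "person", 5: "bus", 11: "stop sign"}

# Class ids of the six detections in bus.jpg, copied from this run.
cls_ids = [5.0, 0.0, 0.0, 0.0, 11.0, 0.0]


def summarize(cls_ids, names):
    """Count detections per class name."""
    # The ids are stored as floats in the tensor, so cast before the lookup.
    return dict(Counter(names[int(c)] for c in cls_ids))


counts = summarize(cls_ids, names)
print(counts)
```

With a live result, the same function could presumably be fed result.boxes.cls.tolist() and result.names.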

Now let's look at the contents of boxes.

from ultralytics import YOLO
import cv2

# Load a model
model = YOLO("yolov8n.pt")

# Run prediction with the model
results = model.predict(source='../images/bus.jpg')

for result in results:
    print(result.boxes)

Output

WARNING  'Boxes.boxes' is deprecated. Use 'Boxes.data' instead.
ultralytics.yolo.engine.results.Boxes object with attributes:

boxes: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02, 8.7038e-01, 5.0000e+00],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02, 8.6907e-01, 0.0000e+00],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02, 8.5361e-01, 0.0000e+00],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02, 8.1945e-01, 0.0000e+00],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02, 3.4639e-01, 1.1000e+01],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02, 3.0120e-01, 0.0000e+00]], device='cuda:0')
cls: tensor([ 5.,  0.,  0.,  0., 11.,  0.], device='cuda:0')
conf: tensor([0.8704, 0.8691, 0.8536, 0.8194, 0.3464, 0.3012], device='cuda:0')
data: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02, 8.7038e-01, 5.0000e+00],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02, 8.6907e-01, 0.0000e+00],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02, 8.5361e-01, 0.0000e+00],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02, 8.1945e-01, 0.0000e+00],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02, 3.4639e-01, 1.1000e+01],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02, 3.0120e-01, 0.0000e+00]], device='cuda:0')
id: None
is_track: False
orig_shape: (1080, 810)
shape: torch.Size([6, 6])
xywh: tensor([[409.3930, 499.5262, 784.2786, 537.8801],
        [146.6187, 650.8860, 195.7628, 503.2408],
        [740.0625, 627.9819, 139.5904, 495.4198],
        [283.0580, 631.5974, 123.3267, 451.6083],
        [ 16.1772, 289.8407,  32.2266,  70.3963],
        [ 33.5483, 712.4762,  67.0965, 322.9335]], device='cuda:0')
xywhn: tensor([[0.5054, 0.4625, 0.9682, 0.4980],
        [0.1810, 0.6027, 0.2417, 0.4660],
        [0.9137, 0.5815, 0.1723, 0.4587],
        [0.3495, 0.5848, 0.1523, 0.4182],
        [0.0200, 0.2684, 0.0398, 0.0652],
        [0.0414, 0.6597, 0.0828, 0.2990]], device='cuda:0')
xyxy: tensor([[1.7254e+01, 2.3059e+02, 8.0153e+02, 7.6847e+02],
        [4.8737e+01, 3.9927e+02, 2.4450e+02, 9.0251e+02],
        [6.7027e+02, 3.8027e+02, 8.0986e+02, 8.7569e+02],
        [2.2139e+02, 4.0579e+02, 3.4472e+02, 8.5740e+02],
        [6.3884e-02, 2.5464e+02, 3.2290e+01, 3.2504e+02],
        [0.0000e+00, 5.5101e+02, 6.7097e+01, 8.7394e+02]], device='cuda:0')
xyxyn: tensor([[2.1301e-02, 2.1351e-01, 9.8955e-01, 7.1154e-01],
        [6.0169e-02, 3.6969e-01, 3.0185e-01, 8.3565e-01],
        [8.2749e-01, 3.5210e-01, 9.9982e-01, 8.1083e-01],
        [2.7333e-01, 3.7573e-01, 4.2558e-01, 7.9389e-01],
        [7.8869e-05, 2.3578e-01, 3.9865e-02, 3.0096e-01],
        [0.0000e+00, 5.1019e-01, 8.2835e-02, 8.0921e-01]], device='cuda:0')

  1. boxes: the raw bounding box tensor (deprecated; use data instead). Each row is [x1, y1, x2, y2, conf, cls].
  2. cls: the class values of the boxes.
  3. conf: the confidence values of the boxes.
  4. id: the track IDs of the boxes (if available).
  5. xywh: the boxes in xywh format.
  6. xywhn: the boxes in xywh format normalized by original image size.
  7. xyxy: the boxes in xyxy format.
  8. xyxyn: the boxes in xyxy format normalized by original image size.
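The relation between these formats can be checked by hand. The sketch below uses plain Python with the first detection (the bus) hardcoded from the output above; xyxy_to_xywh and normalize are hypothetical helpers written for this example, not ultralytics functions. Judging from the printed values, xywh holds the box center plus width and height.

```python
def xyxy_to_xywh(box):
    """Convert (x1, y1, x2, y2) corners to (center x, center y, width, height)."""
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2, (y1 + y2) / 2, x2 - x1, y2 - y1)


def normalize(box, orig_shape):
    """Divide x-like values by the image width and y-like values by the height."""
    height, width = orig_shape  # orig_shape is (height, width)
    a, b, c, d = box
    return (a / width, b / height, c / width, d / height)


# First detection (the bus) and orig_shape, copied from the output above.
bus_xyxy = (17.254, 230.59, 801.53, 768.47)
orig_shape = (1080, 810)

bus_xywh = xyxy_to_xywh(bus_xyxy)
bus_xywhn = normalize(bus_xywh, orig_shape)
print(bus_xywh)   # compare with the first row of xywh
print(bus_xywhn)  # compare with the first row of xywhn
```

Up to rounding of the printed inputs, the results agree with the first rows of xywh and xywhn in the output.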

Object Detection with webcam

object detection webcam.py
from ultralytics import YOLO
import cv2

# Plotting results: https://docs.ultralytics.com/modes/predict/#plotting-results
# To read from a video file instead of the webcam:
# cap = cv2.VideoCapture('../videos/motorbikes.mp4')
cap = cv2.VideoCapture(0)

# Load a model
model = YOLO("yolov8l.pt")

while True:
    # Read one frame; success is False when no frame could be read
    success, img = cap.read()

    if success:
        # Run inference on the frame (model(img) is an equivalent call)
        # Inference arguments: https://docs.ultralytics.com/modes/predict/#inference-arguments
        results = model.predict(img)

        # Visualize the results on the frame
        annotated_frame = results[0].plot()

        # Display the annotated frame
        cv2.imshow("YOLOv8 Inference", annotated_frame)

        # Press Q to exit
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
    else:
        # Break the loop if the end of the video is reached
        break

# Release the capture device and close the display window
cap.release()
cv2.destroyAllWindows()

References

This may be the easiest video to follow on YOLOv8.
https://youtu.be/WgPbbWmnXJ8?t=4098
