
Measuring YOLO

Posted at 2022-10-10

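Introduction

Many YOLO models have been published for object detection. The YOLOv7 paper (hereafter "the paper") lists the parameter count, computational cost (FLOPs), FPS (frames per second), and accuracy of the representative YOLO models in one place, which makes it possible to compare YOLO variants with each other.
I tried to reproduce this kind of comparison on my own setup and summarize the results here. By comparing the values I measured (hereafter "measured values") with the values reported in the paper (hereafter "paper values"), I want to check whether the measurement procedure is sound.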

Measuring YOLO

I now measure, in order, inference time (FPS), accuracy, computational cost, and parameter count.

Models compared

For every model in Table 2 of the paper except PPYOLOE, I measure FPS (the reciprocal of inference time), accuracy, computational cost (FLOPs), and parameter count.

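Measuring inference time (FPS)

Inference-time measurement is described in detail in the web article "YOLOv7はYOLOシリーズで最強か【第二版】". In short, I ran each model 10 times on a V100 on Colab Pro and took the average. The comparison plot puts paper values on the X axis and measured values on the Y axis. A point on the dashed line means that the paper value and the measured value agree. Here, some models fall on the dashed line and others do not.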

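To interpret this result, I ran the YOLOv7 model 100 times on Colab Pro (GPU: Tesla T4) and plotted the timings, with the trial number on the X axis and the inference time in milliseconds on the Y axis. The first few runs are slow, and after that the timings are relatively stable. Even so, values still fluctuate after the 50th run, and the behavior looks different from one execution to the next. This measurement was taken on a Tesla T4, but the V100 timings may have fluctuated in the same way, which is probably why some models do not yield a good value even after averaging 10 runs. The same tendency appeared with the other YOLO models, not only YOLOv7, so it appears to be a problem of measuring in a cloud environment rather than a property of YOLO.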

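When measuring performance in a cloud environment such as Colab Pro, you therefore need to allow for this fluctuation, for example by running a suitable number of iterations and taking the minimum.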
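The idea of discarding the slow first runs and summarizing with the minimum can be sketched as follows. This is a minimal illustration, not the code used for the measurements in this article: the function names, the warm-up count of 5, and the run count of 50 are all assumptions, and dummy_workload is a hypothetical stand-in for a real model call.

```python
import statistics
import time


def measure_latency_ms(fn, warmup=5, runs=50):
    """Call fn() repeatedly and return the recorded latencies in milliseconds."""
    for _ in range(warmup):
        fn()  # warm-up runs are executed but not recorded
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples):
    """Return min / median / mean; min and median are less affected by spikes."""
    return {
        "min_ms": min(samples),
        "median_ms": statistics.median(samples),
        "mean_ms": statistics.fmean(samples),
    }


def dummy_workload():
    # Hypothetical stand-in for model(img); replace with real inference.
    sum(i * i for i in range(10_000))


if __name__ == "__main__":
    stats = summarize(measure_latency_ms(dummy_workload))
    fps = 1000.0 / stats["min_ms"]  # FPS is the reciprocal of the latency
    print(stats, f"FPS (from min) = {fps:.1f}")
```

When the timed function is a PyTorch model on a GPU, CUDA kernels run asynchronously, so torch.cuda.synchronize() should be called before reading the clock at both ends of the timed region.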

Measuring accuracy

For every YOLO, I measure AP (Average Precision) on val2017 of the MS COCO dataset, with --conf fixed at 0.001 and --iou fixed at 0.65 for all models. For YOLOv7, for example, the command is:

!python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7.pt --name yolov7_640_val

The web article "YOLOv7はYOLOシリーズで最強か【第二版】" compared inference speed across combinations of FP16 and fusion, because each YOLO handles them differently. I have confirmed that FP16 and fusion have almost no effect on accuracy, however, so here each model is built in the default way for its YOLO.

The results are then printed as a formatted summary. For the comparison across YOLO models, I use the numbers inside the red box.

Comparing the measured accuracy of each YOLO with the paper values shows that YOLOv7-tiny and the YOLOR series are off the dashed line, that is, their results differ from the paper values.

The paper states that YOLOv7-tiny uses SiLU as its activation function, but the pretrained model used here uses Leaky ReLU. This may account for the difference in accuracy.
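To make the difference between the two activation functions concrete, here is a minimal sketch in plain Python. The negative slope of 0.1 for Leaky ReLU is an assumption for illustration and is not taken from the paper or from the pretrained model's configuration.

```python
import math


def silu(x):
    """SiLU (also called swish): x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def leaky_relu(x, negative_slope=0.1):
    """Leaky ReLU: identity for x >= 0, a small linear slope for x < 0.

    negative_slope=0.1 is an assumed value for this illustration.
    """
    return x if x >= 0 else negative_slope * x


for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  SiLU={silu(x):+.4f}  LeakyReLU={leaky_relu(x):+.4f}")
```

The two functions agree at zero and behave similarly for large positive inputs, but they differ for small and negative inputs, so a model trained with one cannot be expected to reproduce the accuracy of a model trained with the other.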

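I could not reproduce the paper's accuracy for the YOLOR series, but the YOLOR GitHub (for the larger models, the YOLOR paper GitHub) lists values that match the measured values exactly. The paper may have used weights different from those on the YOLOR GitHub.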

YOLOR model	Paper value	YOLOR GitHub	Measured value
YOLOR-CSP	50.8%	49.2%	49.2%
YOLOR-CSP-X	52.7%	51.1%	51.1%
YOLOR-P6	53.5%	52.5%	52.5%
YOLOR-W6	54.8%	54.0%	54.0%
YOLOR-E6	55.7%	54.6%	54.6%
YOLOR-D6	56.1%	55.4%	55.4%
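The size of the gap in the table above can be computed directly. The sketch below only copies the numbers from the table; the dictionary and function names are illustrative.

```python
# AP values (%) copied from the table above: (paper, YOLOR GitHub, measured)
yolor_ap = {
    "YOLOR-CSP": (50.8, 49.2, 49.2),
    "YOLOR-CSP-X": (52.7, 51.1, 51.1),
    "YOLOR-P6": (53.5, 52.5, 52.5),
    "YOLOR-W6": (54.8, 54.0, 54.0),
    "YOLOR-E6": (55.7, 54.6, 54.6),
    "YOLOR-D6": (56.1, 55.4, 55.4),
}


def ap_gaps(table):
    """Return {model: paper AP - measured AP}, rounded to one decimal place."""
    return {
        name: round(paper - measured, 1)
        for name, (paper, _github, measured) in table.items()
    }


gaps = ap_gaps(yolor_ap)
for name, gap in gaps.items():
    print(f"{name:<12} paper - measured = {gap:+.1f} points")
```

For every model the paper value is higher than the measured value, by between 0.7 and 1.6 points, while the YOLOR GitHub value and the measured value are identical.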

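Measuring computational cost and parameter count

Computational cost and parameter count are measured with profile from the thop package. Passing the model and a dummy input image tensor to profile returns the computational cost (flops) and the parameter count (params), which are then rescaled to giga and mega units respectively. The flops value is multiplied by 2 here; the reason is explained in a separate article (in short, thop reports multiply-accumulate operations, and one multiply-accumulate is conventionally counted as two floating-point operations). See "Reference: code" for the actual code.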

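# Compute FLOPs and parameter count
import torch
from copy import deepcopy
from thop import profile
from models.experimental import attempt_load
from utils.torch_utils import select_device

device = select_device('')
weights, img_sz = 'yolov7.pt', 640
model = attempt_load(weights, map_location=device)
# Dummy input on the same device as the model
input = torch.zeros((1, 3, img_sz, img_sz), device=next(model.parameters()).device)
flops, params = profile(deepcopy(model), inputs=(input, ))
flops = flops / 1E9 * 2  # multiply-accumulates -> GFLOPs
params /= 1E6  # -> millions of parameters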
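The factor of 2 and the giga scaling can be checked by hand on a single layer. The sketch below uses one common way of counting multiply-accumulates for a bias-free convolution; it is not claimed to be the exact formula thop uses internally, and the layer shape is a hypothetical example.

```python
def conv2d_macs(c_in, c_out, kernel, h_out, w_out):
    """Multiply-accumulates of one bias-free 2D convolution with groups=1.

    Each output element needs c_in * kernel * kernel multiply-accumulates.
    """
    return c_in * kernel * kernel * c_out * h_out * w_out


def macs_to_gflops(macs):
    """Count one multiply-accumulate as two floating-point operations."""
    return macs * 2 / 1e9


# Hypothetical layer: 3x3 convolution, 3 -> 32 channels, stride 2 on a
# 640x640 input, which gives a 320x320 output.
macs = conv2d_macs(c_in=3, c_out=32, kernel=3, h_out=320, w_out=320)
print(f"MACs = {macs}, GFLOPs = {macs_to_gflops(macs)}")
```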

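Applying this thop-based procedure to each target YOLO model and comparing with the paper values gives the following result.

Only YOLOv7-D6 is off the line. Its accuracy and FPS agree with the measured values, so this may be an error in the paper.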

Measurement results

Finally, the values measured with the procedures above are tabulated side by side with the paper values.

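Summary

This confirms a way to measure YOLO in terms of FPS, accuracy, computational cost, and parameter count. It also showed that FPS (or inference time) needs care when measured in a cloud environment.

I ran into several problems along the way. The code below includes the workarounds for them.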

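Reference: code

This section lists the code for installation, accuracy measurement, and computational cost / parameter count measurement for each YOLO.
FPS measurement is described in detail in the web article "YOLOv7はYOLOシリーズで最強か【第二版】", so please refer to it.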

YOLOv7

Installation

import os
os.chdir('/content')

!git clone https://github.com/WongKinYiu/yolov7.git > /dev/null
os.chdir('yolov7')
!pip3 install -U pip && pip3 install -r requirements.txt > /dev/null

Accuracy measurement

Run it as follows. Set --img to 640 or 1280 to match the model. --conf 0.001 --iou 0.65 is common to all YOLO models.

!python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --weights yolov7-tiny.pt --name yolov7-tiny_640_val

This produces output like the following.

Namespace(augment=False, batch_size=32, conf_thres=0.001, data='data/coco.yaml', device='0', exist_ok=False, img_size=640, iou_thres=0.65, name='yolov7-tiny_640_val', no_trace=False, project='runs/test', save_conf=False, save_hybrid=False, save_json=True, save_txt=False, single_cls=False, task='val', v5_metric=False, verbose=False, weights=['yolov7-tiny.pt'])
YOLOR 🚀 v0.1-115-g072f76c torch 1.12.1+cu113 CUDA:0 (Tesla T4, 15109.75MB)

Downloading https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-tiny.pt to yolov7-tiny.pt...
100% 12.1M/12.1M [00:00<00:00, 27.8MB/s]

Fusing layers... 
Model Summary: 200 layers, 6219709 parameters, 6219709 gradients
 Convert model to Traced-model... 
/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:1083: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:477.)
  return self._grad
 traced_script_module saved! 
 model is traced! 


WARNING: Dataset not found, nonexistent paths: ['/content/yolov7/coco/val2017.txt']
Downloading bash ./scripts/get_coco.sh ...
Downloading https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels-segments.zip  ...
Downloading http://images.cocodataset.org/zips/train2017.zip ...
Downloading http://images.cocodataset.org/zips/val2017.zip ...
Downloading http://images.cocodataset.org/zips/test2017.zip ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
(snip)
100  168M  100  168M    0     0  23.2M      0  0:00:07  0:00:07 --:--:-- 21.5M
100  777M  100  777M    0     0  43.0M      0  0:00:18  0:00:18 --:--:-- 44.1M
100 6339M  100 6339M    0     0  42.2M      0  0:02:30  0:02:30 --:--:-- 31.6M
100 18.0G  100 18.0G    0     0  37.1M      0  0:08:16  0:08:16 --:--:-- 47.3M
Dataset autodownload success

/usr/local/lib/python3.7/dist-packages/torch/functional.py:478: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at  ../aten/src/ATen/native/TensorShape.cpp:2894.)
  return _VF.meshgrid(tensors, **kwargs)  # type: ignore[attr-defined]
val: Scanning 'coco/val2017' images and labels... 4952 found, 48 missing, 0 empty, 0 corrupted: 100% 5000/5000 [00:12<00:00, 408.92it/s]
val: New cache created: coco/val2017.cache
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 157/157 [01:09<00:00,  2.27it/s]
                 all        5000       36335       0.646        0.51       0.545        0.36
Speed: 2.5/1.7/4.2 ms inference/NMS/total per 640x640 image at batch-size 32

Evaluating pycocotools mAP... saving runs/test/yolov7-tiny_640_val/yolov7-tiny_predictions.json...
loading annotations into memory...
Done (t=0.95s)
creating index...
index created!
Loading and preparing results...
DONE (t=6.05s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=74.28s).
Accumulating evaluation results...
DONE (t=13.78s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.374
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.552
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.403
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.191
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.418
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.526
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.311
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.519
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.571
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.356
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.632
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.745
/usr/local/lib/python3.7/dist-packages/torch/_tensor.py:1083: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the .grad field to be populated for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations. (Triggered internally at  aten/src/ATen/core/TensorBody.h:477.)
  return self._grad
Results saved to runs/test/yolov7-tiny_640_val
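Collecting the AP values from many such logs by hand is error-prone, so they can be extracted with a regular expression. The sketch below assumes that the value of interest is the first pycocotools line (AP at IoU=0.50:0.95, area=all); the function name and the sample text, which copies three lines from the log above, are for illustration only.

```python
import re

# Matches the "Average Precision" lines printed by pycocotools.
AP_LINE = re.compile(
    r"Average Precision\s+\(AP\) @\[ IoU=([0-9.:]+)\s*\| area=\s*(\w+) "
    r"\| maxDets=\s*(\d+) \] = ([0-9.]+)"
)


def parse_ap(log_text):
    """Return {(iou, area): ap} for every Average Precision line in the log."""
    return {
        (iou, area): float(value)
        for iou, area, _max_dets, value in AP_LINE.findall(log_text)
    }


sample = """\
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.374
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.552
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.191
"""

ap = parse_ap(sample)
print("AP   =", ap[("0.50:0.95", "all")])
print("AP50 =", ap[("0.50", "all")])
```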

Computational cost and parameter count measurement

import torch
from thop import profile
from models.experimental import attempt_load
from utils.torch_utils import select_device
from copy import deepcopy
import numpy as np
import os

weights_list = ['yolov7-tiny.pt', 
                'yolov7.pt',
                'yolov7x.pt',
                'yolov7-w6.pt',
                'yolov7-e6.pt',
                'yolov7-d6.pt',
                'yolov7-e6e.pt']

img_sz_list = [640, 640, 640, 1280, 1280, 1280, 1280]

device = select_device('')
for weights, img_sz in zip(weights_list, img_sz_list):

    model_name = weights.split('.')[0]

    model = attempt_load(weights, map_location=device)
    # Dummy input on the same device as the model
    input = torch.zeros((1, 3, img_sz, img_sz), device=next(model.parameters()).device)
    flops, params = profile(deepcopy(model), inputs=(input, ))
    flops = flops / 1E9 * 2  # multiply-accumulates -> GFLOPs
    params /= 1E6  # -> millions of parameters
    print(f'{model_name}: flops = {flops}, params = {params}')

YOLOv5

Installation

import os
os.chdir('/content')

!git clone https://github.com/ultralytics/yolov5  # clone
os.chdir('yolov5')
!pip install -r requirements.txt  # install

Accuracy measurement

Run it as follows. Set --img to 640 or 1280 to match the model. --conf defaults to 0.001, so it is omitted. --iou 0.65 is common to all YOLO models.

!python val.py --weights yolov5n.pt --data coco.yaml --img 640 --iou 0.65
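Because the image size depends on the model, the commands for all YOLOv5 models can be generated instead of typed. The sketch below pairs the weight files and image sizes that appear in the YOLOv5 measurement code in this article; the helper name build_val_command is illustrative, and the generated strings still have to be run in a notebook cell or shell.

```python
# (weights, image size) pairs, taken from the YOLOv5 lists in this article
YOLOV5_MODELS = [
    ("yolov5n.pt", 640), ("yolov5s.pt", 640), ("yolov5m.pt", 640),
    ("yolov5l.pt", 640), ("yolov5x.pt", 640),
    ("yolov5n6.pt", 1280), ("yolov5s6.pt", 1280), ("yolov5m6.pt", 1280),
    ("yolov5l6.pt", 1280), ("yolov5x6.pt", 1280),
]


def build_val_command(weights, img_size, iou=0.65):
    """Return the val.py command line for one model; --conf is left at its default."""
    return (
        f"python val.py --weights {weights} --data coco.yaml "
        f"--img {img_size} --iou {iou}"
    )


for weights, img_size in YOLOV5_MODELS:
    print(build_val_command(weights, img_size))
```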

Computational cost and parameter count measurement

import torch
from thop import profile
from utils.torch_utils import select_device
from copy import deepcopy
from models.yolo import DetectionModel
import yaml
import numpy as np
import os

weights_list = ['yolov5n.pt', 'yolov5s.pt', 'yolov5m.pt', 'yolov5l.pt', 'yolov5x.pt', 
                'yolov5n6.pt', 'yolov5s6.pt', 'yolov5m6.pt', 'yolov5l6.pt', 'yolov5x6.pt', ]

img_sz_list = [640, 640, 640, 640, 640, 1280, 1280, 1280, 1280, 1280]

device = select_device('')
for weights, img_sz in zip(weights_list, img_sz_list):

    model_name = weights.split('.')[0]

    # Models whose names end in '6' keep their configs under models/hub/
    addon_path = 'hub/' if model_name[-1] == '6' else ''
    yaml_path = '/content/yolov5/models/' + addon_path + model_name + '.yaml'
    with open(yaml_path, 'r') as yml:
        config = yaml.safe_load(yml)

    # Build the architecture from its config; the weights themselves are not loaded
    model = DetectionModel(config)

    input = torch.zeros((1, 3, img_sz, img_sz), device=next(model.parameters()).device)
    flops, params = profile(deepcopy(model), inputs=(input, ))
    flops = flops / 1E9 * 2  # multiply-accumulates -> GFLOPs
    params /= 1E6  # -> millions of parameters
    print(f'{model_name}: flops = {flops}, params = {params}')

YOLOR (yolor_csp, yolor_csp_x, yolor_p6)

Installation

import os
os.chdir('/content')

!git clone https://github.com/WongKinYiu/yolor
os.chdir('yolor')
!sed -i '30s/^pycocotools/# pycocotools/' requirements.txt

# pip install required packages
!pip install -qr requirements.txt

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
!git clone https://github.com/JunnYu/mish-cuda
os.chdir('mish-cuda')
!python setup.py build install
os.chdir('..')

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
!git clone https://github.com/fbcotter/pytorch_wavelets
os.chdir('pytorch_wavelets')
!pip install .
os.chdir('..')

The sed command above comments out pycocotools in requirements.txt. On Colab, pycocotools is already at 2.0.5 (as of October 10, 2022). Installing from that line would downgrade it to 2.0.0, and a problem inherent in that version causes errors in later steps. The edit avoids this.
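The sed command addresses the pycocotools entry by line number (line 30), which only works while the file layout stays the same. As an alternative sketch, the same edit can be done in Python by matching the package name instead. The function name and the sample requirements text below are hypothetical and are not taken from the YOLOR repository.

```python
def comment_out_requirement(requirements_text, package):
    """Prefix every line that starts with `package` with '# '.

    Lines that are already commented out do not start with the package
    name, so applying the function twice changes nothing further.
    """
    out = []
    for line in requirements_text.splitlines():
        if line.startswith(package):
            line = "# " + line
        out.append(line)
    return "\n".join(out) + "\n"


# Hypothetical requirements content, for illustration only
sample = "numpy>=1.18.5\npycocotools==2.0\ntqdm>=4.41.0\n"
print(comment_out_requirement(sample, "pycocotools"))
```

To apply it to the real file, read requirements.txt, pass its content through the function, and write the result back before running pip install.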

Getting the pretrained models

The pretrained models for image size 640 (yolor_csp.pt, yolor_csp_star.pt) and, among the pretrained models for image size 1280, yolor_p6.pt can be downloaded from the YOLOR GitHub.

Accuracy measurement

Before running the test, run the following to avoid problems in later steps.

!sed -i '97s/for pred in o:/for pred in o.cpu().numpy():/' utils/plots.py
!sed -i '1adownload: bash ./scripts/get_coco.sh' data/coco.yaml

Run the test as follows. Set --img to 640 or 1280 to match the model. --conf 0.001 --iou 0.65 is common to all YOLO models.

!python test.py --data data/coco.yaml --img 640 --batch 32 --conf 0.001 --iou 0.65 --device 0 --cfg cfg/yolor_csp.cfg --weights yolor_csp.pt --name yolor_csp_val

This produces output like the following.

Namespace(augment=False, batch_size=32, cfg='cfg/yolor_csp.cfg', conf_thres=0.001, data='data/coco.yaml', device='0', exist_ok=False, img_size=640, iou_thres=0.65, name='yolor_csp_val', names='data/coco.names', project='runs/test', save_conf=False, save_json=True, save_txt=False, single_cls=False, task='val', verbose=False, weights=['yolor_csp.pt'])
Using torch 1.7.0 CUDA:0 (Tesla T4, 15109MB)

Model Summary: 529 layers, 52923994 parameters, 52923994 gradients, 121.027788800 GFLOPS

WARNING: Dataset not found, nonexistent paths: ['/content/coco/val2017.txt']
Downloading bash ./scripts/get_coco.sh ...
Downloading https://github.com/ultralytics/yolov5/releases/download/v1.0/coco2017labels.zip  ...
Downloading http://images.cocodataset.org/zips/train2017.zip ...
Downloading http://images.cocodataset.org/zips/val2017.zip ...
Downloading http://images.cocodataset.org/zips/test2017.zip ...
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
(snip)
100 46.3M  100 46.3M    0     0  19.2M      0  0:00:02  0:00:02 --:--:-- 23.9M
100  777M  100  777M    0     0  37.1M      0  0:00:20  0:00:20 --:--:-- 45.3M
100 6339M  100 6339M    0     0  36.2M      0  0:02:54  0:02:54 --:--:-- 46.4M
100 18.0G  100 18.0G    0     0  44.6M      0  0:06:52  0:06:52 --:--:-- 47.1M
Dataset autodownload success

Scanning images: 100% 5000/5000 [00:10<00:00, 489.22it/s]
Scanning labels ../coco/labels/val2017.cache3 (4952 found, 0 missing, 48 empty, 0 duplicate, for 5000 images): 5000it [00:00, 14444.56it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 157/157 [01:39<00:00,  1.58it/s]
                 all       5e+03    3.63e+04       0.434       0.748       0.671       0.481
Speed: 11.9/1.9/13.8 ms inference/NMS/total per 640x640 image at batch-size 32

Evaluating pycocotools mAP... saving runs/test/yolor_csp_val2/yolor_csp_predictions.json...
loading annotations into memory...
Done (t=0.90s)
creating index...
index created!
Loading and preparing results...
DONE (t=5.16s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=68.45s).
Accumulating evaluation results...
DONE (t=10.85s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.492
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.676
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.537
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.329
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.544
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.630
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.376
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.618
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.672
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.508
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.727
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.810
Results saved to runs/test/yolor_csp_val2

Computational cost and parameter count measurement

import torch
from thop import profile
from copy import deepcopy
from models.models import *
import os
import numpy as np

weights_list = [['yolor_csp.pt'],
                ['yolor_csp_x.pt'],
                ['yolor_p6.pt']]

cfg_list = ['cfg/yolor_csp.cfg',
            'cfg/yolor_csp_x.cfg',
            'cfg/yolor_p6.cfg',
            ]

img_sz_list = [640, 640, 1280]

for weights, cfg, img_sz in zip(weights_list, cfg_list, img_sz_list):
    model_name = weights[0].split('.')[0]

    with torch.no_grad():
        # Build the architecture from its cfg; the weights file name is used only as a label
        model = Darknet(cfg, img_sz)

        input = torch.zeros((1, 3, img_sz, img_sz), device=next(model.parameters()).device)
        flops, params = profile(deepcopy(model), inputs=(input, ))
        flops = flops / 1E9 * 2  # multiply-accumulates -> GFLOPs
        params /= 1E6  # -> millions of parameters
        print(f'{model_name}: flops = {flops}, params = {params}')

YOLOR (yolor-w6, yolor-e6, yolor-d6)

Installation

import os
import shutil
os.chdir('/content')
if os.path.isdir('/content/yolor'): shutil.rmtree('/content/yolor')

!git clone -b paper https://github.com/WongKinYiu/yolor
os.chdir('yolor')
!sed -i '30s/^pycocotools/# pycocotools/' requirements.txt

# pip install required packages
!pip install -qr requirements.txt

# install mish-cuda if you want to use mish activation
# https://github.com/thomasbrandon/mish-cuda
# https://github.com/JunnYu/mish-cuda
!git clone https://github.com/JunnYu/mish-cuda
os.chdir('mish-cuda')
!python setup.py build install
os.chdir('..')

# install pytorch_wavelets if you want to use dwt down-sampling module
# https://github.com/fbcotter/pytorch_wavelets
!git clone https://github.com/fbcotter/pytorch_wavelets
os.chdir('pytorch_wavelets')
!pip install .
os.chdir('..')

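pycocotools in requirements.txt is handled in the same way as for YOLOR (yolor_csp, yolor_csp_x, yolor_p6).

Getting the pretrained models

The pretrained models for yolor-w6, yolor-e6, and yolor-d6 can be downloaded from the YOLOR paper GitHub.

Accuracy measurement

Before running the test, run the following to avoid problems in later steps.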

!sed -i '97s/for pred in o:/for pred in o.cpu().numpy():/' utils/plots.py
!sed -i '1adownload: bash ./scripts/get_coco.sh' data/coco.yaml

Run the test as follows, with --img set to 1280. --conf 0.001 --iou 0.65 is common to all YOLO models.
On the Tesla T4 I used, this ran out of CUDA memory, which I avoided by setting --batch to 16.

!python test.py --img 1280 --conf 0.001 --iou 0.65 --batch 16 --device 0 --data data/coco.yaml --weights yolor-w6.pt --name yolor-w6_val

Computational cost and parameter count measurement

import torch
from thop import profile
from copy import deepcopy
from models.experimental import attempt_load
from utils.torch_utils import select_device
import os
import numpy as np

weights_list = [['yolor-w6.pt'],
                ['yolor-e6.pt'],
                ['yolor-d6.pt']]

img_sz_list = [1280, 1280, 1280]
device = select_device('')

for weights, img_sz in zip(weights_list, img_sz_list):
    model_name = weights[0].split('.')[0]

    with torch.no_grad():

        # Load model
        model = attempt_load(weights, map_location=device)

        input = torch.zeros((1, 3, img_sz, img_sz), device=next(model.parameters()).device)
        flops, params = profile(deepcopy(model), inputs=(input, ))
        flops = flops / 1E9 * 2  # multiply-accumulates -> GFLOPs
        params /= 1E6  # -> millions of parameters
        print(f'{model_name}: flops = {flops}, params = {params}')

YOLOX

Installation

import os
os.chdir('/content')

!git clone https://github.com/Megvii-BaseDetection/YOLOX.git > /dev/null
os.chdir('YOLOX')
!pip3 install -U pip && pip3 install -r requirements.txt > /dev/null
!pip3 install -v -e . > /dev/null # or  python3 setup.py develop

!pip3 install cython; pip3 install git+https://github.com/cocodataset/cocoapi.git

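Getting the pretrained models

The Benchmark section of the YOLOX GitHub has a table of models. Get the pretrained model for each model from the weights column. The following downloads the pretrained model for yolox_s.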

!wget https://github.com/Megvii-BaseDetection/YOLOX/releases/download/0.1.1rc0/yolox_s.pth

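Accuracy measurement

The MS COCO dataset must be installed beforehand. The earlier YOLOR run installed it at /content/coco, so I reuse it through a symbolic link. If you install it yourself, note that I could not find a script in the YOLOX GitHub like the ones the other YOLO repositories provide. One option is to copy and run scripts/get_coco.sh from the YOLOv7 GitHub.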

!ln -s /content/coco ./datasets/COCO

Adjust the dataset path that YOLOX expects so that it matches the installed layout.

!sed -i '191s/self.data_dir, self.name/self.data_dir, "images", self.name/' yolox/data/datasets/coco.py

Run the test as follows. In YOLOX, the IoU threshold is specified with --nms, so the command becomes:

!python -m yolox.tools.eval -n yolox-s -c yolox_s.pth -b 64 --conf 0.001 --nms 0.65 --fp16 --fuse --devices 1

This produces output like the following.

(snip)
2022-10-10 21:30:48 | INFO     | yolox.data.datasets.coco:64 - loading annotations into memory...
2022-10-10 21:30:49 | INFO     | yolox.data.datasets.coco:64 - Done (t=0.58s)
2022-10-10 21:30:49 | INFO     | pycocotools.coco:86 - creating index...
2022-10-10 21:30:49 | INFO     | pycocotools.coco:86 - index created!
2022-10-10 21:30:54 | INFO     | __main__:165 - loading checkpoint from yolox_s.pth
2022-10-10 21:30:54 | INFO     | __main__:169 - loaded checkpoint done.
2022-10-10 21:30:54 | INFO     | __main__:175 - 	Fusing model...
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py:390: UserWarning: The .grad attribute of a Tensor that is not a leaf Tensor is being accessed. Its .grad attribute won't be populated during autograd.backward(). If you indeed want the gradient for a non-leaf Tensor, use .retain_grad() on the non-leaf Tensor. If you access the non-leaf Tensor by mistake, make sure you access the leaf Tensor instead. See github.com/pytorch/pytorch/pull/30531 for more informations.
  if param.grad is not None:
100%|##########| 79/79 [01:27<00:00,  1.11s/it]
2022-10-10 21:32:21 | INFO     | yolox.evaluators.coco_evaluator:256 - Evaluate in main process...
2022-10-10 21:32:31 | INFO     | yolox.evaluators.coco_evaluator:289 - Loading and preparing results...
2022-10-10 21:32:34 | INFO     | yolox.evaluators.coco_evaluator:289 - DONE (t=3.39s)
2022-10-10 21:32:34 | INFO     | pycocotools.coco:366 - creating index...
2022-10-10 21:32:35 | INFO     | pycocotools.coco:366 - index created!
Running per image evaluation...
Evaluate annotation type *bbox*
COCOeval_opt.evaluate() finished in 13.68 seconds.
Accumulating evaluation results...
COCOeval_opt.accumulate() finished in 2.06 seconds.
2022-10-10 21:32:53 | INFO     | __main__:196 - 
Average forward time: 4.66 ms, Average NMS time: 1.67 ms, Average inference time: 6.33 ms
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.404
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.593
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.437
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.232
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.448
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.541
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.326
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.531
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.574
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.365
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.634
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.723
per class AP:
(snip)
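The timing line in this output can be converted to FPS by taking the reciprocal. The sketch below is only an illustration of that arithmetic on the log line above: the function name is hypothetical, and because this evaluation ran with a batch size of 64, the resulting figures are not comparable to the single-image FPS discussed in the inference-time section.

```python
import re

# Matches the timing summary line printed by the YOLOX evaluation above.
TIMING = re.compile(
    r"Average forward time: ([0-9.]+) ms, "
    r"Average NMS time: ([0-9.]+) ms, "
    r"Average inference time: ([0-9.]+) ms"
)


def yolox_fps(log_text):
    """Return FPS derived from the forward time and from the total time."""
    match = TIMING.search(log_text)
    if match is None:
        return None
    forward_ms, _nms_ms, total_ms = map(float, match.groups())
    return {
        "forward_fps": 1000.0 / forward_ms,  # model forward pass only
        "total_fps": 1000.0 / total_ms,      # forward pass plus NMS
    }


line = (
    "Average forward time: 4.66 ms, Average NMS time: 1.67 ms, "
    "Average inference time: 6.33 ms"
)
result = yolox_fps(line)
print(f"forward: {result['forward_fps']:.1f} FPS, total: {result['total_fps']:.1f} FPS")
```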

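Computational cost and parameter count measurement

Because the model is built differently from the other YOLOs, I adapted tools/eval.py. Run the following code to create measure.py.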

txt = '''
#!/usr/bin/env python3
# -*- coding:utf-8 -*-
# Copyright (c) Megvii, Inc. and its affiliates.

import argparse
import os
import random
import warnings
from loguru import logger
from thop import profile
import numpy as np
from copy import deepcopy

import torch
import torch.backends.cudnn as cudnn
from torch.nn.parallel import DistributedDataParallel as DDP

from yolox.core import launch
from yolox.exp import get_exp
from yolox.utils import (
    configure_module,
    configure_nccl,
    fuse_model,
    get_local_rank,
    get_model_info,
    setup_logger
)


def make_parser():
    parser = argparse.ArgumentParser("YOLOX Eval")
    parser.add_argument("-expn", "--experiment-name", type=str, default=None)
    parser.add_argument("-n", "--name", type=str, default=None, help="model name")

    # distributed
    parser.add_argument(
        "--dist-backend", default="nccl", type=str, help="distributed backend"
    )
    parser.add_argument(
        "--dist-url",
        default=None,
        type=str,
        help="url used to set up distributed training",
    )
    parser.add_argument("-b", "--batch-size", type=int, default=1, help="batch size")
    parser.add_argument(
        "-d", "--devices", default=1, type=int, help="device for training"
    )
    parser.add_argument(
        "--num_machines", default=1, type=int, help="num of node for training"
    )
    parser.add_argument(
        "--machine_rank", default=0, type=int, help="node rank for multi-node training"
    )
    parser.add_argument(
        "-f",
        "--exp_file",
        default=None,
        type=str,
        help="please input your experiment description file",
    )
    parser.add_argument("-c", "--ckpt", default=None, type=str, help="ckpt for eval")
    parser.add_argument("--conf", default=None, type=float, help="test conf")
    parser.add_argument("--nms", default=None, type=float, help="test nms threshold")
    parser.add_argument("--tsize", default=None, type=int, help="test img size")
    parser.add_argument("--seed", default=None, type=int, help="eval seed")
    parser.add_argument(
        "--fp16",
        dest="fp16",
        default=False,
        action="store_true",
        help="Adopting mix precision evaluating.",
    )
    parser.add_argument(
        "--fuse",
        dest="fuse",
        default=False,
        action="store_true",
        help="Fuse conv and bn for testing.",
    )
    parser.add_argument(
        "--trt",
        dest="trt",
        default=False,
        action="store_true",
        help="Using TensorRT model for testing.",
    )
    parser.add_argument(
        "--legacy",
        dest="legacy",
        default=False,
        action="store_true",
        help="To be compatible with older versions",
    )
    parser.add_argument(
        "--test",
        dest="test",
        default=False,
        action="store_true",
        help="Evaluating on test-dev set.",
    )
    parser.add_argument(
        "--speed",
        dest="speed",
        default=False,
        action="store_true",
        help="speed test only.",
    )
    parser.add_argument(
        "opts",
        help="Modify config options using the command-line",
        default=None,
        nargs=argparse.REMAINDER,
    )
    return parser


@logger.catch
def main(exp, args, num_gpu):
    if args.seed is not None:
        random.seed(args.seed)
        torch.manual_seed(args.seed)
        cudnn.deterministic = True
        warnings.warn(
            "You have chosen to seed testing. This will turn on the CUDNN deterministic setting, "
        )

    is_distributed = num_gpu > 1

    # set environment variables for distributed training
    configure_nccl()
    cudnn.benchmark = True

    rank = get_local_rank()

    file_name = os.path.join(exp.output_dir, args.experiment_name)

    if rank == 0:
        os.makedirs(file_name, exist_ok=True)

    setup_logger(file_name, distributed_rank=rank, filename="val_log.txt", mode="a")
    logger.info("Args: {}".format(args))

    if args.conf is not None:
        exp.test_conf = args.conf
    if args.nms is not None:
        exp.nmsthre = args.nms
    if args.tsize is not None:
        exp.test_size = (args.tsize, args.tsize)

    model = exp.get_model()
    logger.info("Model Summary: {}".format(get_model_info(model, exp.test_size)))

    img = torch.zeros((1, 3, 640, 640), device=next(model.parameters()).device)
    flops, params = profile(deepcopy(model), inputs=(img,), verbose=False)
    params /= 1e6
    flops /= 1e9
    flops *= 2  # Gflops
    print(f'{args.experiment_name}: flops = {flops}, params = {params}')

if __name__ == "__main__":
    configure_module()
    args = make_parser().parse_args()
    exp = get_exp(args.exp_file, args.name)
    exp.merge(args.opts)

    if not args.experiment_name:
        args.experiment_name = exp.exp_name

    num_gpu = torch.cuda.device_count() if args.devices is None else args.devices
    assert num_gpu <= torch.cuda.device_count()

    dist_url = "auto" if args.dist_url is None else args.dist_url
    launch(
        main,
        num_gpu,
        args.num_machines,
        args.machine_rank,
        backend=args.dist_backend,
        dist_url=dist_url,
        args=(exp, args, num_gpu),
    )
'''
with open('measure.py', mode='w') as fd:
    fd.write(txt)

Using the measure.py created above, run it with the model name and weights specified as follows.

!python measure.py -n yolox-s -c yolox_s.pth
