Trying out YOLO (You Only Look Once) on a Mac

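I've wanted to get YOLO running for ages!!!

I just rebuilt my badly wrecked Python environment.
Setting Python up again from scratch went reeeeeally smoothly,
so I decided to try YOLO.
I had no idea Python could be this painless!!!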

Reference:
http://ai-coordinator.jp/yolov2-tensorflow-python

Setup

pip install --upgrade opencv-python
pip install --upgrade tensorflow

Rather than using the original darknet implementation directly, I use darkflow, a port of it to TensorFlow.
That means everything can be written in Python.

git clone https://github.com/thtrieu/darkflow.git
cd darkflow
python setup.py build_ext --inplace

To start with, I'll just run the samples.

First, a sample image

Download the trained weights and place them under darkflow/.
https://drive.google.com/drive/folders/0B1tW_VtY7onidEwyQ2FtQVplWEU
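Before running the sample, it can help to confirm that the files it loads are where the script expects them. The following is a minimal sketch using only the standard library. `missing_files` is a hypothetical helper, not part of darkflow, and the paths are the ones the sample program uses, relative to the darkflow/ directory.

```python
from pathlib import Path


def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]


# Files the sample program expects, relative to the darkflow/ directory
required = ["cfg/yolo.cfg", "yolo.weights", "sample_img/sample_dog.jpg"]

for path in missing_files(required):
    print("missing:", path)
```

Run from inside darkflow/, this prints nothing when all three files are in place.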

Sample program

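from darkflow.net.build import TFNet
import cv2

# Model definition, weights file, and confidence threshold for detections
options = {"model": "cfg/yolo.cfg", "load": "yolo.weights", "threshold": 0.1}

# Build the network and load the weights
tfnet = TFNet(options)

# Read the sample image with OpenCV and run detection on it
imgcv = cv2.imread("./sample_img/sample_dog.jpg")
result = tfnet.return_predict(imgcv)
print(result)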

Running it gave the following result.

[{'label': 'bicycle', 'confidence': 0.8448341, 'topleft': {'x': 81, 'y': 114}, 'bottomright': {'x': 553, 'y': 466}}, {'label': 'truck', 'confidence': 0.79511166, 'topleft': {'x': 462, 'y': 81}, 'bottomright': {'x': 693, 'y': 167}}, {'label': 'motorbike', 'confidence': 0.27550778, 'topleft': {'x': 59, 'y': 76}, 'bottomright': {'x': 114, 'y': 124}}, {'label': 'cat', 'confidence': 0.12677637, 'topleft': {'x': 139, 'y': 197}, 'bottomright': {'x': 314, 'y': 551}}, {'label': 'dog', 'confidence': 0.7696115, 'topleft': {'x': 136, 'y': 214}, 'bottomright': {'x': 322, 'y': 539}}]
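With the threshold set to 0.1, the list includes weak detections such as the cat at 0.13. As a rough sketch of post-processing this output, the snippet below keeps only confident detections and computes each box size. `filter_detections` and `box_size` are hypothetical helpers written for this illustration, and the 0.6 cutoff is the same value the webcam script uses.

```python
def filter_detections(detections, min_confidence=0.6):
    """Keep detections whose confidence exceeds min_confidence, best first."""
    kept = [d for d in detections if d["confidence"] > min_confidence]
    return sorted(kept, key=lambda d: d["confidence"], reverse=True)


def box_size(detection):
    """Return (width, height) of a detection's bounding box in pixels."""
    tl, br = detection["topleft"], detection["bottomright"]
    return br["x"] - tl["x"], br["y"] - tl["y"]


# The result printed by the sample program
result = [
    {'label': 'bicycle', 'confidence': 0.8448341,
     'topleft': {'x': 81, 'y': 114}, 'bottomright': {'x': 553, 'y': 466}},
    {'label': 'truck', 'confidence': 0.79511166,
     'topleft': {'x': 462, 'y': 81}, 'bottomright': {'x': 693, 'y': 167}},
    {'label': 'motorbike', 'confidence': 0.27550778,
     'topleft': {'x': 59, 'y': 76}, 'bottomright': {'x': 114, 'y': 124}},
    {'label': 'cat', 'confidence': 0.12677637,
     'topleft': {'x': 139, 'y': 197}, 'bottomright': {'x': 314, 'y': 551}},
    {'label': 'dog', 'confidence': 0.7696115,
     'topleft': {'x': 136, 'y': 214}, 'bottomright': {'x': 322, 'y': 539}},
]

for det in filter_detections(result):
    width, height = box_size(det)
    print("%-8s %.2f  %dx%d" % (det["label"], det["confidence"], width, height))
```

For this result, only the bicycle, truck, and dog remain.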

Next, the webcam.

from darkflow.net.build import TFNet
import cv2
import numpy as np

# Same weights file as in the first sample (placed directly under darkflow/)
options = {"model": "cfg/yolo.cfg", "load": "yolo.weights", "threshold": 0.1}
tfnet = TFNet(options)

# Open the default camera
cap = cv2.VideoCapture(0)

# The 20 Pascal VOC class names. The model can also return labels outside
# this list (the sample output above includes 'truck').
class_names = ['aeroplane', 'bicycle', 'bird', 'boat', 'bottle',
               'bus', 'car', 'cat', 'chair', 'cow', 'diningtable',
               'dog', 'horse', 'motorbike', 'person', 'pottedplant',
               'sheep', 'sofa', 'train', 'tvmonitor']

# One BGR color per class, with hues spread evenly around the color wheel
num_classes = len(class_names)
class_colors = []
for i in range(num_classes):
    # OpenCV's 8-bit hue range is 0-179
    hue = int(180 * i / num_classes)
    col = np.zeros((1, 1, 3), dtype="uint8")
    col[0][0] = (hue, 128, 255)
    cvcol = cv2.cvtColor(col, cv2.COLOR_HSV2BGR)
    class_colors.append(tuple(int(c) for c in cvcol[0][0]))

def main():

    while True:

        # Grab a frame from the video stream
        ret, frame = cap.read()
        if not ret:
            # Stop if the camera returned no frame
            break
        result = tfnet.return_predict(frame)

        for item in result:
            tlx = item['topleft']['x']
            tly = item['topleft']['y']
            brx = item['bottomright']['x']
            bry = item['bottomright']['y']
            label = item['label']
            conf = item['confidence']

            if conf > 0.6:

                # Use the class color, or white for labels not in class_names
                if label in class_names:
                    color = class_colors[class_names.index(label)]
                else:
                    color = (255, 255, 255)

                # Draw the bounding box
                cv2.rectangle(frame, (tlx, tly), (brx, bry), color, 2)

                # Draw the label on a filled background
                text = label + " " + ('%.2f' % conf)
                cv2.rectangle(frame, (tlx, tly - 15), (tlx + 100, tly + 5), color, -1)
                cv2.putText(frame, text, (tlx, tly), cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 0, 0), 1)

        # Show the frame
        cv2.imshow("Show FRAME Image", frame)

        # Press q to quit
        k = cv2.waitKey(10) & 0xFF
        if k == ord('q'):
            break

    cap.release()
    cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
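
The class_names list in the webcam script has 20 entries, while the earlier sample output includes 'truck', which is not among them. One way to give every label a color without maintaining a class list is to derive the hue from the label text itself. This is a sketch under that assumption, using only the standard library. `color_for_label` is a hypothetical helper, not something darkflow or OpenCV provides.

```python
import colorsys
import zlib


def color_for_label(label):
    """Map a label to a stable BGR color tuple, the channel order OpenCV uses."""
    # crc32 is stable across runs, unlike Python's built-in hash() for strings
    hue = (zlib.crc32(label.encode("utf-8")) % 360) / 360.0
    r, g, b = colorsys.hsv_to_rgb(hue, 0.5, 1.0)
    return int(b * 255), int(g * 255), int(r * 255)


for name in ["dog", "truck", "bicycle"]:
    print(name, color_for_label(name))
```

The returned tuple can be passed wherever the script passes class_colors[class_num] to cv2.rectangle. The trade-off is that two labels can land on similar hues, which an evenly spaced palette avoids.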