
YOLOv3: Processing the annotation data XML

Last updated at 2020-03-03, posted at 2019-10-04

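Background

This article uses the Keras version of YOLOv3.

In a previous article, I created the annotation data.
The next step is to convert that annotation data, which is in original image coordinates,
into annotation data for the (416, 416) size used for YOLOv3 training.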

YOLOv3 annotation data

The Keras version of YOLOv3 uses XML files in Pascal VOC format.

The contents look like this.

pic_(1)_.xml

<annotation>
    <filename>pic_(1)_.jpg</filename>
    <source>
        <database>original</database>
        <annotation>original</annotation>
        <image>XXX</image>
        <flickrid>0</flickrid>
    </source>
    <owner>
        <flickrid>0</flickrid>
        <name>?</name>
    </owner>
    <size>
        <width> 748</width>
        <height> 558</height>
        <depth>3</depth>
    </size>
    <object>
        <name>pudding</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin> 282</xmin>
            <ymin> 163</ymin>
            <xmax> 476</xmax>
            <ymax> 285</ymax>
        </bndbox>
    </object>
</annotation>

The XML file can be inspected in Chrome or in a text editor.
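The same file can also be inspected from Python. The following is a minimal sketch using the standard library's xml.etree.ElementTree; the XML is a trimmed copy of the sample above embedded as a string (an assumption made so the example is self-contained), and the variable names are illustrative only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample annotation, embedded as a string for illustration.
sample_xml = """\
<annotation>
    <filename>pic_(1)_.jpg</filename>
    <size>
        <width> 748</width>
        <height> 558</height>
        <depth>3</depth>
    </size>
    <object>
        <name>pudding</name>
        <bndbox>
            <xmin> 282</xmin>
            <ymin> 163</ymin>
            <xmax> 476</xmax>
            <ymax> 285</ymax>
        </bndbox>
    </object>
</annotation>"""

root = ET.fromstring(sample_xml)

filename = root.find('filename').text
# int() tolerates the leading space in values such as " 748".
width = int(root.find('size/width').text)
height = int(root.find('size/height').text)

# Collect (class name, (xmin, ymin, xmax, ymax)) for every object.
boxes = []
for obj in root.iter('object'):
    bndbox = obj.find('bndbox')
    coords = tuple(int(bndbox.find(tag).text)
                   for tag in ('xmin', 'ymin', 'xmax', 'ymax'))
    boxes.append((obj.find('name').text, coords))

print(filename, width, height, boxes)
```

For a file on disk, ET.parse(path).getroot() returns the same kind of root element.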

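Conversion code

Apologies that the code is not very elegant.
It reads the annotation data at the original image size and extracts only the elements it needs:
the image file name, the image size, the class name, and the bounding box coordinates (top-left and bottom-right).
The extracted values are scaled to the new size and written into a new XML template.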
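The scaling itself is a per-axis multiplication. The sketch below illustrates it with the values from the sample file; scale_box is a hypothetical helper name, not part of the script, and it assumes the image is resized to 416 x 416 by stretching each axis independently.

```python
def scale_box(box, img_w, img_h, new_size=416):
    """Scale (xmin, ymin, xmax, ymax) from an img_w x img_h image
    to a new_size x new_size image (hypothetical helper)."""
    x_scale = new_size / img_w  # horizontal scale factor
    y_scale = new_size / img_h  # vertical scale factor
    xmin, ymin, xmax, ymax = box
    return (round(xmin * x_scale), round(ymin * y_scale),
            round(xmax * x_scale), round(ymax * y_scale))


# Sample annotation: 748 x 558 image, box (282, 163) to (476, 285).
scaled = scale_box((282, 163, 476, 285), img_w=748, img_h=558)
print(scaled)  # (157, 122, 265, 212)
```

Because the two scale factors differ (416/748 versus 416/558), the box changes aspect ratio together with the image.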

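Folder contents

The script reads the original files from xml_files/ and writes the converted files to new_xml_files/.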

Reference code

anathor_xml_maker.py


# Read each original XML file, scale its values to the training size,
# and write the result to a new XML file.
import os
import xml.etree.ElementTree as ET

original_xml_path = 'xml_files/pic_({})_.xml'
new_xml_path = 'new_xml_files/pic_({})_.xml'

size_changed = 416  # target image size (width and height)

# Template for the whole file; the <object> blocks are filled in last.
annotation_template = '''\
<annotation>
    <filename>{}</filename>
    <source>
        <database>original</database>
        <annotation>original</annotation>
        <image>XXX</image>
        <flickrid>0</flickrid>
    </source>
    <owner>
        <flickrid>0</flickrid>
        <name>?</name>
    </owner>
    <size>
        <width>{}</width>
        <height>{}</height>
        <depth>3</depth>
    </size>
{}
</annotation>'''

# Template for one object (class name and bounding box).
object_template = '''\
    <object>
        <name>{}</name>
        <pose>Unspecified</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>{}</xmin>
            <ymin>{}</ymin>
            <xmax>{}</xmax>
            <ymax>{}</ymax>
        </bndbox>
    </object>'''

# Create the output folder if it does not exist yet.
os.makedirs(os.path.dirname(new_xml_path), exist_ok=True)

for i in range(1, 1001):
    xml_file = original_xml_path.format(i)
    if not os.path.exists(xml_file):
        continue

    root = ET.parse(xml_file).getroot()

    # Image file name
    img_name = root.find('filename').text

    # Image size (width and height)
    img_size = root.find('size')
    img_w = int(img_size.find('width').text)
    img_h = int(img_size.find('height').text)

    x_ = size_changed / img_w  # scale factor along the X axis
    y_ = size_changed / img_h  # scale factor along the Y axis

    # To write the files at the original scale instead:
    # x_ = 1
    # y_ = 1

    all_list = []
    objects = []
    # Scale every object, not only the last one, and keep its class name.
    for obj in root.iter('object'):
        cls = obj.find('name').text
        xmlbox = obj.find('bndbox')
        re_xmin = round(int(xmlbox.find('xmin').text) * x_)
        re_ymin = round(int(xmlbox.find('ymin').text) * y_)
        re_xmax = round(int(xmlbox.find('xmax').text) * x_)
        re_ymax = round(int(xmlbox.find('ymax').text) * y_)

        all_list.append([img_name, cls, re_xmin, re_ymin, re_xmax, re_ymax])
        objects.append(object_template.format(cls, re_xmin, re_ymin, re_xmax, re_ymax))

    print(all_list)
    print("Image size: width = {}, height = {}".format(img_w, img_h))

    img_name = "pic_({})_.jpg".format(i)
    re_img_w = round(img_w * x_)
    re_img_h = round(img_h * y_)
    print(re_img_w, re_img_h)

    # Write the result into the new folder.
    with open(new_xml_path.format(i), 'w') as f:
        f.write(annotation_template.format(img_name, re_img_w, re_img_h, '\n'.join(objects)))
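As an alternative to filling a string template, the parsed tree can be updated in place, which keeps every object and its class name without listing them in a template. This is only a sketch under stated assumptions: resize_annotation is a hypothetical helper, the sample XML with a second "cup" object is made up for illustration, and the image is assumed to be stretched to 416 x 416.

```python
import xml.etree.ElementTree as ET


def resize_annotation(root, new_size=416):
    """Scale <size> and every <bndbox> of a Pascal VOC tree in place
    (hypothetical helper)."""
    size = root.find('size')
    img_w = int(size.find('width').text)
    img_h = int(size.find('height').text)
    x_scale = new_size / img_w
    y_scale = new_size / img_h

    size.find('width').text = str(new_size)
    size.find('height').text = str(new_size)

    for box in root.iter('bndbox'):
        for tag, scale in (('xmin', x_scale), ('ymin', y_scale),
                           ('xmax', x_scale), ('ymax', y_scale)):
            node = box.find(tag)
            node.text = str(round(int(node.text) * scale))
    return root


# Made-up sample with two objects, embedded for illustration.
sample = """\
<annotation>
    <filename>pic_(1)_.jpg</filename>
    <size><width> 748</width><height> 558</height><depth>3</depth></size>
    <object>
        <name>pudding</name>
        <bndbox><xmin> 282</xmin><ymin> 163</ymin><xmax> 476</xmax><ymax> 285</ymax></bndbox>
    </object>
    <object>
        <name>cup</name>
        <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>100</xmax><ymax>200</ymax></bndbox>
    </object>
</annotation>"""

root = resize_annotation(ET.fromstring(sample))
resized_boxes = [[int(b.find(t).text) for t in ('xmin', 'ymin', 'xmax', 'ymax')]
                 for b in root.iter('bndbox')]
print(resized_boxes)
print(ET.tostring(root, encoding='unicode'))
```

With files on disk, the same function could be applied to ET.parse(path).getroot() and the tree saved with its write() method.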

References

Pascal VOC形式のxmlファイルをPythonで読む (Reading Pascal VOC format XML files in Python)
