How to convert annotation JSON created in Label Studio for use with Create ML

Last updated at 2024-12-03, posted at 2024-12-03

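When you build an object detection model in Create ML, you need an annotation JSON file. However, the JSON that Label Studio exports cannot be used as-is; it has to be converted to the format Apple requires.

This article explains how to do that conversion. It assumes basic familiarity with Label Studio and with creating annotations.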

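Steps

1. Create annotations in Label Studio
   Start Label Studio and create the annotations.
2. Export the annotations
   When you are done, click the Export button and export the annotations as JSON.
3. Export the data in YOLO format
   The JSON export refers to images by the file names Label Studio renamed them to, but it does not contain those image files. Export the data in YOLO format as well; that export also gives you the renamed images.
4. Convert to Create ML JSON with a Python script
   Use the following Python script to convert the Label Studio JSON into Create ML JSON.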

createml_annotation_converter.py

import json
import os
import sys

# Expect exactly three arguments: input JSON, output directory, output file name.
if len(sys.argv) != 4:
    print("Usage: python createml_annotation_converter.py <input_json> <output_json_path> <output_json_file>")
    sys.exit(1)

input_json = sys.argv[1]
output_json_path = sys.argv[2]
output_json_file = sys.argv[3]

with open(input_json, "r", encoding="utf-8") as f:
    original_data = json.load(f)

create_ml_data = []

for item in original_data:
    # Label Studio stores a path or URL here; keep only the file name.
    image_name = os.path.basename(item["data"]["image"])
    annotations = []

    for annotation in item["annotations"]:
        for result in annotation["result"]:
            if result["type"] == "rectanglelabels":
                value = result["value"]
                label = value["rectanglelabels"][0]

                # Label Studio stores x, y (top-left corner), width and height
                # as percentages of the image size; convert them to pixels.
                # Any box rotation is ignored.
                width = value["width"] / 100 * result["original_width"]
                height = value["height"] / 100 * result["original_height"]

                # Create ML expects x and y to be the center of the box.
                x_center = value["x"] / 100 * result["original_width"] + width / 2
                y_center = value["y"] / 100 * result["original_height"] + height / 2

                annotations.append({
                    "label": label,
                    "coordinates": {
                        "x": round(x_center),
                        "y": round(y_center),
                        "width": round(width),
                        "height": round(height)
                    }
                })
            else:
                print(f"Skipping result of type {result['type']}")

    # Key names follow Apple's "Building an Object Detector Data Source" documentation.
    create_ml_data.append({
        "image": image_name,
        "annotations": annotations
    })

output_path = os.path.join(output_json_path, output_json_file)
with open(output_path, "w", encoding="utf-8") as f:
    json.dump(create_ml_data, f, indent=4)

print(f"Converted data saved to {output_path}")
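
The coordinate handling is the easiest part of this conversion to get wrong, so here is a minimal sketch of it in isolation. It assumes that Label Studio's rectangle values are percentages of the image size with x and y at the top-left corner, and that Create ML wants pixel values with x and y at the center of the box. The function name `to_createml_box` is hypothetical and not part of either tool.

```python
def to_createml_box(value, original_width, original_height):
    """Convert one Label Studio rectangle (percent, top-left origin)
    into Create ML style coordinates (pixels, box center)."""
    width = value["width"] / 100 * original_width
    height = value["height"] / 100 * original_height
    # Shift from the top-left corner to the center of the box.
    x_center = value["x"] / 100 * original_width + width / 2
    y_center = value["y"] / 100 * original_height + height / 2
    return {
        "x": round(x_center),
        "y": round(y_center),
        "width": round(width),
        "height": round(height),
    }


# A box starting at (10%, 20%) that covers 30% x 40% of a 1000x500 image.
box = to_createml_box({"x": 10, "y": 20, "width": 30, "height": 40}, 1000, 500)
print(box)  # {'x': 250, 'y': 200, 'width': 300, 'height': 200}
```

In this example the top-left corner is at (100, 100) pixels and the box is 300 x 200 pixels, so the center ends up at (250, 200).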

How to run

Save the code above under any file name (e.g. createml_annotation_converter.py).
Then run the following command in a terminal.

python createml_annotation_converter.py <full path to the JSON exported from Label Studio> <output directory> <output file name>

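Caveats

The script does not cover every case. It only converts results of type rectanglelabels and skips everything else, so depending on how the annotations were made and on the JSON format, the conversion may fail.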
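
Before converting, it can help to see what an export actually contains. The following is a rough sketch that counts the result types in a Label Studio export and the number of rotated rectangles. The function name `summarize_export` is hypothetical, and the sketch assumes each task has an "annotations" list whose entries have a "result" list, and that rectangle values may carry a "rotation" in degrees.

```python
from collections import Counter


def summarize_export(tasks):
    """Count result types and rotated boxes in a Label Studio JSON export."""
    type_counts = Counter()
    rotated = 0
    for task in tasks:
        for annotation in task.get("annotations", []):
            for result in annotation.get("result", []):
                type_counts[result.get("type", "unknown")] += 1
                # Assumption: a missing "rotation" means the box is not rotated.
                if result.get("value", {}).get("rotation", 0) != 0:
                    rotated += 1
    return type_counts, rotated


# Tiny hand-made sample standing in for json.load() of a real export.
sample = [
    {"annotations": [{"result": [
        {"type": "rectanglelabels", "value": {"x": 1, "y": 2, "width": 3, "height": 4, "rotation": 0}},
        {"type": "choices", "value": {"choices": ["ok"]}},
        {"type": "rectanglelabels", "value": {"x": 5, "y": 5, "width": 10, "height": 10, "rotation": 15}},
    ]}]}
]

type_counts, rotated = summarize_export(sample)
print(dict(type_counts), rotated)
```

Any type other than rectanglelabels in the summary is something a rectangle-only converter will skip, and a non-zero rotated count means some boxes would need extra handling.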

Using the converted data in Create ML

Create a folder and put the following in it:
- The converted JSON
- The annotated images
Add that folder as the Training Data in Create ML and you can start training.
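
A quick check that every image named in the JSON is really in the folder can save a failed training run. This is only a sketch: the function name `find_missing_images` is hypothetical, and it looks up the file name under both "image" and "imagefilename" because which key your file uses depends on the converter.

```python
import json
import os


def find_missing_images(folder, json_name):
    """Return the image names listed in the annotation JSON that are
    not present as files in the same folder."""
    with open(os.path.join(folder, json_name), encoding="utf-8") as f:
        entries = json.load(f)

    missing = []
    for entry in entries:
        # Assumption: the file name is stored under one of these two keys.
        name = entry.get("image") or entry.get("imagefilename")
        if not name or not os.path.isfile(os.path.join(folder, name)):
            missing.append(name)
    return missing
```

Calling it with the training folder and the JSON file name returns an empty list when everything is in place.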

Apple's annotation JSON format

Official documentation:
Building an Object Detector Data Source

I hope this helps anyone who needs it!
