
Using the COCO Format (pycocotools), with RLE Compression and Decompression


Posted at 2019-09-19, last updated at 2019-09-19

0. Overview

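Working with the COCO format means converting polygons to pixel masks, computing areas, and sometimes run-length encoding (RLE) the masks, all of which takes a fair amount of effort.
The COCO tools (pycocotools) are provided for exactly this purpose, so I want to use them to develop more efficiently.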

1. Installation

Install the COCO API with the following command:

pip install git+https://github.com/waleedka/coco.git#subdirectory=PythonAPI

Check that the installation succeeded:

$ python
>>> from pycocotools.coco import COCO

2. Run-length encoding (RLE)

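So what is the `counts` field in an annotation in the first place?
It is a run-length encoding of a binary mask. The `size` field gives the mask's height and width, and `counts` records, in column-major order, the lengths of alternating runs of 0 (background) and 1 (filled) pixels, starting with a run of 0s. Note that the mask covers the whole `size` area, not only the bbox. In the compressed form, these run lengths are further packed into a compact byte string.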
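To make the run-length idea concrete, here is a minimal pure-Python sketch of the uncompressed form, where `counts` is a plain list of integers. The function names `mask_to_uncompressed_rle` and `uncompressed_rle_to_mask` are made up for this illustration and are not part of pycocotools; the sketch assumes the mask is a list of rows containing only 0 and 1.

```python
from itertools import groupby


def mask_to_uncompressed_rle(binary_mask):
    """Encode a list-of-rows 0/1 mask as {'size': [h, w], 'counts': [...]}."""
    height = len(binary_mask)
    width = len(binary_mask[0])
    # Read the pixels column by column (column-major order).
    flat = [binary_mask[r][c] for c in range(width) for r in range(height)]
    counts = []
    # The first run always counts 0s, so emit an empty run if the mask starts with 1.
    if flat and flat[0] == 1:
        counts.append(0)
    for _, run in groupby(flat):
        counts.append(len(list(run)))
    return {"size": [height, width], "counts": counts}


def uncompressed_rle_to_mask(rle):
    """Rebuild the list-of-rows mask from an uncompressed RLE dict."""
    height, width = rle["size"]
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... starting with 0
    for run_length in rle["counts"]:
        flat.extend([value] * run_length)
        value = 1 - value
    # Undo the column-major flattening.
    return [[flat[c * height + r] for c in range(width)] for r in range(height)]


tiny_mask = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]
rle = mask_to_uncompressed_rle(tiny_mask)
print(rle)  # {'size': [3, 4], 'counts': [2, 3, 1, 2, 4]}
print(uncompressed_rle_to_mask(rle) == tiny_mask)  # True
```

Reading column by column gives 0,0,1 / 1,1,0 / 1,1,0 / 0,0,0, which is 2 zeros, 3 ones, 1 zero, 2 ones, and 4 zeros.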

The following code encodes a binary mask into RLE with pycocotools and then decodes it back.

import numpy as np
from pycocotools import mask

ground_truth_binary_mask = np.array([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  1,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0]], dtype=np.uint8)

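# pycocotools expects a Fortran-ordered (column-major) uint8 array
fortran_ground_truth_binary_mask = np.asfortranarray(ground_truth_binary_mask)
# Encode the mask into compressed RLE: a dict with 'size' and 'counts'
encoded_ground_truth = mask.encode(fortran_ground_truth_binary_mask)
print(encoded_ground_truth)
# Decode the RLE back into the original binary mask
print(mask.decode(encoded_ground_truth))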

The output is as follows.

{'size': [9, 10], 'counts': b'61X13mN000`0'}
[[0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 1 1 1 0 0]
 [0 0 0 0 0 1 1 1 0 0]
 [0 0 0 0 0 1 1 1 0 0]
 [0 0 0 0 0 1 1 1 0 0]
 [1 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]
 [0 0 0 0 0 0 0 0 0 0]]
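The `counts` byte string in this output can also be unpacked by hand. The sketch below is a pure-Python reading of the compressed string, written from my understanding of the decoding in the COCO API's C code, so treat it as an illustration rather than a reference implementation. The name `decode_compressed_counts` is hypothetical and not a pycocotools function.

```python
def decode_compressed_counts(counts):
    """Turn a compressed COCO `counts` string into a list of run lengths."""
    if isinstance(counts, str):
        counts = counts.encode("ascii")
    runs = []
    pos = 0
    while pos < len(counts):
        value = 0
        chunk_index = 0
        more = True
        while more:
            chunk = counts[pos] - 48  # characters are offset by 48 ('0')
            value |= (chunk & 0x1F) << (5 * chunk_index)  # low 5 bits are payload
            more = bool(chunk & 0x20)  # bit 5 set means another chunk follows
            pos += 1
            chunk_index += 1
            if not more and (chunk & 0x10):
                value |= -1 << (5 * chunk_index)  # sign-extend negative values
        if len(runs) > 2:
            value += runs[-2]  # later runs are stored as a delta to the run two back
        runs.append(value)
    return runs


runs = decode_compressed_counts(b"61X13mN000`0")
print(runs)             # [6, 1, 40, 4, 5, 4, 5, 4, 21]
print(sum(runs))        # 90, i.e. 9 * 10 pixels
print(sum(runs[1::2]))  # 13, the number of filled pixels (the mask area)
```

The run lengths sum to 90, matching the 9 x 10 `size`, and the odd-indexed runs (the runs of 1s) sum to 13, which is the number of 1s in the decoded mask above.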

Saving the mask as an image

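import cv2
from pycocotools import mask

rle = {'size': [9, 10], 'counts': b'61X13mN000`0'}
# decode() returns a 0/1 uint8 array; scale it to 0/255 so the mask is visible
dec_rle = mask.decode(rle) * 255
cv2.imwrite("segmentation.png", dec_rle)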
