
Training the Instance Segmentation model "Mask-RCNN" with Keras

Posted at 2022-10-26 (last updated at 2022-10-26)

0. Introduction

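Semantic Segmentation had barely had its moment before Instance Segmentation, a more advanced form of the task, rose to prominence.
It combines bounding-box (BBOX) detection, as done by YOLO, with segmentation inside each BBOX, which makes it a form of multi-task learning.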

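One of its notable features is that the multi-task formulation improved IoU accuracy.
In this article, the goal is to become able to use Mask-RCNN, which is probably the best-known Instance Segmentation model.
Note that Mask-RCNN is not the state of the art (SOTA).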

1. Installation

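First, we set up an environment that can produce results such as the balloon extraction shown in section 2.

The MSCOCO dataset is large, so it is not used here.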

Pull

We use the Keras implementation of Mask-RCNN from the following repository.
https://github.com/matterport/Mask_RCNN

First, clone it.

$ git clone https://github.com/matterport/Mask_RCNN
$ cd Mask_RCNN

Environment

Prepare the following environment. It will probably still work if your versions differ slightly.

CUDA=10.0
CUDNN=7.6.2
python=3.7
numpy
scipy
Pillow
cython
matplotlib
scikit-image
tensorflow>=1.3.0
keras>=2.0.8
opencv-python
h5py
imgaug

For now, set up the environment by following the repository's instructions.

$ pip install -r requirements.txt
$ python setup.py install

If no errors occur, the environment should now be ready.
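As a quick sanity check after installation, the sketch below compares installed package versions against the minimums listed above. It is only an illustration: the helper names `version_tuple` and `check_installed` are hypothetical, and it assumes the packages are registered under the distribution names `tensorflow` and `keras` (a GPU build may be registered under a different name).

```python
import re

try:
    from importlib import metadata  # Python 3.8+
except ImportError:  # Python 3.7: assumes the importlib_metadata backport is installed
    import importlib_metadata as metadata

# Minimum versions taken from the environment list in this article.
MINIMUMS = {"tensorflow": "1.3.0", "keras": "2.0.8"}


def version_tuple(version):
    """Turn '2.3.0' into (2, 3, 0); only the first three numeric fields are kept."""
    return tuple(int(n) for n in re.findall(r"\d+", version)[:3])


def check_installed(minimums=MINIMUMS):
    """Return a dict mapping each package name to a short status string."""
    report = {}
    for package, minimum in minimums.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = "not installed"
            continue
        ok = version_tuple(installed) >= version_tuple(minimum)
        report[package] = "{} ({})".format(
            installed, "ok" if ok else "older than " + minimum)
    return report


if __name__ == "__main__":
    for package, status in check_installed().items():
        print(package, status)
```

A package reported as "not installed" may simply be registered under another distribution name, so treat the report as a hint rather than a verdict.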

Download

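Next come the dataset and the weights. The root folder is assumed to be the Mask_RCNN directory.

Create a datasets folder, then download the dataset and the pre-trained weights into it.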

$ mkdir datasets
$ cd datasets/
$ wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/balloon_dataset.zip
$ wget https://github.com/matterport/Mask_RCNN/releases/download/v2.1/mask_rcnn_balloon.h5

Next, unzip the archive.

$ unzip balloon_dataset.zip

After unzipping, the folder structure looks like this.

balloon
|- train
  |- xxx.jpg
  |- via_region_data.json
  |- ...
|- val
  |- xxx.jpg
  |- via_region_data.json
  |- ...

This completes the environment setup.
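To confirm that the unzipped dataset matches the layout above before running anything heavier, a small check like the following may help. The function name `missing_dataset_items` is hypothetical, and the check only looks for the `train` and `val` folders and their `via_region_data.json` files.

```python
import os


def missing_dataset_items(dataset_dir, subsets=("train", "val")):
    """Return the paths required by the balloon layout that do not exist."""
    missing = []
    for subset in subsets:
        subset_dir = os.path.join(dataset_dir, subset)
        annotation = os.path.join(subset_dir, "via_region_data.json")
        if not os.path.isdir(subset_dir):
            # The whole subset folder is absent, so report only the folder.
            missing.append(subset_dir)
        elif not os.path.isfile(annotation):
            missing.append(annotation)
    return missing


if __name__ == "__main__":
    # Run from the Mask_RCNN root folder, where datasets/balloon was unzipped.
    for path in missing_dataset_items(os.path.join("datasets", "balloon")):
        print("missing:", path)
```

An empty result means both subsets and their annotation files were found.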

2. Prediction

Now we run prediction using the dataset and weights prepared during setup.

First, move from the root folder to the following directory.

$ cd samples/balloon/

Then run the following.

$ python balloon.py splash --weights=../../datasets/mask_rcnn_balloon.h5 --image=../../datasets/balloon/val/5603212091_2dfe16ea72_b.jpg

A log like the following is displayed.

Configurations:
BACKBONE                       resnet101
BACKBONE_STRIDES               [4, 8, 16, 32, 64]
BATCH_SIZE                     1
BBOX_STD_DEV                   [0.1 0.1 0.2 0.2]
COMPUTE_BACKBONE_SHAPE         None
DETECTION_MAX_INSTANCES        100
DETECTION_MIN_CONFIDENCE       0.9
DETECTION_NMS_THRESHOLD        0.3
FPN_CLASSIF_FC_LAYERS_SIZE     1024
GPU_COUNT                      1
GRADIENT_CLIP_NORM             5.0
IMAGES_PER_GPU                 1
IMAGE_CHANNEL_COUNT            3
IMAGE_MAX_DIM                  1024
IMAGE_META_SIZE                14
IMAGE_MIN_DIM                  800
IMAGE_MIN_SCALE                0
IMAGE_RESIZE_MODE              square
IMAGE_SHAPE                    [1024 1024    3]
LEARNING_MOMENTUM              0.9
LEARNING_RATE                  0.001
...
Saved to  splash_20200304T125953.png
...

When processing finishes, splash_20200304T125953.png has been saved in the same folder.
Opening it confirms that only the balloons have been extracted successfully.

3. Train

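We train on the existing balloon dataset.

Fix for `Keras >= 2.3.0`

With Keras 2.3.0 or later, the code does not run as-is, so we modify it.
First, change the import source from the installed package to the local folder.

Edit ./samples/balloon/balloon.py as follows.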

balloon.py

...
# Root directory of the project
ROOT_DIR = os.path.abspath("../../")

# Import Mask RCNN
#sys.path.append(ROOT_DIR)  # To find local version of the library
#from mrcnn.config import Config
#from mrcnn import model as modellib, utils

# Root directory of the project
sys.path.append(os.path.abspath("../../mrcnn"))
# Import Mask RCNN
import model as modellib, utils
from config import Config

# Path to trained weights file
COCO_WEIGHTS_PATH = os.path.join(ROOT_DIR, "mask_rcnn_coco.h5")

# Directory to save logs and model checkpoints, if not provided
# through the command line argument --logs
DEFAULT_LOGS_DIR = os.path.join(ROOT_DIR, "logs")
...

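Changing the import section at the top in this way makes edits to the local source code take effect.
With the default setup, mrcnn is imported from site-packages, which makes it hard to modify.

Next, from the same folder, open ../../mrcnn/model.py and change it as follows.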

model.py

...
        for name in loss_names:
            if name in self.keras_model.metrics_names:
                continue
            layer = self.keras_model.get_layer(name)
            #self.keras_model.metrics_names.append(name)
            loss = (
                tf.reduce_mean(layer.output, keepdims=True)
                * self.config.LOSS_WEIGHTS.get(name, 1.))
            #self.keras_model.metrics_tensors.append(loss)
            self.keras_model.add_metric(loss, name)
...

The API changed: self.keras_model.metrics_names.append(name) and self.keras_model.metrics_tensors.append(loss) have been
replaced by self.keras_model.add_metric(loss, name), so these lines are rewritten accordingly.
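If the same model.py has to run on both older and newer Keras versions, one option is to pick the API at runtime. The sketch below is an assumption-laden illustration, not part of the repository: `register_loss_metric` is a hypothetical helper, and it assumes that older Keras models lack `add_metric` while still exposing `metrics_names` and `metrics_tensors`, which has not been verified for every release.

```python
def register_loss_metric(keras_model, name, loss):
    """Attach `loss` as a named metric using whichever API the model offers."""
    if hasattr(keras_model, "add_metric"):
        # Newer API, as used in the fix above. Passing the name by keyword
        # avoids depending on the positional order of add_metric's arguments.
        keras_model.add_metric(loss, name=name)
    else:
        # Older API that the original model.py relied on.
        keras_model.metrics_names.append(name)
        keras_model.metrics_tensors.append(loss)
```

Inside the loop shown above, the two commented-out lines and the `add_metric` call would then collapse into a single `register_loss_metric(self.keras_model, name, loss)` call.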

Training

Training can now be started with the following command.

$ python balloon.py train --dataset=../../datasets/balloon/ --weights=imagenet

Specifying weights with --weights enables transfer learning.
For example, the following trains starting from the most recently saved weights.

 --weights=last
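The values that --weights accepts can be summarized roughly as follows. This is a simplified sketch based on the options that appear in this article (`imagenet`, `last`, the `COCO_WEIGHTS_PATH` constant, and an explicit .h5 path); `describe_weights_option` is a hypothetical helper, so check the argument handling in your own copy of balloon.py before relying on it.

```python
def describe_weights_option(value):
    """Summarize what a --weights value is expected to select (assumed behavior)."""
    special = {
        "coco": "COCO pre-trained weights (mask_rcnn_coco.h5 in the root folder)",
        "last": "the most recently saved weights from a previous training run",
        "imagenet": "ImageNet pre-trained weights",
    }
    # Anything else is treated as a path to a weights file.
    return special.get(value.lower(), "weights file at path: " + value)


if __name__ == "__main__":
    for option in ("coco", "last", "imagenet", "../../datasets/mask_rcnn_balloon.h5"):
        print(option, "->", describe_weights_option(option))
```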

Training without weights

To train from scratch, comment out the weight loading as follows.

balloon.py

    # Load weights
#    print("Loading weights ", weights_path)
#    if args.weights.lower() == "coco":
#        # Exclude the last layers because they require a matching
#        # number of classes
#        model.load_weights(weights_path, by_name=True, exclude=[
#            "mrcnn_class_logits", "mrcnn_bbox_fc",
#            "mrcnn_bbox", "mrcnn_mask"])
#    else:
#        model.load_weights(weights_path, by_name=True)
#
    # Train or evaluate
    if args.command == "train":
        train(model)
    elif args.command == "splash":
        detect_and_color_splash(model, image_path=args.image,
                                video_path=args.video)

The command looks like this.

$ python balloon.py train --dataset=../../datasets/balloon/ --weights=
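As an alternative to commenting the block out, the loading step could be skipped only when --weights is empty, so that the same script still supports transfer learning. This is a sketch of that idea, not the article's method: `load_weights_if_given` is a hypothetical helper that mirrors the commented-out code above, and `model` is assumed to expose the same `load_weights(path, by_name=..., exclude=...)` call used there.

```python
def load_weights_if_given(model, weights_arg, weights_path):
    """Load weights unless --weights was empty. Returns True if weights were loaded."""
    if not weights_arg:
        # Empty --weights= : keep the randomly initialized model.
        print("No weights given; training from scratch")
        return False
    print("Loading weights ", weights_path)
    if weights_arg.lower() == "coco":
        # Exclude the last layers because they require a matching
        # number of classes
        model.load_weights(weights_path, by_name=True, exclude=[
            "mrcnn_class_logits", "mrcnn_bbox_fc",
            "mrcnn_bbox", "mrcnn_mask"])
    else:
        model.load_weights(weights_path, by_name=True)
    return True
```

In balloon.py this would be called as `load_weights_if_given(model, args.weights, weights_path)` in place of the loading block.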

Once training starts, output like the following is displayed.

100/100 [==============================] - 6s 6ms/step - loss: 11.5805 - rpn_class_loss: 0.6961 - rpn_bbox_loss: 6.0425 - mrcnn_class_loss: 0.3481 - mrcnn_bbox_loss: 4.0630 - mrcnn_mask_loss: 0.4308 - val_loss: 22.7146 - val_rpn_class_loss: 3.1622 - val_rpn_bbox_loss: 10.6883 - val_mrcnn_class_loss: 0.0000e+00 - val_mrcnn_bbox_loss: 0.0000e+00 - val_mrcnn_mask_loss: 0.0000e+00
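A progress line like this is easier to compare across epochs once it is parsed into numbers. The following is a minimal sketch using only the standard library; `parse_keras_log` is a hypothetical helper, and the sample line is a shortened copy of the output above.

```python
import re

# Shortened copy of the progress line shown in this article.
LINE = ("100/100 [==============================] - 6s 6ms/step - loss: 11.5805"
        " - rpn_class_loss: 0.6961 - rpn_bbox_loss: 6.0425"
        " - mrcnn_class_loss: 0.3481 - mrcnn_bbox_loss: 4.0630"
        " - mrcnn_mask_loss: 0.4308 - val_loss: 22.7146"
        " - val_mrcnn_class_loss: 0.0000e+00")


def parse_keras_log(line):
    """Extract 'name: value' pairs from one progress line into a dict of floats."""
    pairs = re.findall(r"(\w+): ([-+]?\d+(?:\.\d+)?(?:[eE][-+]?\d+)?)", line)
    return {name: float(value) for name, value in pairs}


if __name__ == "__main__":
    metrics = parse_keras_log(LINE)
    parts = ["rpn_class_loss", "rpn_bbox_loss", "mrcnn_class_loss",
             "mrcnn_bbox_loss", "mrcnn_mask_loss"]
    print("total loss:", metrics["loss"])
    print("sum of parts:", round(sum(metrics[p] for p in parts), 4))
```

In this particular line the five training losses add up to the reported loss, which is a handy check that no component was dropped when the metrics were registered. The validation mrcnn losses are all zero here, which is worth investigating rather than accepting as a result.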

4. Train with custom datasets

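Next, we explain how to train with your own dataset.

Preparation

samples/balloon is easy to work with as a sample, so we use it as the base and modify it.
First, copy it, starting from the root folder.

$ cd samples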
$ cp -R balloon customDataset
$ cd customDataset

Dataset structure

Rewriting the relevant part below lets you load your own dataset.

balloon.py

class BalloonDataset(utils.Dataset):

    def load_balloon(self, dataset_dir, subset):
        """Load a subset of the Balloon dataset.
        dataset_dir: Root directory of the dataset.
        subset: Subset to load: train or val
        """
        # Add classes. We have only one class to add.
        self.add_class("balloon", 1, "balloon")

        # Train or validation dataset?
        assert subset in ["train", "val"]
        dataset_dir = os.path.join(dataset_dir, subset)

        # Load annotations
        # VGG Image Annotator (up to version 1.6) saves each image in the form:
        # { 'filename': '28503151_5b5b7ec140_b.jpg',
        #   'regions': {
        #       '0': {
        #           'region_attributes': {},
        #           'shape_attributes': {
        #               'all_points_x': [...],
        #               'all_points_y': [...],
        #               'name': 'polygon'}},
        #       ... more regions ...
        #   },
        #   'size': 100202
        # }
        # We mostly care about the x and y coordinates of each region
        # Note: In VIA 2.0, regions was changed from a dict to a list.
        annotations = json.load(open(os.path.join(dataset_dir, "via_region_data.json")))
        annotations = list(annotations.values())  # don't need the dict keys

        # The VIA tool saves images in the JSON even if they don't have any
        # annotations. Skip unannotated images.
        annotations = [a for a in annotations if a['regions']]

        # Add images
        for a in annotations:
            # Get the x, y coordinates of points of the polygons that make up
            # the outline of each object instance. These are stored in the
            # shape_attributes (see json format above)
            # The if condition is needed to support VIA versions 1.x and 2.x.
            if type(a['regions']) is dict:
                polygons = [r['shape_attributes'] for r in a['regions'].values()]
            else:
                polygons = [r['shape_attributes'] for r in a['regions']] 

            # load_mask() needs the image size to convert polygons to masks.
            # Unfortunately, VIA doesn't include it in JSON, so we must read
            # the image. This is only manageable since the dataset is tiny.
            image_path = os.path.join(dataset_dir, a['filename'])
            image = skimage.io.imread(image_path)
            height, width = image.shape[:2]

            self.add_image(
                "balloon",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)

There are two approaches.

The first is to create an annotation file in the following format.

        # { 'filename': '28503151_5b5b7ec140_b.jpg',
        #   'regions': {
        #       '0': {
        #           'region_attributes': {},
        #           'shape_attributes': {
        #               'all_points_x': [...],
        #               'all_points_y': [...],
        #               'name': 'polygon'}},
        #       ... more regions ...
        #   },
        #   'size': 100202
        # }
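For the first approach, an annotation file in this shape can be generated from your own polygon data. The sketch below is illustrative only: `via_entry` and `write_via_file` are hypothetical helpers, the top-level dictionary key is arbitrary because the loader above discards the keys, and `size` is simply passed through (it is assumed to be the image file size, which the loader does not read).

```python
import json


def via_entry(filename, size, polygons):
    """Build one image entry. `polygons` is a list of (xs, ys) pairs, one per object."""
    regions = {}
    for index, (xs, ys) in enumerate(polygons):
        if len(xs) != len(ys):
            raise ValueError("x and y coordinate lists must have equal length")
        regions[str(index)] = {
            "region_attributes": {},
            "shape_attributes": {
                "name": "polygon",
                "all_points_x": list(xs),
                "all_points_y": list(ys),
            },
        }
    return {"filename": filename, "size": size, "regions": regions}


def write_via_file(entries, path):
    """Write the entries as a JSON dict, keyed by filename (the key choice is arbitrary)."""
    data = {entry["filename"]: entry for entry in entries}
    with open(path, "w") as f:
        json.dump(data, f)


if __name__ == "__main__":
    # One triangle-shaped object in a hypothetical image.
    entry = via_entry("example.jpg", 100202, [([10, 50, 30], [10, 10, 40])])
    write_via_file([entry], "via_region_data.json")
```

The output file goes into the train or val folder next to the images it describes.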

The second is to feed your own dataset directly into the following structure.

            self.add_image(
                "balloon",
                image_id=a['filename'],  # use file name as a unique image id
                path=image_path,
                width=width, height=height,
                polygons=polygons)

This time we take the second approach.

The main data are image_path, width, height, and polygons.
They are represented as follows.

../../datasets/balloon/train/34020010494_e5cb88e1c4_k.jpg
2048
1536
[{'name': 'polygon', 'all_points_x': [1020, 1000, 994, 1003, 1023, 1050, 1089, 1134, 1190, 1265, 1321, 1361, 1403, 1428, 1442, 1445, 1441, 1427, 1400, 1361, 1316, 1269, 1228, 1198, 1207, 1210, 1190, 1177, 1172, 1174, 1170, 1153, 1127, 1104, 1061, 1032, 1020], 'all_points_y': [963, 899, 841, 787, 738, 700, 663, 638, 621, 619, 643, 672, 720, 765, 800, 860, 896, 942, 990, 1035, 1079, 1112, 1129, 1134, 1144, 1153, 1166, 1166, 1150, 1136, 1129, 1122, 1112, 1084, 1037, 989, 963]}]

Thus, for each image, it is enough to supply this set of values.
Note that this approach handles a single class only.
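The set of values for one image can be assembled and sanity-checked before it is handed to `add_image`. The following is a sketch under assumptions: `make_image_record` is a hypothetical helper, and the bounds check is an added precaution rather than something the original loader performs.

```python
import os


def make_image_record(image_path, width, height, polygons):
    """Validate one image's data and return keyword arguments for add_image()."""
    for polygon in polygons:
        xs = polygon["all_points_x"]
        ys = polygon["all_points_y"]
        if len(xs) != len(ys):
            raise ValueError("all_points_x and all_points_y differ in length")
        inside = (all(0 <= x < width for x in xs)
                  and all(0 <= y < height for y in ys))
        if not inside:
            raise ValueError("polygon point lies outside the image")
    return {
        "image_id": os.path.basename(image_path),  # file name as the image id
        "path": image_path,
        "width": width,
        "height": height,
        "polygons": polygons,
    }


if __name__ == "__main__":
    record = make_image_record(
        "../../datasets/balloon/train/34020010494_e5cb88e1c4_k.jpg", 2048, 1536,
        [{"name": "polygon",
          "all_points_x": [1020, 1000, 994],
          "all_points_y": [963, 899, 841]}])
    print(record["image_id"], record["width"], record["height"])
```

Inside the dataset class, each record would then be registered with `self.add_image("balloon", **record)`, matching the call shown above.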
