
Trying to Understand Part of Detectron's Internals, Part 1

Last updated at 2019-07-10. Posted at 2019-07-08.

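The parent article is "Videopose3Dを理解してみる(メモ)" (Trying to Understand Videopose3D, notes).
To understand infer_simple.py, which uses Detectron's functionality inside Videopose3D, I have taken a short detour here.
I want to get a rough understanding of the functions that are called from infer_simple.py.

Understanding initialize_model_from_cfg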

test_engine.py


def initialize_model_from_cfg(weights_file, gpu_id=0):
    """Initialize a model from the global cfg. Loads test-time weights and
    creates the networks in the Caffe2 workspace.
    """
    model = model_builder.create(cfg.MODEL.TYPE, train=False, gpu_id=gpu_id)

The weights_file passed here is args.weights.

Let's check what it holds at this point.

What is the value of args.weights at call time?

Input: model_final.pkl
args.weights = cache_url(args.weights, cfg.DOWNLOAD_CACHE)
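So weights_file is the COCO keypoints weights file that was given as the weights argument at the start.

As noted in the parent article, roughly speaking this function initializes the model from the global cfg, loads the weights, and creates the networks in the Caffe2 workspace. (This is only a loose understanding.)

So let's look at the functions in more detail.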

model_builder.py

def create(model_type_func, train=False, gpu_id=0):
    """Generic model creation function that dispatches to specific model
    building functions.
    By default, this function will generate a data parallel model configured to
    run on cfg.NUM_GPUS devices. However, you can restrict it to build a model
    targeted to a specific GPU by specifying gpu_id. This is used by
    optimizer.build_data_parallel_model() during test time.
    """
    model = DetectionModelHelper(
        name=model_type_func,
        train=train,
        num_classes=cfg.MODEL.NUM_CLASSES,
        init_params=train
    )
    model.only_build_forward_pass = False
    model.target_gpu_id = gpu_id
    return get_func(model_type_func)(model)

The DetectionModelHelper class lives in detectron/modeling/detector.py; only its __init__ is shown here.

detector.py

class DetectionModelHelper(cnn.CNNModelHelper):
    def __init__(self, **kwargs):
        # Handle args specific to the DetectionModelHelper, others pass through
        # to CNNModelHelper
        self.train = kwargs.get('train', False)
        self.num_classes = kwargs.get('num_classes', -1)
        assert self.num_classes > 0, 'num_classes must be > 0'
        for k in ('train', 'num_classes'):
            if k in kwargs:
                del kwargs[k]
        kwargs['order'] = 'NCHW'
        # Defensively set cudnn_exhaustive_search to False in case the default
        # changes in CNNModelHelper. The detection code uses variable size
        # inputs that might not play nicely with cudnn_exhaustive_search.
        kwargs['cudnn_exhaustive_search'] = False
        super(DetectionModelHelper, self).__init__(**kwargs)
        self.roi_data_loader = None
        self.losses = []
        self.metrics = []
        self.do_not_update_params = []  # Param on this list are not updated
        self.net.Proto().type = cfg.MODEL.EXECUTION_TYPE
        self.net.Proto().num_workers = cfg.NUM_GPUS * 4
        self.prev_use_cudnn = self.use_cudnn
        self.gn_params = []  # Param on this list are GroupNorm parameters
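
The __init__ above pulls out the arguments that belong to DetectionModelHelper and passes the rest through to the parent class. The following is a minimal sketch of that pattern only. BaseHelper and DetectionHelper are hypothetical stand-ins, not Caffe2's CNNModelHelper or Detectron's class, and the sketch uses kwargs.pop instead of the get-then-del sequence in the original.

```python
class BaseHelper(object):
    """Hypothetical parent class standing in for CNNModelHelper."""

    def __init__(self, name, order="NHWC"):
        self.name = name
        self.order = order


class DetectionHelper(BaseHelper):
    """Hypothetical subclass that consumes its own kwargs first."""

    def __init__(self, **kwargs):
        # Take the subclass-specific arguments out of kwargs so the parent
        # constructor never sees them.
        self.train = kwargs.pop("train", False)
        self.num_classes = kwargs.pop("num_classes", -1)
        assert self.num_classes > 0, "num_classes must be > 0"
        # Force a fixed value for an argument the parent understands.
        kwargs["order"] = "NCHW"
        super(DetectionHelper, self).__init__(**kwargs)


helper = DetectionHelper(name="demo", train=False, num_classes=2)
```

Whatever remains in kwargs (here only name and order) reaches the parent constructor unchanged.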

Here, create returns get_func(model_type_func)(model).
So in the code from the parent article, model is assigned the result of get_func(model_type_func)(model).

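What is dispatch?

You hear the word often, but what does dispatch actually mean?

Dispatch is an English word meaning to send off or to send out. In IT it often refers to picking one out of several targets of the same kind, sending data, allocating resources, invoking a function, and so on.

Quoted from IT用語辞典 (an IT glossary)

In other words!

create is a generic model creation function that dispatches (got it now!?) to specific model building functions. By default, this function generates a data parallel model configured to run on cfg.NUM_GPUS devices. However, by specifying gpu_id you can restrict it to build a model targeted to a specific GPU. This is used by optimizer.build_data_parallel_model() at test time.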
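
To make the idea of dispatching by name concrete, here is a minimal sketch. It is not Detectron's implementation: MODEL_BUILDERS, get_builder, build_small, and build_large are hypothetical names invented for this example, and the model is a plain dict rather than a DetectionModelHelper.

```python
def build_small(model):
    """Hypothetical builder for a small model."""
    model["layers"] = 2
    return model


def build_large(model):
    """Hypothetical builder for a large model."""
    model["layers"] = 8
    return model


# Registry mapping a model type name to the function that builds it.
MODEL_BUILDERS = {"small": build_small, "large": build_large}


def get_builder(name):
    """Look up a builder function by name (the dispatch step)."""
    try:
        return MODEL_BUILDERS[name]
    except KeyError:
        raise ValueError("Unknown model type: {}".format(name))


def create(model_type, gpu_id=0):
    """Generic entry point: set up shared state, then dispatch."""
    model = {"name": model_type, "target_gpu_id": gpu_id}
    return get_builder(model_type)(model)


small = create("small")
large = create("large", gpu_id=1)
```

The caller only ever calls create; which builder actually runs is decided by the name, which is the same shape as get_func(model_type_func)(model).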

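What is a data parallel model?

This article (in English) and this article (in Japanese) seem to cover it in detail!

[What is distributed deep learning?]
・Distributed deep learning comes in two forms: data parallelism and model parallelism.
・Distributed deep learning across multiple processes and multiple nodes makes it possible to train neural networks faster.
・Data parallelism copies the same model to every process and trains the copies together, which multiplies the effective batch size by the number of processes and speeds up training.
・Model parallelism splits a single model across multiple processes, which cooperate to train that one model.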
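
The data parallel idea in the list above can be sketched in a few lines of plain Python. This is a toy illustration under stated assumptions, not how Detectron or Caffe2 implement it: the "workers" are simulated sequentially, the model is a single weight w fitted to y = w * x, and the helper names gradient, split, and data_parallel_step are made up for this example.

```python
def gradient(w, batch):
    """Mean gradient of 0.5 * (w * x - y) ** 2 with respect to w."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def split(batch, num_workers):
    """Split a batch into equally sized shards, one per worker.

    Assumes len(batch) is divisible by num_workers.
    """
    size = len(batch) // num_workers
    return [batch[i * size:(i + 1) * size] for i in range(num_workers)]


def data_parallel_step(w, batch, num_workers, lr):
    """One SGD step where every worker holds the same copy of w."""
    shards = split(batch, num_workers)
    # Each worker computes a gradient on its own shard with the same w.
    grads = [gradient(w, shard) for shard in shards]
    # Averaging the per-worker gradients stands in for the all-reduce step.
    avg_grad = sum(grads) / len(grads)
    return w - lr * avg_grad


batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w_parallel = data_parallel_step(0.0, batch, num_workers=2, lr=0.1)
w_single = 0.0 - 0.1 * gradient(0.0, batch)
```

With equally sized shards, the averaged gradient equals the gradient over the whole batch, so w_parallel and w_single agree. The benefit in a real setup is that the shards are processed at the same time on different devices.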

Now let's go back to test_engine.py!

test_engine.py

    net_utils.initialize_gpu_from_weights_file(
        model, weights_file, gpu_id=gpu_id,
    )

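Let's understand initialize_gpu_from_weights_file from the docstring on its definition in net.py!
It initializes the network from the weights file, with its ops placed on the specified GPU.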

net.py


def initialize_gpu_from_weights_file(model, weights_file, gpu_id=0):
    """Initialize a network with ops on a specific GPU.
    If you use CUDA_VISIBLE_DEVICES to target specific GPUs, Caffe2 will
    automatically map logical GPU ids (starting from 0) to the physical GPUs
    specified in CUDA_VISIBLE_DEVICES.
    """
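
The mapping from logical to physical GPU ids described in the docstring can be illustrated with a small sketch. logical_to_physical is a hypothetical helper written for this article, not part of Caffe2 or Detectron, and it assumes CUDA_VISIBLE_DEVICES contains only integer indices (the variable can also hold device UUIDs, which this sketch does not handle).

```python
def logical_to_physical(gpu_id, visible_devices):
    """Return the physical GPU index behind a logical gpu_id.

    visible_devices is the value of CUDA_VISIBLE_DEVICES, e.g. "2,3".
    Logical ids start from 0 and follow the order of that list.
    """
    physical = [int(d) for d in visible_devices.split(",") if d.strip()]
    return physical[gpu_id]


# With CUDA_VISIBLE_DEVICES="2,3", gpu_id=0 refers to physical GPU 2.
first = logical_to_physical(0, "2,3")
second = logical_to_physical(1, "2,3")
```

This is why gpu_id=0 is a reasonable default: it always means "the first GPU the process is allowed to see".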

Back to test_engine once more.

test_engine.py

    model_builder.add_inference_inputs(model)

This calls add_inference_inputs, a function in detectron/modeling/model_builder. Its argument is model, an instance of the DetectionModelHelper class.

The definition and docstring of add_inference_inputs are as follows:

model_builder.py

def add_inference_inputs(model):
    """Create network input blobs used for inference."""

    def create_input_blobs_for_net(net_def):
        for op in net_def.op:
            for blob_in in op.input:
                if not workspace.HasBlob(blob_in):
                    workspace.CreateBlob(blob_in)

    create_input_blobs_for_net(model.net.Proto())
    if cfg.MODEL.MASK_ON:
        create_input_blobs_for_net(model.mask_net.Proto())
    if cfg.MODEL.KEYPOINTS_ON:
        create_input_blobs_for_net(model.keypoint_net.Proto())

As the docstring says, this creates the network input blobs used for inference.
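
To see what the loop in create_input_blobs_for_net does without needing Caffe2, here is a minimal sketch with a stand-in workspace. FakeWorkspace and the dict-based ops are hypothetical and exist only for this illustration. The real code uses caffe2's workspace.HasBlob and workspace.CreateBlob on a NetDef.

```python
class FakeWorkspace(object):
    """Hypothetical stand-in for the Caffe2 workspace: a name -> blob dict."""

    def __init__(self):
        self.blobs = {}

    def has_blob(self, name):
        return name in self.blobs

    def create_blob(self, name):
        # A new blob starts out empty; here None marks "no data yet".
        self.blobs[name] = None


def create_input_blobs_for_net(ws, ops):
    """Create every blob that some op reads and that does not exist yet."""
    for op in ops:
        for blob_in in op["input"]:
            if not ws.has_blob(blob_in):
                ws.create_blob(blob_in)


ws = FakeWorkspace()
ws.create_blob("conv1_w")  # pretend the weights were already loaded
ops = [
    {"input": ["data", "conv1_w"], "output": ["conv1"]},
    {"input": ["conv1"], "output": ["relu1"]},
]
create_input_blobs_for_net(ws, ops)
created = sorted(ws.blobs)
```

Note that the loop creates every missing blob that appears as an op input, including intermediate ones such as conv1 here, while blobs that already exist (the loaded weights) are left alone.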

test_engine.py


    workspace.CreateNet(model.net)
    workspace.CreateNet(model.conv_body_net)
    if cfg.MODEL.MASK_ON:
        workspace.CreateNet(model.mask_net)
    if cfg.MODEL.KEYPOINTS_ON:
        workspace.CreateNet(model.keypoint_net)
    return model

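CreateNet is a Caffe2 method.
It is explained on the official Caffe2 site.

CreateNet is a method of the Workspace class.

Workspace class:

A class that holds everything related to the objects created during runtime:
・all blobs
・all instantiated networks (it owns all of these objects and handles the "scaffolding logistics" (?))

CreateNet method:

・Running requires a net.
・As I read the documentation, the NetDef created from the dataset and the pre-trained model, that is, the protobuf data describing the net and the model, must be instantiated as prototype objects that follow Caffe2's protobuf spec.
・As I understand it, CreateNet returns an empty net when no blobs are passed.

In the end, I learned that model becomes an instance of the DetectionModelHelper class and that various values and flags are initialized along the way.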
