
Running Detectron2 (object detection) on Lambda and measuring its execution speed

Last updated at 2021-01-08, posted at 2021-01-07

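Overview

Model: ResNet-50 + Faster R-CNN
Lambda memory size: 8192 MB

Conclusion first:
About 9 seconds with a cold start.
Without a cold start, just 3 seconds!

This looks like a handy way to put together a simple API.

(These are just rough notes.)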

Files

Dockerfile

FROM nvidia/cuda:10.1-cudnn7-devel

ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get update && apt-get install -y \
	python3-opencv ca-certificates python3-dev git wget sudo  \
	cmake ninja-build && \
  rm -rf /var/lib/apt/lists/*
RUN ln -sv /usr/bin/python3 /usr/bin/python

WORKDIR /home

ENV PATH="/home/.local/bin:${PATH}"
RUN wget https://bootstrap.pypa.io/get-pip.py && \
	python3 get-pip.py && \
	rm get-pip.py

# install dependencies
# See https://pytorch.org/ for other options if you use a different version of CUDA
RUN pip install tensorboard
RUN pip install torch==1.7 torchvision==0.8.1 -f https://download.pytorch.org/whl/cu101/torch_stable.html

RUN pip install 'git+https://github.com/facebookresearch/fvcore'
# install detectron2
RUN git clone https://github.com/facebookresearch/detectron2 detectron2_repo
# With FORCE_CUDA=0 and no GPU visible during `docker build`, detectron2 is built CPU-only,
# which is all Lambda needs because it has no GPU.
ENV FORCE_CUDA="0"
# This will by default build detectron2 for all common cuda architectures and take a lot more time,
# because inside `docker build`, there is no way to tell which architecture will be used.
ARG TORCH_CUDA_ARCH_LIST="Kepler;Kepler+Tesla;Maxwell;Maxwell+Tegra;Pascal;Volta;Turing"
ENV TORCH_CUDA_ARCH_LIST="${TORCH_CUDA_ARCH_LIST}"

RUN pip install -e detectron2_repo

# Install the AWS Lambda Runtime Interface Client
RUN pip install awslambdaric

# Optional: add the AWS Lambda Runtime Interface Emulator (aws-lambda-rie) for local testing
ADD https://github.com/aws/aws-lambda-runtime-interface-emulator/releases/latest/download/aws-lambda-rie /usr/bin/aws-lambda-rie
RUN chmod 755 /usr/bin/aws-lambda-rie

# Set a fixed model cache directory.
ENV FVCORE_CACHE="/tmp"
ARG APP_DIR="/home/app/"
WORKDIR ${APP_DIR}
COPY app ${APP_DIR}

ENTRYPOINT [ "/bin/bash", "entry.sh" ]
# Lambda calls the handler as handler(event, context), so point at icon_predict.handler
CMD [ "icon_predict.handler" ]

entry.sh

#!/bin/sh
# Outside Lambda (AWS_LAMBDA_RUNTIME_API unset), run under the emulator for local testing.
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
    exec /usr/bin/aws-lambda-rie /usr/bin/python -m awslambdaric "$1"
else
    exec /usr/bin/python -m awslambdaric "$1"
fi
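
When `AWS_LAMBDA_RUNTIME_API` is unset, entry.sh starts the handler under aws-lambda-rie, so the function can be invoked over HTTP on the local machine. The following is a minimal sketch of doing that from Python. It assumes the container was started with something like `docker run -p 9000:8080 <image>` (the port mapping is an assumption, not from the original notes), and the function names `build_invoke_request` and `invoke_local` are hypothetical helpers.

```python
import json
import urllib.request

# Invocation path exposed by the AWS Lambda Runtime Interface Emulator.
RIE_PATH = "/2015-03-31/functions/function/invocations"


def build_invoke_request(event, host="localhost", port=9000):
    """Build a POST request carrying `event` as the JSON payload.

    host/port are assumptions matching `docker run -p 9000:8080 <image>`.
    """
    url = f"http://{host}:{port}{RIE_PATH}"
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def invoke_local(event, timeout=60):
    """Send the event to the local emulator and return the decoded JSON response."""
    request = build_invoke_request(event)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `invoke_local({})` should return whatever the handler returns (here, the list of boxes).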

icon_predict.py

import os
import sys
import numpy as np
import cv2
import torch
import json
import boto3
from torchvision import transforms, utils

from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer, ColorMode
from detectron2.data import MetadataCatalog, DatasetCatalog
from detectron2.data.datasets import register_coco_instances

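# Download the model from S3 (skipped if it is already present locally)
S3_BUCKET = "[bucket name]"
s3 = boto3.resource('s3')
model_path = "[local model path]"  # on Lambda, only /tmp is writable
s3_key = "[S3 key of the model]"
if not os.path.exists(model_path):
    s3.Bucket(S3_BUCKET).download_file(s3_key, model_path)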

def gen_predictor(model_path):
    """Set up the model in advance and return a predictor."""
    device = os.getenv("DEVICE", "cpu")
    cfg = get_cfg()
    cfg.MODEL.DEVICE = device
    cfg.merge_from_file(model_zoo.get_config_file("PascalVOC-Detection/faster_rcnn_R_50_FPN.yaml"))

    cfg.MODEL.WEIGHTS = model_path  # model weights downloaded from S3
    cfg.SOLVER.IMS_PER_BATCH = 1
    cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 128  # faster, and good enough for this toy dataset (default: 512)
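    cfg.MODEL.ROI_HEADS.NUM_CLASSES = 6  # number of classes in the trained model
    cfg.MODEL.ROI_HEADS.NMS_THRESH_TEST = 0.2  # NMS IoU threshold at inference (ROI_HEADS.NMS is not a Detectron2 config key)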
    cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.8   # set the testing threshold for this model
    predictor = DefaultPredictor(cfg)
    return predictor

def main():
    img_path = "image.png"

    img = cv2.imread(img_path)
    predictor = gen_predictor(model_path)
    bigicon_predictions = predictor(img)["instances"].to("cpu")
    boxes = bigicon_predictions.pred_boxes.tensor.tolist()
    scores = bigicon_predictions.scores.tolist()
    print("boxes :", boxes)
    print("scores :", scores)
    print("-- Finished!! --")
    return boxes

def handler(event, context):
    return main()

if __name__ == "__main__":
    main()
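
As written, main() rebuilds the predictor on every invocation. Lambda keeps module-level state alive between warm invocations, so one possible variation is to cache the predictor and pay the build cost only once per execution environment. The following is a minimal sketch of that idea, not the code that was measured here. `build_predictor` is a hypothetical stand-in for `gen_predictor` so the example runs without Detectron2, and the model path is a placeholder.

```python
import functools

BUILD_COUNT = 0  # counts how many times the expensive build actually runs


def build_predictor(model_path):
    """Stand-in for gen_predictor(model_path); hypothetical, for illustration only."""
    global BUILD_COUNT
    BUILD_COUNT += 1
    # The real code would return DefaultPredictor(cfg) here.
    return lambda img: {"model": model_path, "input": img}


@functools.lru_cache(maxsize=1)
def get_predictor(model_path):
    """Build the predictor on first use and reuse it on later calls."""
    return build_predictor(model_path)


def handler(event, context):
    predictor = get_predictor("/tmp/model.pth")  # placeholder path
    return predictor(event.get("image", "image.png"))


# Two invocations in the same process build the predictor only once.
first = handler({}, None)
second = handler({"image": "other.png"}, None)
print(BUILD_COUNT)  # prints 1
```

Whether this changes the warm-start time in practice would need to be measured; the timings reported in this post were taken with the code as listed above.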

Results

A test invocation takes about 3 seconds.
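
For reference, here is a minimal sketch of how the difference between the first (cold) call and later (warm) calls could be timed. `fake_handler` and its sleep durations are made up to simulate a one-time model load; they are not the measurements reported in this post.

```python
import time

_model = None  # module-level state survives between calls, as on a warm Lambda


def fake_handler(event, context):
    """Hypothetical handler: slow one-time setup, then fast inference."""
    global _model
    if _model is None:
        time.sleep(0.2)  # simulated model load (first call only)
        _model = object()
    time.sleep(0.01)  # simulated inference
    return []


def time_calls(fn, n):
    """Return the wall-clock duration in seconds of each of n calls."""
    durations = []
    for _ in range(n):
        start = time.perf_counter()
        fn({}, None)
        durations.append(time.perf_counter() - start)
    return durations


durations = time_calls(fake_handler, 3)
print([round(d, 2) for d in durations])
```

The first duration includes the simulated load, while the later ones cover only the simulated inference.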

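You want to turn a model into an API for now.
You want to run it serverless.
SageMaker feels like too much hassle...
You don't want the cost to get too high...

For people like that, this seems like reasonably usable performance.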
