Speeding Up SageMaker Inference with aiobotocore

Posted at 2018-12-26

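This is the final-day entry of the aptpod Advent Calendar 2018.

(The slot was left wide open because my shy colleagues kept politely yielding it to each other, so I am quietly filling it in.)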

Overview

I tried to improve throughput when calling a SageMaker inference endpoint from a Python script by using aio-libs/aiobotocore to make the HTTP communication asynchronous.
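To make the idea concrete before touching AWS, here is a minimal sketch of why awaiting many I/O-bound requests together finishes sooner than awaiting them one by one. `fake_request`, `run_sequentially`, `run_concurrently`, and `timed` are hypothetical names for this illustration, and `asyncio.sleep` merely stands in for network latency; nothing here calls SageMaker.

```python
import asyncio
import time


async def fake_request(i, latency=0.05):
    """Stand-in for one HTTP round trip (no real network access)."""
    await asyncio.sleep(latency)
    return i


async def run_sequentially(n):
    # Each request starts only after the previous one has finished.
    return [await fake_request(i) for i in range(n)]


async def run_concurrently(n):
    # All requests are in flight at once; gather keeps the input order.
    return await asyncio.gather(*(fake_request(i) for i in range(n)))


def timed(coro):
    """Run a coroutine to completion and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start
```

With simulated latency the concurrent version takes roughly one round trip instead of `n` of them. Against a real endpoint the gain is smaller, because the server itself has finite capacity.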

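Verification

Creating the SageMaker inference endpoint

If you follow the official "Amazon SageMaker 開発者用リソース – アマゾンウェブサービス (AWS)" (Amazon SageMaker developer resources) page or Classmethod's DevelopersIO article "SageMakerで「うまい棒検出モデル」を作ってみた" ("I built an Umaibo detection model with SageMaker"), you should get as far as creating an endpoint without trouble, so I skip that part in this article.

For this test I use an endpoint that serves a "Kinoko / Takenoko detection model", built by copying the "Umaibo detection model" step by step, on a single ml.c5.large + ml.eia1.medium instance.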

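Preparing the images for inference

Because the difference is easier to see with a reasonable number of images, I prepare them by shooting a video with a smartphone or similar and then using ffmpeg or a similar tool to cut out and save each frame as an image.

In my environment I prepared 588 images from a video of about 20 seconds at 30 fps.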
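The frame extraction step can be sketched as follows. This is only one way to do it, not necessarily the command I ran: `build_ffmpeg_command`, `extract_frames`, the `movie.mp4` input name, and the `frame_%04d.jpg` naming pattern are all assumptions for illustration. It relies on ffmpeg's standard behavior of writing an image sequence when the output path contains a `%04d` pattern, and it needs `ffmpeg` on the PATH.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(video_path, out_dir="orig", pattern="frame_%04d.jpg"):
    """Build an ffmpeg command that writes the video's frames as numbered JPEGs."""
    return ["ffmpeg", "-i", str(video_path), str(Path(out_dir) / pattern)]


def extract_frames(video_path, out_dir="orig"):
    """Create out_dir if needed, then run ffmpeg (raises if ffmpeg fails)."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(video_path, out_dir), check=True)
```

Calling `extract_frames("movie.mp4")` would fill `orig/` with `frame_0001.jpg`, `frame_0002.jpg`, and so on, which is the directory the scripts in this article read from.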

Writing the Python scripts

Using the official boto3

use_boto3.py looks like this:


import os, json, boto3

sagemaker_client = boto3.client('sagemaker-runtime')

ORIG_FRAME_DIR = 'orig/'
SAGEMAKER_ENDPOINT = 'kinoko-takenoko'

class SageMakerException(Exception):
    pass

def main():
    orig_file_list = sorted(os.listdir(ORIG_FRAME_DIR))

    prediction_dict = {}
    for idx, orig_file in enumerate(orig_file_list):
        # Read one JPEG frame as raw bytes
        with open(os.path.join(ORIG_FRAME_DIR, orig_file), 'rb') as image:
            body = image.read()

        # Blocking call: the next image is not sent until this response arrives
        response = sagemaker_client.invoke_endpoint(
            EndpointName=SAGEMAKER_ENDPOINT,
            Body=body,
            ContentType='image/jpeg',
            Accept='application/json'
        )
        if response['ResponseMetadata']['HTTPStatusCode'] != 200:
            raise SageMakerException()

        # The response body is a stream containing JSON with a 'prediction' key
        predictions = json.load(response['Body'])['prediction']
        prediction_dict[idx] = predictions

    print("finished!")

if __name__ == "__main__":
    main()

Using aiobotocore

use_aiobotocore.py looks like this:

import os, json, asyncio, aiobotocore

ORIG_FRAME_DIR = 'orig/'
SAGEMAKER_ENDPOINT = 'kinoko-takenoko'

class SageMakerException(Exception):
    pass

def main():
    orig_file_list = sorted(os.listdir(ORIG_FRAME_DIR))

    loop = asyncio.get_event_loop()
    prediction_dict = loop.run_until_complete(inference_using_gather(loop, orig_file_list))
    print("finished!")

async def invoke_endpoint(client, body):
    response = await client.invoke_endpoint(
        EndpointName=SAGEMAKER_ENDPOINT,
        Body=body,
        ContentType='image/jpeg',
        Accept='application/json'
    )
    return response

async def inference_using_gather(loop, orig_file_list):
    # Session API as of the pinned aiobotocore 0.10.0; newer releases may differ
    session = aiobotocore.get_session(loop=loop)
    async with session.create_client(
            'sagemaker-runtime',
            region_name='ap-northeast-1',
        ) as sagemaker_client:

        cors = []

        # Create one coroutine per image; nothing is sent until gather runs them
        for orig_file in orig_file_list:
            with open(os.path.join(ORIG_FRAME_DIR, orig_file), 'rb') as image:
                body = image.read()
            cors.append(invoke_endpoint(sagemaker_client, body))

        # Run all requests concurrently; results come back in input order
        responses = await asyncio.gather(*cors)
        prediction_dict = {}
        for idx, r in enumerate(responses):
            if r['ResponseMetadata']['HTTPStatusCode'] != 200:
                raise SageMakerException()

            # Here the body is an async stream, so read() must be awaited
            body = await r['Body'].read()
            predictions = json.loads(body.decode('utf8'))['prediction']
            prediction_dict[idx] = predictions

        return prediction_dict

if __name__ == "__main__":
    main()
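The script above hands all 588 requests to `asyncio.gather` at once. If that ever turns out to be too aggressive for an endpoint, one possible variation (not what I measured in this article) is to cap the number of in-flight requests with `asyncio.Semaphore`. The sketch below is self-contained and does not call AWS: `gather_with_limit` and `InFlightCounter` are hypothetical helpers, and `fake_invoke` stands in for `invoke_endpoint`.

```python
import asyncio


async def gather_with_limit(coro_factories, limit):
    """Await every factory's coroutine with at most `limit` running at once.

    Results are returned in the same order as `coro_factories`.
    """
    semaphore = asyncio.Semaphore(limit)

    async def run(factory):
        # The coroutine is created only after a slot is free.
        async with semaphore:
            return await factory()

    return await asyncio.gather(*(run(f) for f in coro_factories))


class InFlightCounter:
    """Fake endpoint that records the peak number of concurrent calls."""

    def __init__(self):
        self.current = 0
        self.peak = 0

    async def fake_invoke(self, idx):
        self.current += 1
        self.peak = max(self.peak, self.current)
        await asyncio.sleep(0.01)  # simulated network latency
        self.current -= 1
        return idx
```

In the real script each factory would be something like `lambda: invoke_endpoint(sagemaker_client, body)`. Passing zero-argument callables rather than ready-made coroutines also means an image only needs to be read when its turn comes, if the file read is moved inside the factory.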

Results

Install the required libraries (because of aiobotocore's dependency constraints, boto3 is a slightly older version).

$ pipenv install boto3==1.9.49 aiobotocore==0.10.0

$ time python use_boto3.py
finished!
python use_boto3.py  5.33s user 0.76s system 4% cpu 2:28.25 total

$ time python use_aiobotocore.py
finished!
python use_aiobotocore.py  4.28s user 1.10s system 8% cpu 1:06.30 total

Both scripts ran without problems, and with 588 images the throughput rose from about 4.0 fps to about 8.9 fps.
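For reference, the fps figures follow directly from the 588 images and the `total` column of the `time` output. The helper names `parse_elapsed` and `throughput_fps` are made up for this small check.

```python
def parse_elapsed(total):
    """Convert a `time` total such as '2:28.25' (min:sec) into seconds."""
    seconds = 0.0
    for part in total.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def throughput_fps(num_images, total):
    """Images processed per second of wall-clock time."""
    return num_images / parse_elapsed(total)


NUM_IMAGES = 588
boto3_fps = throughput_fps(NUM_IMAGES, "2:28.25")
aio_fps = throughput_fps(NUM_IMAGES, "1:06.30")
print(f"{boto3_fps:.1f} fps -> {aio_fps:.1f} fps ({aio_fps / boto3_fps:.1f}x)")
# prints "4.0 fps -> 8.9 fps (2.2x)"
```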

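(Apologies that this is hard to read.) In the CPU utilization metrics for the endpoint, the peak near the 18.2 line is the CPU utilization while running use_boto3.py, and the peak near the 36.4 line is the CPU utilization while running use_aiobotocore.py, so I could also confirm that asynchronous processing makes more efficient use of the server side.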

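Summary

By using asynchronous processing, I was able to improve the throughput of inference with SageMaker.

(This is not a tip that applies to every use case, but) I hope it is useful as a reference!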
