A Complete Guide to Hosting a Locally Trained Model on SageMaker - From Building a Docker Image to Deploying via ECR

Posted at 2025-06-18

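Overview

This article walks through hosting a machine learning model trained in a local environment on Amazon SageMaker. It covers the practical flow from building an inference Docker image, through uploading it to ECR, to deploying a SageMaker endpoint, with concrete code examples.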

Introduction

After training a machine learning model locally, moving it to production on Amazon SageMaker gives you a scalable, manageable inference environment.

SageMaker can host a model in a custom Docker container. The AWS documentation ( https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html ) describes how SageMaker interacts with a Docker container that runs your own inference code for hosting services.

This article covers:

Building an inference Docker image
Uploading the image securely to Amazon ECR
Creating and deploying a SageMaker endpoint
Running and testing inference

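Prerequisites and environment setup

Required tools

AWS CLI v2 (with credentials configured)
Docker Desktop (or Docker Engine)
Python 3.8 or later
A trained machine learning model (for example, scikit-learn or PyTorch)

Setting up the IAM role

SageMaker needs an IAM role with suitable permissions to pull the image from ECR and create the endpoint: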

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage",
        "ecr:BatchCheckLayerAvailability"
      ],
      "Resource": "*"
    }
  ]
}
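
The policy above only covers pulling the image. The role also has to be assumable by the SageMaker service, and in practice it usually needs CloudWatch Logs permissions as well so that container logs are visible. As a minimal sketch of the trust policy, the following prints a document that could be passed to the IAM CLI or API when creating the role. The helper name trust_policy_json is hypothetical.

```python
import json

# Trust policy that lets the SageMaker service assume the execution role.
# "sagemaker.amazonaws.com" is the SageMaker service principal.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "sagemaker.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}


def trust_policy_json() -> str:
    """Serialize the trust policy so it can be saved to a file or passed to IAM."""
    return json.dumps(TRUST_POLICY, indent=2)


print(trust_policy_json())
```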

Creating the ECR repository

First, create an ECR repository to store the Docker image:

aws ecr create-repository --repository-name my-sagemaker-model --region us-east-1

Building the inference Docker image

Directory layout

Arrange the inference files as follows:

my-inference-container/
├── Dockerfile
├── requirements.txt
├── code/
│   └── serve.py
└── model/
    └── model.pkl  # trained model file

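Writing the inference code (serve.py)

Create the inference code as the HTTP server that SageMaker expects: it must answer GET /ping and POST /invocations on port 8080.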

import logging
import pickle

import numpy as np
from flask import Flask, request, jsonify

# Logging setup
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

app = Flask(__name__)

# Location of the model inside the container (copied there by the Dockerfile)
model_path = '/opt/ml/model/model.pkl'
model = None

def load_model():
    """Load the pickled model into the module-level variable."""
    global model
    try:
        with open(model_path, 'rb') as f:
            model = pickle.load(f)
        logger.info("Model loaded successfully")
    except Exception as e:
        logger.error(f"Error loading model: {e}")
        raise

@app.route('/ping', methods=['GET'])
def ping():
    """Health check endpoint; returns HTTP 200 once the container is ready."""
    return jsonify({'status': 'healthy'})

@app.route('/invocations', methods=['POST'])
def invocations():
    """Inference endpoint."""
    # Read the request body; silent=True returns None instead of raising on bad JSON
    input_data = request.get_json(silent=True)
    if not isinstance(input_data, dict) or 'features' not in input_data:
        return jsonify({'error': "JSON body with a 'features' key is required"}), 400

    try:
        # Preprocess: shape the features into a single-row 2D array
        features = np.array(input_data['features']).reshape(1, -1)

        # Run inference
        prediction = model.predict(features)

        # Return the result
        return jsonify({
            'prediction': prediction.tolist()[0],
            'status': 'success'
        })

    except Exception as e:
        logger.error(f"Prediction error: {e}")
        return jsonify({'error': str(e)}), 500

if __name__ == '__main__':
    load_model()
    app.run(host='0.0.0.0', port=8080)

Writing the Dockerfile

Create a Dockerfile that builds the inference environment:

FROM python:3.9-slim

# Set the working directory
WORKDIR /opt/ml

# Install the required packages
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the model and inference code
COPY model/ /opt/ml/model/
COPY code/ /opt/ml/code/

# Set environment variables
ENV PYTHONPATH=/opt/ml/code

# Create the directories SageMaker expects
RUN mkdir -p /opt/ml/input /opt/ml/output

# Expose the port
EXPOSE 8080

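# Start the inference server.
# SageMaker launches the container as `docker run <image> serve`; that argument
# would replace a CMD, so ENTRYPOINT is used here. serve.py ignores the extra "serve".
ENTRYPOINT ["python", "/opt/ml/code/serve.py"]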

Configuring requirements.txt

flask==2.3.2
scikit-learn==1.3.0
numpy==1.24.3

Building the image locally

cd my-inference-container
docker build -t my-sagemaker-model:latest .
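
Before pushing, it can help to run the container locally (for example with docker run --rm -p 8080:8080 my-sagemaker-model:latest) and probe the two routes that serve.py exposes. The following is a rough sketch using only the standard library. The function names are hypothetical, and the base URL assumes the port mapping shown in the docker run example.

```python
import json
import urllib.request


def ping_request(base_url: str) -> urllib.request.Request:
    """Build the GET /ping health-check request."""
    return urllib.request.Request(f"{base_url}/ping", method="GET")


def invocation_request(base_url: str, features: list) -> urllib.request.Request:
    """Build the POST /invocations request with a JSON body."""
    body = json.dumps({"features": features}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/invocations",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def smoke_test(base_url: str = "http://localhost:8080") -> dict:
    """Call both routes and return the parsed prediction response."""
    # urlopen raises urllib.error.HTTPError for non-2xx responses.
    with urllib.request.urlopen(ping_request(base_url), timeout=5) as resp:
        if resp.status != 200:
            raise RuntimeError(f"/ping returned {resp.status}")
    sample = [5.1, 3.5, 1.4, 0.2]  # same sample features used in the test section
    with urllib.request.urlopen(invocation_request(base_url, sample), timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the container running, calling print(smoke_test()) from a Python shell should show the JSON returned by /invocations.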

Uploading the image to ECR

Next, upload the Docker image to ECR. The AWS documentation ( https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html ) explains that you authenticate Docker to an Amazon ECR registry by running the aws ecr get-login-password command.

Logging in to ECR

# Log in to ECR
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin <account-id>.dkr.ecr.us-east-1.amazonaws.com

Tagging the image

Tag the image so that it matches the ECR repository:

# Tag the image for ECR
docker tag my-sagemaker-model:latest <account-id>.dkr.ecr.us-east-1.amazonaws.com/my-sagemaker-model:latest
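
The same image URI appears in the docker tag, docker push, and create_model steps, so a typo in any one of them is easy to make. As an illustrative sketch, a small helper can build the URI once. The name ecr_image_uri is hypothetical, and the format assumes a standard commercial AWS region.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Return <account-id>.dkr.ecr.<region>.amazonaws.com/<repository>:<tag>."""
    if not (account_id.isdigit() and len(account_id) == 12):
        raise ValueError("AWS account IDs are 12-digit numbers")
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# 123456789012 is a placeholder account ID, not a real one.
print(ecr_image_uri("123456789012", "us-east-1", "my-sagemaker-model"))
```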

Pushing to ECR

# Push the image to ECR
docker push <account-id>.dkr.ecr.us-east-1.amazonaws.com/my-sagemaker-model:latest

Creating the SageMaker model and endpoint

Creating the SageMaker model

Create the SageMaker model with boto3:

import boto3

sagemaker = boto3.client('sagemaker', region_name='us-east-1')

# Create the model
model_name = 'my-custom-model'
image_uri = '<account-id>.dkr.ecr.us-east-1.amazonaws.com/my-sagemaker-model:latest'
role_arn = 'arn:aws:iam::<account-id>:role/SageMakerExecutionRole'

response = sagemaker.create_model(
    ModelName=model_name,
    PrimaryContainer={
        'Image': image_uri,
        'Mode': 'SingleModel'
    },
    ExecutionRoleArn=role_arn
)

print(f"Model created: {response['ModelArn']}")

Creating the endpoint configuration

# Create the endpoint configuration
endpoint_config_name = 'my-model-endpoint-config'

response = sagemaker.create_endpoint_config(
    EndpointConfigName=endpoint_config_name,
    ProductionVariants=[
        {
            'VariantName': 'primary',
            'ModelName': model_name,
            'InitialInstanceCount': 1,
            'InstanceType': 'ml.t2.medium',
            'InitialVariantWeight': 1.0
        }
    ]
)

Deploying the endpoint

# Create the endpoint
endpoint_name = 'my-model-endpoint'

response = sagemaker.create_endpoint(
    EndpointName=endpoint_name,
    EndpointConfigName=endpoint_config_name
)

print(f"Endpoint creation started: {response['EndpointArn']}")
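
create_endpoint returns immediately while provisioning continues in the background, typically for several minutes. One way to block until the endpoint is usable is to poll describe_endpoint, as in the sketch below. The function name, poll interval, and retry count are assumptions chosen for illustration. boto3 also ships a built-in waiter, sagemaker.get_waiter('endpoint_in_service'), which serves the same purpose.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, max_polls=60, sleep=time.sleep):
    """Poll describe_endpoint until the endpoint is InService, fails, or polling runs out."""
    for _ in range(max_polls):
        desc = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = desc["EndpointStatus"]
        if status == "InService":
            return desc
        if status == "Failed":
            reason = desc.get("FailureReason", "unknown reason")
            raise RuntimeError(f"Endpoint {endpoint_name} failed: {reason}")
        # Any other status (for example Creating) means provisioning is still in progress.
        sleep(poll_seconds)
    raise TimeoutError(f"{endpoint_name} was not InService after {max_polls} polls")
```

With the client and names defined above, the call would be wait_until_in_service(sagemaker, endpoint_name).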

Running and testing inference

Once the endpoint deployment has finished (its status is InService), run an inference request to test it:

import boto3
import json

# Create the SageMaker Runtime client
runtime = boto3.client('sagemaker-runtime', region_name='us-east-1')

# Name of the endpoint created in the previous step
endpoint_name = 'my-model-endpoint'

# Prepare the test data
test_data = {
    'features': [5.1, 3.5, 1.4, 0.2]  # sample features
}

# Run inference
response = runtime.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType='application/json',
    Body=json.dumps(test_data)
)

# Read the result
result = json.loads(response['Body'].read().decode())
print(f"Prediction result: {result}")

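Notes and best practices

Security considerations

Keep the ECR repository private and avoid exposing it unnecessarily
Configure IAM roles according to the principle of least privilege
Implement input validation in the inference code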
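
As a sketch of what the input validation could look like for the 'features' payload used in this article, the function below checks the shape and type of the input before it reaches the model. The function name and the expected length of 4 are assumptions that match the sample request. Adjust them to the model's real feature count.

```python
import math


def validate_features(payload, expected_length=4):
    """Return the features as a list of floats, or raise ValueError with a client-safe message."""
    if not isinstance(payload, dict) or "features" not in payload:
        raise ValueError("request body must be a JSON object with a 'features' key")
    features = payload["features"]
    if not isinstance(features, list) or len(features) != expected_length:
        raise ValueError(f"'features' must be a list of {expected_length} numbers")
    cleaned = []
    for value in features:
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise ValueError("'features' must contain only numbers")
        try:
            number = float(value)
        except OverflowError:
            raise ValueError("'features' contains a number that is too large") from None
        if not math.isfinite(number):
            raise ValueError("'features' must not contain NaN or infinity")
        cleaned.append(number)
    return cleaned
```

In the /invocations handler, a ValueError raised here would map naturally to an HTTP 400 response, keeping HTTP 500 for genuine server-side failures.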

Cost optimization

Choose an instance type that fits your inference volume
Use Auto Scaling to balance cost and performance
Delete endpoints promptly once they are no longer needed
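
Cleaning up could look like the following sketch, which removes the three resources created in this article. The function name is hypothetical, and sm_client is assumed to be a boto3 SageMaker client like the one created earlier.

```python
def delete_hosting_resources(sm_client, endpoint_name, endpoint_config_name, model_name):
    """Delete the endpoint, then its configuration and the model."""
    # The endpoint is the resource that keeps instances running, so remove it first.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=endpoint_config_name)
    sm_client.delete_model(ModelName=model_name)
```

With the names used above, the call would be delete_hosting_resources(sagemaker, 'my-model-endpoint', 'my-model-endpoint-config', 'my-custom-model'). The ECR repository and its images are separate resources and are not touched by this sketch.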

Reference price: an ml.t2.medium instance costs about $0.056 per hour (as of 2024; check the current SageMaker pricing for your region).
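
To put the hourly figure in perspective, here is a back-of-the-envelope estimate for an endpoint that runs continuously. It assumes a 30-day month (720 hours) and the reference rate quoted above. The function name is hypothetical, and actual charges depend on region and current pricing.

```python
def monthly_cost(hourly_rate_usd: float, instance_count: int = 1, hours: float = 24 * 30) -> float:
    """Estimate the cost of keeping instances running for the given number of hours."""
    return round(hourly_rate_usd * instance_count * hours, 2)


# One ml.t2.medium at the reference rate, left running for a 30-day month.
print(monthly_cost(0.056))
```

At that rate a single always-on instance comes to roughly 40 USD per month, which is why idle endpoints are worth deleting.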

The importance of monitoring

Monitor endpoint health with CloudWatch metrics
Use logs to catch errors and performance problems early
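
As a starting point for monitoring, SageMaker publishes endpoint metrics such as Invocations and ModelLatency to the AWS/SageMaker CloudWatch namespace. The sketch below only builds the query parameters; the function name, the 5-minute period, and the Sum statistic are illustrative assumptions. The resulting dictionary is meant to be passed to the CloudWatch client's get_metric_statistics call.

```python
from datetime import datetime, timedelta, timezone


def invocation_metric_query(endpoint_name, variant_name="primary", hours=1, metric_name="Invocations"):
    """Build keyword arguments for CloudWatch get_metric_statistics."""
    end = datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/SageMaker",
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "EndpointName", "Value": endpoint_name},
            {"Name": "VariantName", "Value": variant_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,  # seconds per datapoint
        "Statistics": ["Sum"],
    }
```

For example, boto3.client('cloudwatch', region_name='us-east-1').get_metric_statistics(**invocation_metric_query('my-model-endpoint')) would return the invocation counts for the last hour in 5-minute buckets.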

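Conclusion

This article walked through the complete procedure for hosting a locally trained model on SageMaker. Using a Docker container lets you build a scalable inference environment while keeping the local and production environments consistent.

As next steps, consider the following more advanced topics:

Hosting multiple models together with a Multi-Model Endpoint
Processing large volumes of data efficiently with batch inference
Rolling out model updates gradually with the A/B testing feature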

References

"Custom Inference Code with Hosting Services", AWS Documentation, https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html
"push a Docker image to an Amazon ECR repository", AWS Documentation, https://docs.aws.amazon.com/AmazonECR/latest/userguide/docker-push-ecr-image.html
"Adapt your own inference container for Amazon SageMaker AI", AWS Documentation, https://docs.aws.amazon.com/sagemaker/latest/dg/adapt-inference-container.html
"How to Build Custom Docker Images For AWS SageMaker", Saturn Cloud Blog, April 2, 2024, https://saturncloud.io/blog/how-to-build-custom-docker-images-for-aws-sagemaker/
"Create a SageMaker inference endpoint with custom model & extended container", AWS Machine Learning Blog, January 27, 2025, https://aws.amazon.com/blogs/machine-learning/create-a-sagemaker-inference-endpoint-with-custom-model-extended-container/
