Building a serverless inference API with AWS SAM and Hugging Face Transformers

Posted at 2023-02-09 (last updated at 2023-02-09)

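What I built

I built a binary text classification model with Hugging Face AutoTrain and deployed it with AWS SAM.
You can POST to it as shown below, and it returns the inference result along with a status message.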

❯ curl -X POST -H "Content-Type: application/json" -d '{"text":"テストだよ"}' https://xxxx.lambda-url.ap-northeast-1.on.aws/
{"label": 0, "message": "success"}

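The standard way to expose a Lambda function as an API is to call it through API Gateway, but when I tried that I hit API Gateway's roughly 30-second integration timeout (29 seconds by default) (*), so this time I used a Lambda Function URL instead.
* Only the first invocation, which loads the model, took about a minute; the inference itself returned within a few seconds.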
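
As a rough Python equivalent of the curl call above, the sketch below builds the same POST request with the standard library and allows a long timeout for the slow first call. The names `build_request` and `classify`, the 120-second timeout, and the placeholder URL are assumptions for illustration, not part of the deployed setup.

```python
import json
import urllib.request

# Placeholder endpoint; replace it with the Function URL printed by `sam deploy`.
FUNCTION_URL = "https://xxxx.lambda-url.ap-northeast-1.on.aws/"


def build_request(url: str, text: str) -> urllib.request.Request:
    """Build a JSON POST request carrying the text to classify."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def classify(url: str, text: str, timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON response.

    The timeout is deliberately generous (an assumed value) because the
    first call also pays for loading the model.
    """
    with urllib.request.urlopen(build_request(url, text), timeout=timeout) as res:
        return json.loads(res.read().decode("utf-8"))
```

Calling `classify(FUNCTION_URL, "テストだよ")` against a deployed endpoint should return the parsed JSON body. Note that `urlopen` raises `urllib.error.HTTPError` for a 500 response, so the error body has to be read from the exception.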

Incidentally, this article describes how to do it with API Gateway, and I found it very helpful.

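How it works (roughly)

The detailed steps come later, but the inference API can be built in the following steps. The only real code I wrote was about 20 lines of inference logic.

Create a model with AutoTrain (a publicly available pre-trained model also works)
Run sam init and tweak the generated template
Write the inference code with Hugging Face Transformers (about 20 lines)
Run sam build && sam deploy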

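Technologies used

I combined the following three services and packages to build a serverless inference API.

Hugging Face AutoTrain

A service similar to GCP's AutoML: you upload data, and it builds, trains, and validates machine learning models on top of pre-trained models
With AutoML you (probably) cannot take the model outside GCP, whereas here you can freely download the model and use it with PyTorch, TensorFlow, and so on
The free plan builds five candidate models, and you can use whichever you like
(I also built a model by hand, but AutoTrain's was more accurate, so I sulked and decided to just use it for everything)

Example of training results (the screenshot is not included in this text)

Hugging Face Transformers

A package that makes a wide range of pre-trained models easy to use from PyTorch, TensorFlow, and so on
Here it is used to run inference with the model built by AutoTrain
(When strong models are this easy to use, the motivation to build your own fades)

Example of text classification with a pre-trained model: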

from transformers import pipeline

classifier = pipeline("text-classification")
classifier("We are very happy to show you the 🤗 Transformers library.")
# [{'label': 'POSITIVE', 'score': 0.9998}]

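AWS SAM (AWS Serverless Application Model)

A framework that can set up Lambda, API Gateway, and related resources in as few as three commands
An extension of AWS CloudFormation that is easier to use for serverless architectures, enabling Infrastructure as Code (IaC) style environment setup

Detailed steps

Create a model with AutoTrain

This part is just clicking through the GUI and uploading training data, so I will skip it. It is genuinely surprising that a reasonably accurate model is ready in a few minutes.

As of February 2023, it supports three kinds of data (images, text, and tabular data); for text, for example, it handles tasks such as classification, Q&A, translation, and summarization. There is also a wide choice of pre-trained models, including the now-standard BERT and its successors.

I tried several and used studio-ousia/luke-japanese-large-lite, which gave the best accuracy.

Run `sam init`

Running sam init asks for the template type, the language and version, and so on. I chose the following: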

1 - AWS Quick Start Templates
1 - Hello World Example
16 - python3.9
2 - Image

When it finishes, various files are created in a directory named after the Project Name (sam-app by default).

❯ sam init

You can preselect a particular runtime or package type when using the `sam init` experience.
Call `sam init --help` to learn more.

Which template source would you like to use?
        1 - AWS Quick Start Templates
        2 - Custom Template Location
Choice: 1

Choose an AWS Quick Start application template
        1 - Hello World Example
        2 - Multi-step workflow
        3 - Serverless API
        4 - Scheduled task
        5 - Standalone function
        6 - Data processing
        7 - Infrastructure event management
        8 - Serverless Connector Hello World Example
        9 - Multi-step workflow with Connectors
        10 - Lambda EFS example
        11 - Machine Learning
Template: 1

Use the most popular runtime and package type? (Python and zip) [y/N]: n

Which runtime would you like to use?
        1 - aot.dotnet7 (provided.al2)
        2 - dotnet6
        3 - dotnet5.0
        4 - dotnetcore3.1
        5 - go1.x
        6 - go (provided.al2)
        7 - graalvm.java11 (provided.al2)
        8 - graalvm.java17 (provided.al2)
        9 - java11
        10 - java8.al2
        11 - java8
        12 - nodejs18.x
        13 - nodejs16.x
        14 - nodejs14.x
        15 - nodejs12.x
        16 - python3.9
        17 - python3.8
        18 - python3.7
        19 - ruby2.7
        20 - rust (provided.al2)
Runtime: 16

What package type would you like to use?
        1 - Zip
        2 - Image
Package type: 2

Based on your selections, the only dependency manager available is pip.
We will proceed copying the template using pip.

Would you like to enable X-Ray tracing on the function(s) in your application?  [y/N]: n

Would you like to enable monitoring using CloudWatch Application Insights?
For more info, please view https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/cloudwatch-application-insights.html [y/N]: y
AppInsights monitoring may incur additional cost. View https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/appinsights-what-is.html#appinsights-pricing for more details

Project name [sam-app]:

Cloning from https://github.com/aws/aws-sam-cli-app-templates (process may take a moment)

    -----------------------
    Generating application:
    -----------------------
    Name: sam-app
    Base Image: amazon/python3.9-base
    Architectures: x86_64
    Dependency Manager: pip
    Output Directory: .

    Next steps can be found in the README file at ./sam-app/README.md


Commands you can use next
=========================
[*] Create pipeline: cd sam-app && sam pipeline init --bootstrap
[*] Validate SAM template: cd sam-app && sam validate
[*] Test Function in the Cloud: cd sam-app && sam sync --stack-name {stack-name} --watch

Tweak the template

As generated, the template does not have enough memory (among other things) for inference, so customize it as follows.

template.yaml

Increase the timeout and memory size, and use a Lambda Function URL instead of API Gateway.

template.yaml

Globals:
  Function:
-    Timeout: 3
-    MemorySize: 128
+    Timeout: 300
+    MemorySize: 5000
~~~
~~~
      Architectures:
        - x86_64
-      Events:
-        HelloWorld:
-          Type: Api # More info about API Event Source: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#api
-          Properties:
-            Path: /hello
-            Method: get
+      FunctionUrlConfig:
+        AuthType: NONE  # Set to AWS_IAM to require authentication
~~~
~~~
Outputs:
  # ServerlessRestApi is an implicit API created out of Events key under Serverless::Function
  # Find out more about other implicit resources you can reference within SAM
  # https://github.com/awslabs/serverless-application-model/blob/master/docs/internals/generated_resources.rst#api
-  HelloWorldApi:
-    Description: API Gateway endpoint URL for Prod stage for Hello World function
-    Value: !Sub "https://${ServerlessRestApi}.execute-api.${AWS::Region}.amazonaws.com/Prod/hello/"
  HelloWorldFunction:
    Description: Hello World Lambda Function ARN
    Value: !GetAtt HelloWorldFunction.Arn
  HelloWorldFunctionIamRole:
    Description: Implicit IAM Role created for Hello World function
    Value: !GetAtt HelloWorldFunctionRole.Arn
+  HelloWorldFunctionUrl:
+    Description: "Function URLs endpoint"
+    Value: !GetAtt HelloWorldFunctionUrl.FunctionUrl
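
The values in this diff have to stay within Lambda's limits. As a quick sanity check, the sketch below validates them in Python. The function name `check_function_settings` is hypothetical, and the limits hard-coded here (timeout up to 900 seconds, memory from 128 to 10,240 MB, `NONE` or `AWS_IAM` as the Function URL auth type) are my understanding of the documented limits, so verify them against the current AWS documentation.

```python
# Assumed limits; check the current AWS Lambda documentation.
MAX_TIMEOUT_SECONDS = 900
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 10240
VALID_AUTH_TYPES = {"NONE", "AWS_IAM"}


def check_function_settings(timeout: int, memory_size: int, auth_type: str) -> list:
    """Return a list of human-readable problems (empty if everything is fine)."""
    problems = []
    if not 1 <= timeout <= MAX_TIMEOUT_SECONDS:
        problems.append(f"Timeout {timeout} is outside 1-{MAX_TIMEOUT_SECONDS} seconds")
    if not MIN_MEMORY_MB <= memory_size <= MAX_MEMORY_MB:
        problems.append(
            f"MemorySize {memory_size} is outside {MIN_MEMORY_MB}-{MAX_MEMORY_MB} MB"
        )
    if auth_type not in VALID_AUTH_TYPES:
        problems.append(f"AuthType {auth_type!r} is not one of {sorted(VALID_AUTH_TYPES)}")
    return problems


# Values used in the template above.
print(check_function_settings(timeout=300, memory_size=5000, auth_type="NONE"))
```

An empty list means the three values pass these checks; the original template values (Timeout 3, MemorySize 128) would pass too, they are just too small for loading a large model.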

Dockerfile

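Transformers normally downloads the model from its repository on first use, but I want to skip that when Lambda runs, so I save the model locally in advance and copy it into the container that Lambda runs.
I also set TRANSFORMERS_CACHE here because I got an error without it.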

Dockerfile

FROM public.ecr.aws/lambda/python:3.9

COPY app.py requirements.txt ./
# Copy the locally saved model (described later) into the Lambda container.
# Dockerfile comments must be on their own line, not after an instruction.
COPY model /opt/model

RUN python3.9 -m pip install -r requirements.txt -t .
# Without this, an error occurred when Lambda ran.
ENV TRANSFORMERS_CACHE=/opt/cache/transformers

# Command can be overwritten by providing a different command in the template directly.
CMD ["app.lambda_handler"]
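
If setting the variable in the Dockerfile is not convenient, another option is to set it from Python. This is only a sketch of an alternative, not what this article deployed: the helper name `use_writable_cache` and the `transformers_cache` directory name are hypothetical, and it assumes the variable has to be set before `transformers` is imported. It points the cache at the system temporary directory, which is normally `/tmp` on Lambda, the writable part of the filesystem.

```python
import os
import tempfile
from pathlib import Path


def use_writable_cache(dirname: str = "transformers_cache") -> Path:
    """Create a cache directory under the temp dir and export TRANSFORMERS_CACHE."""
    cache_dir = Path(tempfile.gettempdir()) / dirname
    cache_dir.mkdir(parents=True, exist_ok=True)
    os.environ["TRANSFORMERS_CACHE"] = str(cache_dir)
    return cache_dir


# Call this at the top of app.py, before `import transformers`.
cache_dir = use_writable_cache()
```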

requirements.txt

List the packages needed to run Transformers. Because the model was built with AutoTrain on top of the pre-trained studio-ousia/luke-japanese-large-lite, sentencepiece is also required.

-f https://download.pytorch.org/whl/torch_stable.html
torch==1.13.1+cpu
sentencepiece==0.1.97
transformers==4.26.0

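Write the inference code

The official Transformers tutorials are thorough, so I will skip the detailed explanation.

Save the model locally

As preparation, save the model locally so that it can be copied into the Lambda container.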

from transformers import AutoModelForSequenceClassification, AutoTokenizer

# Credentials and repository settings
auth_token = "API_token"  # Obtain this from your Hugging Face account page
repository = "xxxx/autotrain-xxxxxx"  # Model repository

# Load the model and tokenizer
model = AutoModelForSequenceClassification.from_pretrained(repository, use_auth_token=auth_token)
tokenizer = AutoTokenizer.from_pretrained(repository, use_auth_token=auth_token)

# Save them locally so they can be copied into the Lambda image
model_path = "./model"
model.save_pretrained(model_path)
tokenizer.save_pretrained(model_path)
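
Before building the image, it can be worth checking that the directory really contains what `from_pretrained` will look for. The sketch below is a hypothetical helper (`missing_model_files` is not part of Transformers). The file names are assumptions based on what `save_pretrained` typically writes, a `config.json` plus a weights file such as `pytorch_model.bin` or `model.safetensors`, and may differ between versions.

```python
from pathlib import Path

# Assumed file names; adjust them to what your Transformers version writes.
REQUIRED_FILES = ("config.json",)
WEIGHT_FILES = ("pytorch_model.bin", "model.safetensors")


def missing_model_files(model_dir: str) -> list:
    """Return the names of expected files that are absent from model_dir."""
    root = Path(model_dir)
    missing = [name for name in REQUIRED_FILES if not (root / name).is_file()]
    # Any one of the known weight file names is enough.
    if not any((root / name).is_file() for name in WEIGHT_FILES):
        missing.append(" or ".join(WEIGHT_FILES))
    return missing


# An empty list means the directory looks complete.
print(missing_model_files("./model"))
```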

Write the inference code in `app.py`

Write the inference code in the app.py generated by sam init.

import json

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_path = "/opt/model"  # Path set in the Dockerfile
model = AutoModelForSequenceClassification.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)


def predict(text: str) -> int:
    """Take a text and return the inference result (0 or 1)."""
    inputs = tokenizer(text, padding=True, truncation=True, return_tensors="pt")
    outputs = model(**inputs)
    prob = torch.nn.functional.softmax(outputs.logits, dim=-1).detach().numpy()[0]

    label = 0
    if prob[0] < prob[1]:
        label = 1
    return label


def lambda_handler(event, context):
    try:
        input_text = json.loads(event["body"])["text"]
        label = predict(input_text)

        return {
            "statusCode": 200,
            "body": json.dumps(
                {
                    "label": label,
                    "message": "success",
                    "detail": None,
                }
            ),
        }
    except Exception as e:
        return {
            "statusCode": 500,
            "body": json.dumps(
                {
                    "label": None,
                    "message": "error",
                    "detail": f"{type(e).__name__}: {e}",
                }
            ),
        }
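
To make the decision rule in `predict` concrete without loading a model, here is a small pure-Python sketch of the same arithmetic. The logits are made-up numbers, and `softmax` and `pick_label` are illustrative helpers rather than functions from `app.py`. It applies softmax and returns 1 only when the second probability is strictly larger, as `predict` does.

```python
import math


def softmax(logits: list) -> list:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def pick_label(logits: list) -> int:
    """Return 1 if class 1 is strictly more probable than class 0, else 0."""
    prob = softmax(logits)
    return 1 if prob[0] < prob[1] else 0


example_logits = [1.2, -0.3]  # made-up values for a two-class model
print(softmax(example_logits), pick_label(example_logits))
```

Because softmax preserves ordering, comparing the logits directly would give the same label; the probabilities are only needed if you also want to return a confidence score.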

Run `sam build && sam deploy`

After that, run sam build:

❯ sam build
~~~
~~~
Step 1/6 : FROM public.ecr.aws/lambda/python:3.9
 ---> ebf75d239046
Step 2/6 : COPY app.py requirements.txt ./
~~~
~~~
Build Succeeded

Built Artifacts  : .aws-sam\build
Built Template   : .aws-sam\build\template.yaml

Commands you can use next
=========================
[*] Validate SAM template: sam validate
[*] Invoke Function: sam local invoke
[*] Test Function in the Cloud: sam sync --stack-name {{stack-name}} --watch
[*] Deploy: sam deploy --guided

Then run sam deploy, and the deployment is done. The first time, sam deploy --guided lets you configure the deployment interactively; from the second time on, plain sam deploy reuses those settings.

❯ sam deploy --guided

Configuring SAM deploy
======================

        Looking for config file [samconfig.toml] :  Found
        Reading default arguments  :  Success

        Setting default arguments for 'sam deploy'
        =========================================
        Stack Name [sam-app]:
        AWS Region [ap-northeast-1]:
        #Shows you resources changes to be deployed and require a 'Y' to initiate deploy
        Confirm changes before deploy [Y/n]: y
        #SAM needs permission to be able to create roles to connect to the resources in your template
        Allow SAM CLI IAM role creation [Y/n]: y
        #Preserves the state of previously provisioned resources when an operation fails
        Disable rollback [Y/n]: y
        HelloWorldFunction Function Url may not have authorization defined, Is this okay? [y/N]: y
        Save arguments to configuration file [Y/n]: y
        SAM configuration file [samconfig.toml]:
        SAM configuration environment [default]:
~~~
~~~
---------------------------------------------------------------------------------------------------------------------------   
Outputs
---------------------------------------------------------------------------------------------------------------------------   
Key                 HelloWorldFunctionIamRole
Description         Implicit IAM Role created for Hello World function
Value               arn:aws:iam::xxxxxx

Key                 HelloWorldFunction
Description         Hello World Lambda Function ARN
Value               arn:aws:lambda:xxxxxx     

Key                 HelloWorldFunctionUrl
Description         Function URLs endpoint
Value               https://xxxxxx.lambda-url.ap-northeast-1.on.aws/
---------------------------------------------------------------------------------------------------------------------------   

Successfully created/updated stack - sam-app in ap-northeast-1

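Finally, POST to the URL shown in the output and you should get a result back (the first request takes a while).
If you get an error, checking the Lambda logs is a good place to start.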

❯ curl -X POST -H "Content-Type: application/json" -d '{"text":"テストだよ"}' https://xxxx.lambda-url.ap-northeast-1.on.aws/
{"label": 0, "message": "success"}
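
When the response is an error, it helps to know what the handler actually receives. As far as I understand, a Function URL delivers the request body as a string in `event["body"]` and sets `isBase64Encoded` when that string is Base64-encoded. The sketch below is a hypothetical `extract_text` helper (not part of the deployed `app.py`) with hand-written events, so the parsing can be tried locally without a model. Treat the event shape as an assumption to verify against the Lambda documentation.

```python
import base64
import json


def extract_text(event: dict) -> str:
    """Pull the "text" field out of a Function URL style event."""
    body = event.get("body") or ""
    if event.get("isBase64Encoded"):
        body = base64.b64decode(body).decode("utf-8")
    return json.loads(body)["text"]


# Hand-written events that mimic the two encodings.
payload = json.dumps({"text": "テストだよ"})
plain_event = {"body": payload, "isBase64Encoded": False}
encoded_event = {
    "body": base64.b64encode(payload.encode("utf-8")).decode("ascii"),
    "isBase64Encoded": True,
}
print(extract_text(plain_event), extract_text(encoded_event))
```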

Summary

With Hugging Face AutoTrain, it is easy to get a reasonably accurate model
With Hugging Face Transformers, all sorts of models are easy to work with
With AWS SAM, an API can be deployed in a few commands
