M5Stack Module LLM Advent Calendar 2024

Converting a Model for the Module-LLM NPU (EfficientViT Edition)

Last updated at 2024-12-21, posted at 2024-12-21

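Purpose

This article explains how to run an EfficientViT model on the Module-LLM.
To run an EfficientViT model at high speed on the Module-LLM NPU, the model must be quantized to INT8 with a tool called Pulsar2, which also reduces the model size.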
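
To make the idea of INT8 quantization concrete, here is a minimal sketch of min-max affine quantization in plain Python. It is an illustration only: the function names are hypothetical, and the procedure Pulsar2 actually uses may differ. Each value ends up stored in 1 byte instead of the 4 bytes of FP32, at the cost of a small rounding error.

```python
from typing import List, Tuple


def minmax_quant_params(values: List[float]) -> Tuple[float, int]:
    """Derive an INT8 scale and zero point from the observed min and max."""
    lo = min(min(values), 0.0)  # keep 0.0 inside the representable range
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0 or 1.0  # fall back to 1.0 for all-zero input
    zero_point = round(-128 - lo / scale)
    return scale, zero_point


def quantize(values: List[float], scale: float, zero_point: int) -> List[int]:
    """Map floats to integers clipped to the signed 8-bit range."""
    return [max(-128, min(127, round(v / scale) + zero_point)) for v in values]


def dequantize(q: List[int], scale: float, zero_point: int) -> List[float]:
    """Map INT8 values back to approximate floats."""
    return [(x - zero_point) * scale for x in q]


if __name__ == "__main__":
    data = [-1.0, 0.0, 0.5, 3.0]
    scale, zero_point = minmax_quant_params(data)
    q = quantize(data, scale, zero_point)
    print(q)
    print(dequantize(q, scale, zero_point))
```

The dequantized values differ slightly from the originals; that difference is the quantization error that calibration tries to keep small.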

Exporting the EfficientViT model to ONNX

1. Environment setup

Clone the GitHub repository, move to the EfficientViT/classification path, and install the required dependencies.

$ git clone https://github.com/microsoft/Cream.git
$ cd Cream/EfficientViT/classification/
$ pip install -r requirements.txt

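2. Preparing the PyTorch model

The official repository already provides an ONNX model, but its batch size is set to 16, which is not suitable for image processing on typical edge-device chips. We therefore generate a new ONNX model with a batch size of 1 from the pth file.

Download the pth file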

$ wget https://github.com/xinyuliu-jeffrey/EfficientViT_Model_Zoo/releases/download/v1.0/efficientvit_m5.pth

Export the ONNX model and optimize it with onnxsim

$ python export_onnx_efficientvit_m5.py
$ onnxsim efficientvit_m5.onnx efficientvit_m5-sim.onnx

The source code of export_onnx_efficientvit_m5.py is as follows.

export_onnx_efficientvit_m5.py

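# Imported for its side effect: it registers the EfficientViT variants with timm
# so that create_model() can resolve "EfficientViT_M5".
from model import build
from timm.models import create_model
import torch

# Build the network without pretrained weights; they are loaded from the pth file below.
model = create_model(
    "EfficientViT_M5",
    num_classes=1000,
    distillation=False,
    pretrained=False,
    fuse=False,
)

# Load the checkpoint on the CPU and restore the weights stored under the "model" key.
checkpoint = torch.load("./efficientvit_m5.pth", map_location="cpu")
state_dict = checkpoint["model"]
model.load_state_dict(state_dict)
model.eval()

# Dummy input with batch size 1 (NCHW layout, 3 x 224 x 224).
dummy_input = torch.rand([1, 3, 224, 224])

# Run one forward pass to confirm that the model executes before exporting.
with torch.no_grad():
    model(dummy_input)

torch.onnx.export(model, dummy_input, "efficientvit_m5.onnx", opset_version=11)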

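Installing Pulsar2

Install Pulsar2 by following the linked guide; the pulsar2-docs listed under the reference links cover installation.

Downloading quick_start_example

The original model, data, images, and simulation tools needed to compile the model and run it in simulation are provided in the quick_start_example folder of the file available from the following link.
Click "Download sample files", extract the downloaded archive, and copy its contents to the /data path in Docker.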

quick_start_example.zip

root@xxx:~/data# ls
config  dataset  model  output  pulsar2-run-helper

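# model: stores the original ONNX model (already optimized with onnxsim)
# dataset: stores the compressed dataset files required for offline quantization calibration (PTQ calibration); common archive formats such as tar, tar.gz, and gz are supported
# config: stores the configuration file config.json needed for the run
# output: stores the output results
# pulsar2-run-helper: tools for running axmodel simulations in an x86 environment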
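
Before starting a build, it can help to confirm that the extracted folder has the layout listed above. The following is a small sketch using only the Python standard library; the helper name missing_dirs and the idea of running such a check are assumptions for illustration, not part of Pulsar2.

```python
from pathlib import Path
from typing import List

# Folder names taken from the directory listing of quick_start_example.
EXPECTED_DIRS = ["config", "dataset", "model", "output", "pulsar2-run-helper"]


def missing_dirs(root: str) -> List[str]:
    """Return the expected sub-folders that do not exist under root."""
    base = Path(root)
    return [name for name in EXPECTED_DIRS if not (base / name).is_dir()]


if __name__ == "__main__":
    # "/data" is the mount point used for the Docker container in this article.
    missing = missing_dirs("/data")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Layout looks complete.")
```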

Copy efficientvit_m5-sim.onnx into the model folder.

Converting the EfficientViT model to an AX model

Start the Docker container in which Pulsar2 is installed.

$ sudo docker run -it --net host --rm -v $PWD:/data pulsar2:temp-58aa62e4

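The pulsar2 build command converts the ONNX model into an AX model (axmodel) for the NPU of the Module-LLM (ax630c). A single run performs graph optimization, offline quantization, compilation, and the comparison check.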

$ pulsar2 build --input model/efficientvit_m5-sim.onnx --output_dir efficientvit-m5/ --config config/effientvit_config.json
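
When converting several models, the same command line can be assembled from a script. The sketch below only builds the argument list from the three options used above (--input, --output_dir, --config); the helper name pulsar2_build_args is hypothetical, and actually running the command is assumed to happen inside the Pulsar2 container.

```python
import shlex
from typing import List


def pulsar2_build_args(onnx_path: str, output_dir: str, config_path: str) -> List[str]:
    """Assemble the argument list for the pulsar2 build command used in this article."""
    return [
        "pulsar2", "build",
        "--input", onnx_path,
        "--output_dir", output_dir,
        "--config", config_path,
    ]


if __name__ == "__main__":
    args = pulsar2_build_args(
        "model/efficientvit_m5-sim.onnx",
        "efficientvit-m5/",
        "config/effientvit_config.json",
    )
    # Print the command for review; inside the container it could be
    # executed with subprocess.run(args, check=True).
    print(shlex.join(args))
```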

The JSON file passed to --config holds the settings that pulsar2 uses for model conversion. The settings used here are as follows.

mobilenet_v2_build_config.json

{
  "model_type": "ONNX",
  "npu_mode": "NPU1",
  "quant": {
    "input_configs": [
      {
        "tensor_name": "input",
        "calibration_dataset": "./dataset/imagenet-32-images.tar",
        "calibration_size": 32,
        "calibration_mean": [103.939, 116.779, 123.68],
        "calibration_std": [58.0, 58.0, 58.0]
      }
    ],
    "calibration_method": "MinMax",
    "precision_analysis": false
  },
  "input_processors": [
    {
      "tensor_name": "input",
      "tensor_format": "BGR",
      "src_format": "BGR",
      "src_dtype": "U8",
      "src_layout": "NHWC",
      "csc_mode": "NoCSC"
    }
  ],
  "compiler": {
    "check": 2
  }
}
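
The sketch below illustrates two points about this configuration. It assumes that calibration_mean and calibration_std are applied per channel as (x - mean) / std, which is the usual convention but is not confirmed here for Pulsar2's internals, and it checks that the tensor names in quant.input_configs and input_processors agree. The helper names are hypothetical, and the embedded JSON is an excerpt of the listing above.

```python
import json
from typing import Dict, List

# Excerpt of the configuration shown above (only the fields used here).
CONFIG_TEXT = """
{
  "quant": {
    "input_configs": [
      {
        "tensor_name": "input",
        "calibration_mean": [103.939, 116.779, 123.68],
        "calibration_std": [58.0, 58.0, 58.0]
      }
    ]
  },
  "input_processors": [
    {"tensor_name": "input", "tensor_format": "BGR", "src_layout": "NHWC"}
  ]
}
"""


def normalize_pixel(pixel: List[float], mean: List[float], std: List[float]) -> List[float]:
    """Apply per-channel normalization (x - mean) / std to one pixel."""
    return [(p - m) / s for p, m, s in zip(pixel, mean, std)]


def unmatched_tensor_names(config: Dict) -> List[str]:
    """Return tensor names that appear in only one of the two config sections."""
    quant_names = {c["tensor_name"] for c in config["quant"]["input_configs"]}
    proc_names = {p["tensor_name"] for p in config["input_processors"]}
    return sorted(quant_names ^ proc_names)


if __name__ == "__main__":
    config = json.loads(CONFIG_TEXT)
    quant_input = config["quant"]["input_configs"][0]
    print("Unmatched tensor names:", unmatched_tensor_names(config))
    # One U8 pixel in the channel order given by tensor_format (BGR).
    print(normalize_pixel([128, 128, 128],
                          quant_input["calibration_mean"],
                          quant_input["calibration_std"]))
```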

Confirm that the AX model has been generated.

$ ls output
compiled.axmodel

Running on the Module-LLM

The EfficientViT model has the same format as the mobilenetv2 AX model included in the examples, so we run it on the Module-LLM using the mobilenetv2 example as a reference.

root@m5stack-LLM:/opt/usr/241218_Vit# python3 classification.py
[INFO] Chip type: ChipType.MC20E
[INFO] Engine version: 2.6.3sp
[INFO] VNPU type: VNPUType.DISABLED
[INFO] Model type: 0 (half core)
[INFO] Compiler version: 3.2-patch1 58aa62e4
Top 5 Predictions:
Class Index: 824, Score: 5.689029693603516
Class Index: 501, Score: 5.6416215896606445
Class Index: 775, Score: 5.214943885803223
Class Index: 689, Score: 4.74085807800293
Class Index: 452, Score: 4.646040916442871
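
The source of classification.py is not shown in this article. As a rough idea of how a "Top 5 Predictions" list like the one above can be produced from the model's output scores, here is a minimal sketch in plain Python; the function name top_k and the sample scores are hypothetical.

```python
from typing import List, Tuple


def top_k(scores: List[float], k: int = 5) -> List[Tuple[int, float]]:
    """Return (class index, score) pairs for the k highest scores, best first."""
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


if __name__ == "__main__":
    # Made-up scores for a 6-class model, for illustration only.
    sample_scores = [0.3, 4.1, -1.2, 5.7, 2.0, 4.6]
    print("Top 5 Predictions:")
    for index, score in top_k(sample_scores):
        print(f"Class Index: {index}, Score: {score}")
```

The scores printed in the log are raw model outputs rather than probabilities, which is why they are not limited to the range 0 to 1.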

Reference links

nnn112358/M5_LLM_Module_Report
https://github.com/nnn112358/M5_LLM_Module_Report

基于 AX650N 部署 EfficientViT (Deploying EfficientViT on the AX650N)
https://zhuanlan.zhihu.com/p/630775597

pulsar2-docs
https://pulsar2-docs.readthedocs.io/en/latest/index.html
