
I tried to get OpenAI Whisper transcription running on a Jetson Nano and ran into many hurdles

Posted at 2023-08-01 (last updated at 2023-08-01)

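Introduction

I booted a Jetson Nano Developer Kit to experiment with it, found that it shipped with Python 3.6.9, and then struggled to find a way to move to Python 3.8 or later. This is a record of that effort.

On the Jetson Download Center, the latest JetPack that supports the Jetson Nano was JetPack 4.6, released on 2021/08/04, which bundles Python 3.6.9. The application I wanted to use needs Python 3.8 or later, so I tried various approaches, but none of them worked in the end. Specifically, I could not resolve the module dependencies.

After more searching, I found an Ubuntu 20.04-based OS image that Q-engineering publishes for the Jetson Nano, and used that instead. For reference, JetPack 4.6 is based on Ubuntu 18.04.

The original goal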

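All of this was preparation for checking whether OpenAI/Whisper runs smoothly on a Jetson Nano. To state the conclusion first: I did not get the result I was hoping for. The conditions and the audio file are the same as in the article linked here. On the CPU it was slow: 29 min 30 s with the small model and 2 min 47 s with the base model. On a MacBook Pro M1 (8 cores), small takes 3 min 1 s, so the Jetson Nano is no match in a CPU-only comparison, which is unsurprising given the difference in core count.

As for the GPU, which was the whole point, the run froze partway through and never produced a final result. I confirmed with jtop that the GPU was active. I ran the base and small models several times, but every attempt failed, so I gave up on the evaluation.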
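To put the CPU timings above side by side, here is a small sketch that converts the reported wall-clock times to seconds and computes how much longer the Jetson Nano took than the M1 for the small model. The helper names are made up for illustration; only the timings come from this article.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a 'minutes and seconds' timing into seconds."""
    return minutes * 60 + seconds


# Wall-clock times reported in this article.
jetson_cpu_small = to_seconds(29, 30)  # Jetson Nano, CPU, "small" model
jetson_cpu_base = to_seconds(2, 47)    # Jetson Nano, CPU, "base" model
m1_small = to_seconds(3, 1)            # MacBook Pro M1, "small" model


def slowdown(slow: int, fast: int) -> float:
    """How many times longer `slow` took than `fast`."""
    return slow / fast


ratio = slowdown(jetson_cpu_small, m1_small)
print(f"small model: Jetson Nano CPU took {ratio:.1f}x as long as the M1")
```

For the small model this works out to a bit under ten times slower on the Jetson Nano CPU.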

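An NVIDIA article states that "the 128-core NVIDIA GPU delivers 472 GFLOPS of compute performance." The comparison in this Qiita article makes it clear that the Jetson Nano is not in the high-spec class.
For Whisper-like workloads, 472 GFLOPS (FP16) may simply not be enough. If I get the chance, I would like to compare against other GPUs.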
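As a rough sanity check on where a figure like 472 GFLOPS (FP16) can come from, the sketch below multiplies out a theoretical peak. Only the 128 cores and the 472 GFLOPS figure come from this article; the 921.6 MHz GPU clock, the two FLOPs per core per cycle (one fused multiply-add), and the doubled FP16 rate are my assumptions, so treat the result as an estimate rather than an official derivation.

```python
CUDA_CORES = 128               # from the spec list in this article
FLOPS_PER_CORE_PER_CYCLE = 2   # assumption: one fused multiply-add counts as 2 FLOPs
GPU_CLOCK_GHZ = 0.9216         # assumption: max GPU clock of 921.6 MHz
FP16_RATE = 2                  # assumption: FP16 runs at twice the FP32 rate


def peak_gflops(cores: int, flops_per_cycle: int, clock_ghz: float, rate: int = 1) -> float:
    """Theoretical peak throughput in GFLOPS (cores x FLOPs/cycle x GHz x rate)."""
    return cores * flops_per_cycle * clock_ghz * rate


fp32_peak = peak_gflops(CUDA_CORES, FLOPS_PER_CORE_PER_CYCLE, GPU_CLOCK_GHZ)
fp16_peak = peak_gflops(CUDA_CORES, FLOPS_PER_CORE_PER_CYCLE, GPU_CLOCK_GHZ, FP16_RATE)

print(f"FP32 peak: {fp32_peak:.1f} GFLOPS")
print(f"FP16 peak: {fp16_peak:.1f} GFLOPS")
```

Under these assumptions the FP16 figure lands at roughly 472 GFLOPS, and the FP32 figure is half of that. A theoretical peak like this says nothing about sustained throughput on a real model.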

Given the available JetPack versions and the hardware specs, it seems better to use something other than the Jetson Nano.

Model and specs used

Jetson Nano Developer Kit (945-13450-0000-000)
Carrier board A02 revision
128 core Maxwell GPU
4 core ARM A57
4GB Memory
1 camera connector
472 GFLOPS

Using NVIDIA's official OS image

I downloaded the image from here.

Steps

python3 --version
Python 3.6.9

First, run an update. Without it, installing pip3 failed.

sudo apt update

Install pip for Python 3

sudo apt install python3-pip

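As pip3 list | grep torch shows, PyTorch is not installed. Binaries are published at PyTorch for Jetson, so select JetPack 4 PyTorch v1.10.0, download the file, and install it. Note that the file name matters too: the download URL does not carry a wheel-style name, so wget -O is used to save it under one.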

sudo apt install libopenblas-base libopenmpi-dev
pip3 install Cython
wget https://nvidia.box.com/shared/static/fjtbno0vpo676a25cgvuqc1wty0fkkg6.whl -O torch-1.10.0-cp36-cp36m-linux_aarch64.whl
pip3 install torch-1.10.0-cp36-cp36m-linux_aarch64.whl
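The reason the file name matters is that pip reads compatibility tags out of a wheel's file name. The sketch below splits the name used above into its fields and checks the Python tag against an interpreter version. The helper names are hypothetical and this is a simplified illustration of the wheel naming convention, not pip's actual implementation.

```python
import sys


def parse_wheel_filename(filename: str) -> dict:
    """Split a wheel file name into name, version, and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file name: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        # An optional build tag sits between the version and the Python tag.
        name, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError(f"unexpected number of fields in: {filename}")
    return {
        "name": name,
        "version": version,
        "build": build,
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def matches_interpreter(python_tag: str, version_info=sys.version_info) -> bool:
    """True if a CPython tag such as 'cp36' matches the given interpreter version."""
    return python_tag == f"cp{version_info[0]}{version_info[1]}"


info = parse_wheel_filename("torch-1.10.0-cp36-cp36m-linux_aarch64.whl")
print(info)
print("fits Python 3.6.9:", matches_interpreter(info["python_tag"], (3, 6, 9)))
print("fits Python 3.8.10:", matches_interpreter(info["python_tag"], (3, 8, 10)))
```

The cp36 tag is also why this particular wheel only fits the Python 3.6 interpreter that ships with JetPack 4.6.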

After these steps, I confirmed that torch.cuda.is_available() returns True.

import torch
print(torch.cuda.is_available())

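Incidentally, on Python 3.6.9 installing openai-whisper fails because no suitable version can be found. GitHub states that Whisper supports Python 3.8 through 3.11, which is presumably why it cannot be installed. That meant I needed a Python 3.8 or later environment on the Jetson Nano, and I could not get one by straightforward means, even after trying various things with references such as the one below.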

Pytorch/Installing Previous Versions of Pytorch
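The version constraint described above can be checked up front before attempting an install. This is a minimal sketch that hard-codes the 3.8 to 3.11 range quoted in this article; the function name is made up, and the range may differ for other Whisper releases.

```python
import sys

# Range quoted in this article from the Whisper GitHub page (may change between releases).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def whisper_supports(version_info=sys.version_info) -> bool:
    """True if the interpreter's major.minor version falls in the supported range."""
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


print("Python 3.6.9 supported:", whisper_supports((3, 6, 9)))
print("Python 3.8.10 supported:", whisper_supports((3, 8, 10)))
print("this interpreter supported:", whisper_supports())
```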

Using the Q-engineering OS image

While searching around, I found the Jetson Nano with Ubuntu 20.04 OS image, an Ubuntu 20.04-based image provided by a company called Q-engineering, and tried it. Details are at github / Qengineering / Jetson-Nano-Ubuntu-20-image.

Information on installing PyTorch and other packages on the Jetson Nano is also published at Q-engineering / Install PyTorch on Jetson Nano.

Steps

Prepare the OS image and boot it
The username and password are both jetson

Without installing anything extra, I confirmed that torch.cuda.is_available() returns True

import torch
print(torch.cuda.is_available())

Check the Python version

python3 --version
Python 3.8.10

With everything in a working state, the torch entries in pip3 list look like this.

pip3 list | grep torch
torch                   1.13.0a0+git7c98e70
torchvision             0.14.0a0+5ce4506
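Beyond the version strings, it can be useful to check from Python whether the installed torch is a CUDA-enabled build at all. The sketch below uses torch.version.cuda (None on CPU-only builds) and torch.cuda.is_available(). The function name is hypothetical, and a stand-in object with made-up values is used so the example runs even where torch is not installed.

```python
from types import SimpleNamespace


def describe_torch_build(torch_module) -> str:
    """Summarize whether a torch module looks like a usable CUDA build."""
    version = torch_module.__version__
    cuda_version = torch_module.version.cuda  # None on CPU-only builds
    if cuda_version is None:
        return f"torch {version}: CPU-only build"
    if not torch_module.cuda.is_available():
        return f"torch {version}: built for CUDA {cuda_version}, but no usable GPU"
    return f"torch {version}: CUDA {cuda_version}, GPU available"


# Stand-in for a CPU-only torch (hypothetical values, for illustration only).
fake_cpu_torch = SimpleNamespace(
    __version__="0.0.0",
    version=SimpleNamespace(cuda=None),
    cuda=SimpleNamespace(is_available=lambda: False),
)
print(describe_torch_build(fake_cpu_torch))
```

On the Jetson itself, the real module would be passed in instead: import torch, then print(describe_torch_build(torch)).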

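Notes

Here is a record of installing Whisper on the Q-engineering OS image. It basically follows the steps given at github/openai/whisper. Be careful, though: installing with the --upgrade option also updates python itself and overwrites both torch and torchvision with builds that lack GPU support. I therefore ran the install without the upgrade option, but it reported an error for numba. I checked the error message and resolved it by installing a compatible numpy.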

pip3 install openai-whisper
Collecting openai-whisper
  Downloading openai-whisper-20230314.tar.gz (792 kB)
     |████████████████████████████████| 792 kB 9.7 MB/s 
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
    Preparing wheel metadata ... done
Requirement already satisfied: tqdm in /usr/local/lib/python3.8/dist-packages (from openai-whisper) (4.62.3)
Requirement already satisfied: numpy in ./.local/lib/python3.8/site-packages (from openai-whisper) (1.18.5)
Collecting numba
  Downloading numba-0.57.1-cp38-cp38-manylinux2014_aarch64.manylinux_2_17_aarch64.whl (3.3 MB)
     |████████████████████████████████| 3.3 MB 24.2 MB/s 
Collecting tiktoken==0.3.1
  Downloading tiktoken-0.3.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (1.6 MB)
     |████████████████████████████████| 1.6 MB 16.5 MB/s 
Collecting ffmpeg-python==0.2.0
  Downloading ffmpeg_python-0.2.0-py3-none-any.whl (25 kB)
Requirement already satisfied: torch in /usr/local/lib/python3.8/dist-packages (from openai-whisper) (1.13.0a0+git7c98e70)
Requirement already satisfied: more-itertools in /usr/lib/python3/dist-packages (from openai-whisper) (4.2.0)
Requirement already satisfied: importlib-metadata; python_version < "3.9" in /usr/lib/python3/dist-packages (from numba->openai-whisper) (1.5.0)
Collecting llvmlite<0.41,>=0.40.0dev0
  Downloading llvmlite-0.40.1-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (41.1 MB)
     |████████████████████████████████| 41.1 MB 20 kB/s 
Collecting regex>=2022.1.18
  Downloading regex-2023.6.3-cp38-cp38-manylinux_2_17_aarch64.manylinux2014_aarch64.whl (771 kB)
     |████████████████████████████████| 771 kB 16.9 MB/s 
Requirement already satisfied: requests>=2.26.0 in /usr/local/lib/python3.8/dist-packages (from tiktoken==0.3.1->openai-whisper) (2.26.0)
Requirement already satisfied: future in /usr/lib/python3/dist-packages (from ffmpeg-python==0.2.0->openai-whisper) (0.18.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.8/dist-packages (from torch->openai-whisper) (3.7.4.3)
Requirement already satisfied: certifi>=2017.4.17 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken==0.3.1->openai-whisper) (2019.11.28)
Requirement already satisfied: idna<4,>=2.5; python_version >= "3" in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken==0.3.1->openai-whisper) (2.8)
Requirement already satisfied: charset-normalizer~=2.0.0; python_version >= "3" in /usr/local/lib/python3.8/dist-packages (from requests>=2.26.0->tiktoken==0.3.1->openai-whisper) (2.0.6)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /usr/lib/python3/dist-packages (from requests>=2.26.0->tiktoken==0.3.1->openai-whisper) (1.25.8)
Building wheels for collected packages: openai-whisper
  Building wheel for openai-whisper (PEP 517) ... done
  Created wheel for openai-whisper: filename=openai_whisper-20230314-py3-none-any.whl size=796917 sha256=85edb4f0983a2746e2dc22f744adb08d6c95314c05f613bb847692373acc64c9
  Stored in directory: /home/jetson/.cache/pip/wheels/09/a2/b6/7de9e2f763d72cb5fd70e9623d5880cfdc7f4e0d17842c2fd5
Successfully built openai-whisper
ERROR: numba 0.57.1 has requirement numpy<1.25,>=1.21, but you'll have numpy 1.18.5 which is incompatible.
Installing collected packages: llvmlite, numba, regex, tiktoken, ffmpeg-python, openai-whisper
Successfully installed ffmpeg-python-0.2.0 llvmlite-0.40.1 numba-0.57.1 openai-whisper-20230314 regex-2023.6.3 tiktoken-0.3.1

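Run the following and look for a numpy version that is at least 1.21 and below 1.25. With the pip version on this image, leaving the version empty after == makes the command fail and print the list of available versions.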

pip3 install numpy==

Here I chose numpy 1.24.4.

pip3 install numpy==1.24.4
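The choice of version follows directly from the requirement in the pip error, numpy<1.25,>=1.21. As a small illustration, the sketch below checks candidate versions against that range. The helper names are made up, and the parser only handles plain numeric versions such as 1.24.4.

```python
def parse_version(text: str) -> tuple:
    """Turn '1.24.4' into (1, 24, 4); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies_numba_requirement(numpy_version: str) -> bool:
    """Check a numpy version against numba 0.57.1's requirement: >=1.21,<1.25."""
    return (1, 21) <= parse_version(numpy_version) < (1, 25)


for candidate in ["1.18.5", "1.21.0", "1.24.4", "1.25.0"]:
    verdict = "ok" if satisfies_numba_requirement(candidate) else "incompatible"
    print(f"numpy {candidate}: {verdict}")
```

The preinstalled 1.18.5 falls below the range, while 1.24.4 is inside it.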

Running the openai-whisper install again then completed without errors.

pip3 install openai-whisper

The ffmpeg command-line tool should also need to be installed.

sudo apt update && sudo apt install ffmpeg

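This completes the Whisper installation. As mentioned earlier, however, processing did not complete successfully on the Jetson Nano. Partial results are displayed, so I suspect the board is simply being overloaded.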
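For reference, a timed transcription run of the kind discussed in this article could be sketched as follows. This is not the exact script I used. It relies on whisper.load_model and model.transcribe from the openai-whisper package, the helper names are made up, and the audio file name in the usage note is a placeholder. torch and whisper are imported inside the function so the device helper can be used without them.

```python
import time


def pick_device(cuda_available: bool) -> str:
    """Choose the device string to pass to Whisper."""
    return "cuda" if cuda_available else "cpu"


def timed_transcribe(audio_path: str, model_name: str = "base"):
    """Transcribe one file with Whisper and return (text, elapsed_seconds)."""
    # Imported lazily: both packages must be installed for this function to run.
    import torch
    import whisper

    device = pick_device(torch.cuda.is_available())
    model = whisper.load_model(model_name, device=device)

    # Only the transcription itself is timed, not the model load.
    start = time.perf_counter()
    result = model.transcribe(audio_path, fp16=(device == "cuda"))
    elapsed = time.perf_counter() - start
    return result["text"], elapsed
```

Usage would look like text, seconds = timed_transcribe("audio.wav", "small"), where audio.wav stands in for your own file. Watching jtop during the run shows whether the GPU is actually being used.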

References

NVIDIA DEVELOPER / Support Resources

NVIDIA / Edge Computing / Jetson
