Minisforum N5 Pro AI NAS で Ollama with iGPU (GPT-oss:20b & 120b)


Overview

I tried to see whether a local LLM could run on a NAS with an AMD APU.
I verified it works with the gpt-oss models and got the following performance:
20b: 10.4 tokens/s (runs on the GPU alone)
120b: 5.7 tokens/s (also uses a fair amount of CPU)

System configuration

Minisforum N5 Pro AI NAS
https://www.minisforum.jp/products/minisforum-n5-pro-nas

CPU: AMD Ryzen™ AI 9 HX PRO 370
RAM: 96GB (48GB allocated as VRAM)
Disk: SSD 1TB

OS: Ubuntu 24.04 server

$ lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 24.04.2 LTS
Release:        24.04
Codename:       noble
$ uname -r
6.8.0-64-generic

ROCm/AMDGPU install

Install it by following the official instructions.

A permission error came up when installing amdgpu-install_6.4.60401-1_all.deb, so I changed just that command to use the full path.

# install ROCm
wget https://repo.radeon.com/amdgpu-install/6.4.1/ubuntu/noble/amdgpu-install_6.4.60401-1_all.deb
# ----------------
# sudo apt install ./amdgpu-install_6.4.60401-1_all.deb <- this raised an error, so I rewrote it with the full path
sudo apt install /home/<path to dir>/amdgpu-install_6.4.60401-1_all.deb
# ----------------
sudo apt update
sudo apt install python3-setuptools python3-wheel
sudo usermod -a -G render,video $LOGNAME # Add the current user to the render and video groups
sudo apt install rocm

# install AMDGPU
sudo apt install "linux-headers-$(uname -r)" "linux-modules-extra-$(uname -r)"
sudo apt install amdgpu-dkms

Once that's done, reboot:

sudo reboot
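After rebooting, it's worth confirming that ROCm can actually see the iGPU before moving on. A quick sanity check using the standard ROCm tools (these commands are my suggestion, not from the original log):

```shell
# The agent list should include the 890M; look for its "gfx1150" target.
rocminfo | grep -i gfx

# Show the concise device table (the same tool used in the logs below).
rocm-smi

# Verify the user really is in the render/video groups
# (the usermod change only takes effect after logging back in).
groups
```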

Installing Ollama

Install it by following the official instructions.
https://ollama.com/download/linux
Updating appears to work the same way: just run this command again (the old version is uninstalled automatically).

curl -fsSL https://ollama.com/install.sh | sh

The installer apparently detects usable devices and configures itself accordingly.
I forgot to save the log, but it prints a message along the lines of "GPU detected!".
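Before changing any settings, you can confirm the service came up (the install script registers a systemd unit):

```shell
# Check the systemd unit the installer created.
systemctl status ollama --no-pager

# The CLI should respond once the server is running.
ollama --version
```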

Configuring ollama

The integrated GPU of the AMD Ryzen™ AI 9 HX PRO 370, the 890M, has GPU ID gfx1150, but set HSA_OVERRIDE_GFX_VERSION to 11.0.0. (Setting it to 11.5 causes a segfault, as of July 22, 2025.)

sudo systemctl edit ollama.service
### Editing /etc/systemd/system/ollama.service.d/override.conf
### Anything between here and the comment below will become the contents of the drop-in file

[Service]
Environment="HSA_OVERRIDE_GFX_VERSION=11.0.0"
Environment="LD_LIBRARY_PATH=/opt/rocm/lib"
Environment="OLLAMA_HOST=0.0.0.0"

### Edits below this comment will be discarded

Environment="OLLAMA_HOST=0.0.0.0" allows access from hosts other than localhost, so include it or not as you prefer.
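Note that the drop-in only takes effect once the unit is reloaded and restarted:

```shell
# Apply the override created with `systemctl edit`.
sudo systemctl daemon-reload
sudo systemctl restart ollama

# Confirm the environment variables were picked up by the unit.
systemctl show ollama --property=Environment
```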

Trying it out

I tried the trendy gpt-oss models, and both 20b (GPU only) and 120b (GPU+CPU) worked.
gpt-oss:20b: 10.4 tokens/s
gpt-oss:120b: 5.7 tokens/s (though it also uses a fair amount of CPU)
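With OLLAMA_HOST=0.0.0.0 set above, the NAS can also be queried from another machine on the LAN over Ollama's HTTP API on its default port 11434. A sketch (`<nas-ip>` is a placeholder for your NAS's address):

```shell
# Pull the model onto the NAS first.
ollama pull gpt-oss:20b

# Query the server remotely via the REST API.
curl http://<nas-ip>:11434/api/generate -d '{
  "model": "gpt-oss:20b",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```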

gpt-oss:20b

Resource usage while running:
the GPU is pinned at 99%,
and the CPU has one core (thread) fully occupied.
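The readouts below were captured while the model was generating; one way to keep both in view (my assumed workflow, not from the original article) is:

```shell
# GPU side: refresh the ROCm SMI table every 2 seconds.
watch -n 2 rocm-smi

# CPU side (in another terminal): per-process usage for the ollama server.
top -p "$(pgrep -x ollama | head -n 1)"
```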

$ ollama run gpt-oss:20b --verbose
>>> 空はなぜ青いのですか?
Thinking...
The user asks in Japanese: "空はなぜ青いのですか?" meaning "Why is the sky blue?" We should answer in Japanese,
explaining Rayleigh scattering, the shorter wavelengths scattering more, etc. Maybe mention also at sunrise/sunset

< --- 略 --- >

total duration:       1m57.311512004s
load duration:        42.410712ms
prompt eval count:    77 token(s)
prompt eval duration: 2.19813595s
prompt eval rate:     35.03 tokens/s
eval count:           1201 token(s)
eval duration:        1m55.070071236s
eval rate:            10.44 tokens/s
========================================= ROCm System Management Interface =========================================
=================================================== Concise Info ===================================================
Device  Node  IDs              Temp    Power     Partitions          SCLK  MCLK     Fan  Perf  PwrCap  VRAM%  GPU%
              (DID,     GUID)  (Edge)  (Socket)  (Mem, Compute, ID)
====================================================================================================================
0       1     0x150e,   18837  51.0°C  43.035W   N/A, N/A, 0         N/A   2800Mhz  0%   auto  N/A     30%    99%
====================================================================================================================
=============================================== End of ROCm SMI Log ================================================
 %CPU  %MEM     TIME+ COMMAND
 106.6   3.8   0:27.82 ollama

gpt-oss:120b

In addition to the GPU, this eats roughly 5-8 CPU threads.
GPU load never maxes out, so if around 70GB could be allocated as VRAM, it might run on the iGPU alone? Or if the NPU could be used...
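How much memory the iGPU can actually use is split between the dedicated VRAM carve-out and the GTT pool (system RAM the GPU can map). The amdgpu driver exposes both via sysfs; the `card0` path is an assumption and may differ on other systems:

```shell
# Dedicated VRAM reserved for the iGPU, in bytes.
cat /sys/class/drm/card0/device/mem_info_vram_total

# GTT: system RAM the GPU can additionally map, in bytes.
cat /sys/class/drm/card0/device/mem_info_gtt_total
```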

>>> 空はなぜ青いのですか?
< --- 略 --- >
total duration:       3m1.011234749s
load duration:        53.811736ms
prompt eval count:    77 token(s)
prompt eval duration: 1.449320539s
prompt eval rate:     53.13 tokens/s
eval count:           975 token(s)
eval duration:        2m50.248872405s
eval rate:            5.73 tokens/s
======================================== ROCm System Management Interface ========================================
================================================== Concise Info ==================================================
Device  Node  IDs              Temp    Power     Partitions          SCLK  MCLK  Fan  Perf  PwrCap  VRAM%  GPU%
              (DID,     GUID)  (Edge)  (Socket)  (Mem, Compute, ID)
==================================================================================================================
0       1     0x150e,   18837  50.0°C  36.057W   N/A, N/A, 0         N/A   N/A   0%   auto  N/A     94%    61%
==================================================================================================================
============================================== End of ROCm SMI Log ===============================================
 %CPU  %MEM     TIME+ COMMAND
 527.2  41.4   4:03.88 ollama