KDDI株式会社

MLX でローカルLLM：「mlx-community/phi-4-4bit」を試す（M4 Mac mini、mlx-lm を利用）

Posted at 2025-01-12

最近いくつかの内容を試している、MLX でローカルLLM という話の 1つです。

今回は、以下の「mlx-community/phi-4-4bit」を試してみました。

●mlx-community/phi-4-4bit · Hugging Face
　https://huggingface.co/mlx-community/phi-4-4bit

試した内容

結果の一例

最初に、動作させている時の様子を掲載してみます。

プロンプトを含む入力コマンドは以下の通りです。

mlx_lm.generate --model mlx-community/phi-4-4bit --prompt "東京につい て説明して"

今回用いたマシンのスペックは、この後に記載しています。

利用したマシン

今回のお試しで使っているのは、以下の M4 の Mac mini です。

Apple M4チップ
CPU: 10コアCPU
グラフィックス: 10コアGPU
16コアNeural Engine
メモリ: 24GBユニファイドメモリ
ストレージ: 512GB SSD

MLX版の Phi-4 の情報を探す

ここ最近、Microsoft さんの Phi-4 の話題が出てきていました。

それが気になっていたのですが、今回、上記の MLX対応のものがないかや参考になりそうな情報がないかを探してみました。検索先は、以下のとおり Web・Hugging Face・X です。

●phi-4 mlx - Google 検索
　https://www.google.com/search?q=phi-4+mlx

●Full Text Search - Hugging Face
　https://huggingface.co/search/full-text?q=mlx-community%2Fphi-4&type=model

●mlx phi-4 - 検索 / X
　https://x.com/search?q=mlx%20phi-4&src=typed_query&f=live

X での情報の例

そうすると、例えばＸでは以下の情報が出てきました。

Hugging Face にあるモデル

上記の検索結果などから、「mlx-community/phi-4-bf16」と「mlx-community/phi-4-4bit」の 2つを試してみようと思いました。

●mlx-community/phi-4-bf16 · Hugging Face
　https://huggingface.co/mlx-community/phi-4-bf16

●mlx-community/phi-4-4bit · Hugging Face
　https://huggingface.co/mlx-community/phi-4-4bit

モデルのサイズを見た感じ、自分の環境で動きそうなのは「mlx-community/phi-4-4bit」のみという感じでしたが...

この後、「mlx-community/phi-4-4bit」をキーワードにした検索も行ってみたりしました。

●"mlx-community/phi-4-4bit" - Google 検索
　https://www.google.com/search?q=%22mlx-community%2Fphi-4-4bit%22

MLX版の Phi-4 を使う

あとは、上記の「mlx-community/phi-4-4bit に関する Hugging Face のページ」や「そこで登場していた mlx-lm」を使う流れを確認しました。

●mlx-lm · PyPI
　https://pypi.org/project/mlx-lm/

そして以下の部分のコマンドや、その後ろに書いてあるモデル指定の方法を見ていきました。

それで冒頭に書いていた、 mlx_lm.generate --model mistralai/Mistral-7B-Instruct-v0.3 --prompt "hello" という構成のコマンドを使えば良さそうなことが分かりました。

その他に気になった情報

以下は、上記の手順を進める中で見かけて気になった情報の羅列です。

●Phi-3CookBook/translations/ja/md/03.Inference/MLX_Inference.md at main · microsoft/Phi-3CookBook
　https://github.com/microsoft/Phi-3CookBook/blob/main/translations/ja/md/03.Inference/MLX_Inference.md

●2025年さらに熱くなるであろうローカルLLMのためのm4mac購入検討
　https://zenn.dev/afk2777/articles/localllm-mac

Llama 3.1 Swallow関連

以下の Llama 3.1 Swallow の MLX版という話も気になりました。

●mlx-community/Llama-3.1-Swallow-8B-Instruct-v0.3-8bit · Hugging Face
　https://huggingface.co/mlx-community/Llama-3.1-Swallow-8B-Instruct-v0.3-8bit

●mlx-community/Llama-3.1-Swallow-8B-Instruct-v0.3-4bit · Hugging Face
　https://huggingface.co/mlx-community/Llama-3.1-Swallow-8B-Instruct-v0.3-4bit

●Full Text Search - Hugging Face
　https://huggingface.co/search/full-text?q=mlx-community+%2F+Llama-3.1-Swallow&type=model

ローカルLLM を試す環境

ローカルLLM を試す環境についても気になるものがいくつかあります。

メジャーっぽいものもいくつか目にすることも多いですが、このあたりも手を付けてみられればと思っています。

●ローカルでLLMを動かす6つの簡単な方法 + α
　https://zenn.dev/00/articles/6-ways-to-run-llm-locally

●ggerganov/llama.cpp: LLM inference in C/C++
　https://github.com/ggerganov/llama.cpp

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up