Running alpaca (4-bit quantized) with llama.cpp

Posted at 2023-04-05

Compiling llama.cpp

git clone git@github.com:ggerganov/llama.cpp.git
cd llama.cpp
make

(The latest commit at the time of posting was 53dbba769537e894ead5c6913ab2fd3a4658b738.)

Downloading the tokenizer and the alpaca model

mkdir models/alpaca_7b
wget -O models/alpaca_7b/tokenizer.model https://huggingface.co/chavinlo/alpaca-native/resolve/main/tokenizer.model
wget -O models/alpaca_7b/ggml-alpaca-7b-q4.bin https://huggingface.co/Sosaka/Alpaca-native-4bit-ggml/resolve/main/ggml-alpaca-7b-q4.bin
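The model file in particular is large, so it is worth confirming both downloads exist and are non-empty before converting. A small sketch (the paths mirror the wget commands above; the stand-in file in the demo is only there so the snippet runs on its own):

```shell
# Report whether a file exists and is non-empty.
check_file() {
  if [ -s "$1" ]; then echo "OK: $1"; else echo "missing or empty: $1"; fi
}
# e.g.:
#   check_file models/alpaca_7b/tokenizer.model
#   check_file models/alpaca_7b/ggml-alpaca-7b-q4.bin
# demo with a stand-in file:
printf 'x' > /tmp/demo.model
check_file /tmp/demo.model
check_file /tmp/no-such-file
```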

Converting to the new ggml format

python convert-unversioned-ggml-to-ggml.py models/alpaca_7b models/alpaca_7b/tokenizer.model
python migrate-ggml-2023-03-30-pr613.py models/alpaca_7b/ggml-alpaca-7b-q4.bin models/alpaca_7b/ggml-alpaca-7b-q4.bin.1

# Remove the pre-conversion files
rm models/alpaca_7b/ggml-alpaca-7b-q4.bin.orig
rm models/alpaca_7b/ggml-alpaca-7b-q4.bin
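If you want to confirm the migration worked before deleting anything, you can inspect the file's magic bytes. A sketch follows; the expectation that the mmap-able format from the 2023-03-30 change starts with the magic "ggjt" is my assumption based on the llama.cpp file formats of that period, and the stand-in file in the demo exists only so the snippet runs anywhere:

```shell
# Print the 4-byte magic at the start of a ggml file.
ggml_magic() {
  head -c 4 "$1"
}
# e.g.:
#   ggml_magic models/alpaca_7b/ggml-alpaca-7b-q4.bin.1
# demo against a stand-in file:
printf 'ggjt....rest of header' > /tmp/fake-ggml.bin
ggml_magic /tmp/fake-ggml.bin
```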

Inference

./main -m models/alpaca_7b/ggml-alpaca-7b-q4.bin.1
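Run as above, the binary starts generating from an empty context. Alpaca is an instruction-tuned model, so it responds best to the Stanford Alpaca prompt template. A minimal sketch of building such a prompt in shell (the template wording comes from the Alpaca project; the example instruction is made up, and which flags your `./main` accepts depends on your checkout):

```shell
# Stanford Alpaca instruction template; the instruction text is just an example.
PROMPT='Below is an instruction that describes a task. Write a response that appropriately completes the request.

### Instruction:
Explain 4-bit quantization in one sentence.

### Response:
'
printf '%s' "$PROMPT"
# Feed it to the model, e.g.:
#   ./main -m models/alpaca_7b/ggml-alpaca-7b-q4.bin.1 -p "$PROMPT" -n 128
# llama.cpp of this era also had an instruct mode (-ins) that wraps
# interactive input in this template automatically.
```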