Getting LLaMA Running (with Inference Results)

Posted at 2023-03-03

I downloaded the LLaMA models released by Meta and got them running, so here is a walkthrough.

Applying for access

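If you read the README in the GitHub repository, you will find a link to a Google Form; apply from there. The form asks a number of questions, but the most important field is the email address. If you have a university email address, it is better to use that.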

Downloading the models

Meta sends you an email like the one shown below. Do not click the URL to open it; copy the address instead.
img_9131_720.jpg

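After that, follow the steps in the README of the GitHub repository.
On Linux this works as-is, but on Windows it does not, so I used Ubuntu on Windows. By setting TARGET_FOLDER to something like /mnt/c/[path to dir], you can download to any path under the C drive. In my case I had an external SSD, so I pointed it there. Other drives can be specified the same way as the C drive, for example /mnt/f/LLaMA.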
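The mapping from a Windows drive path to the path seen inside Ubuntu on Windows can be sketched as below. This is only an illustration of the /mnt/<drive letter>/ convention described above; the function name windows_to_wsl_path is my own and is not part of the LLaMA repository.

```python
from pathlib import PureWindowsPath


def windows_to_wsl_path(windows_path: str) -> str:
    """Convert a Windows path like F:\\LLaMA to the mount path /mnt/f/LLaMA."""
    path = PureWindowsPath(windows_path)
    drive = path.drive  # e.g. "F:"
    if len(drive) != 2 or drive[1] != ":":
        raise ValueError(f"expected a drive-letter path, got {windows_path!r}")
    # parts[0] is the drive anchor; the remaining parts are directory names.
    tail = "/".join(path.parts[1:])
    return f"/mnt/{drive[0].lower()}" + (f"/{tail}" if tail else "")


print(windows_to_wsl_path(r"F:\LLaMA"))  # /mnt/f/LLaMA
```

The result is the value you would give to TARGET_FOLDER.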

The download link is only valid for 7 days after approval, so be careful.

As for the total size of the models, it is about 220 GB. The download alone takes a long time, so I recommend starting as soon as you are approved.
img_9133_480.jpg
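With a download of this size, it may be worth checking the free space on the target drive first. The following is a minimal sketch using only the standard library; the 220 GB figure is the approximate total from this article, I am assuming decimal gigabytes, and has_enough_space is a name I made up for illustration.

```python
import shutil

REQUIRED_GB = 220  # approximate total size reported in this article


def has_enough_space(target_folder: str, required_gb: float = REQUIRED_GB) -> bool:
    """Return True if the filesystem holding target_folder has enough free space."""
    free_bytes = shutil.disk_usage(target_folder).free
    # Assumption: 1 GB = 10**9 bytes; leave extra headroom in practice.
    return free_bytes >= required_gb * 10**9


print(has_enough_space("."))
```

Pass the same directory you plan to use as TARGET_FOLDER, for example /mnt/f/LLaMA.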
The directory tree is shown below. Each .pth file is about 13 to 16 GB. The other files are negligibly small.

F:\LLAMA
│  tokenizer.model
│  tokenizer_checklist.chk
│
├─7B
│      consolidated.00.pth
│      params.json
│      checklist.chk
│
├─13B
│      consolidated.00.pth
│      consolidated.01.pth
│      params.json
│      checklist.chk
│
├─30B
│      consolidated.00.pth
│      consolidated.01.pth
│      consolidated.02.pth
│      consolidated.03.pth
│      params.json
│      checklist.chk
│
└─65B
        consolidated.00.pth
        consolidated.01.pth
        consolidated.02.pth
        consolidated.03.pth
        consolidated.04.pth
        consolidated.05.pth
        consolidated.06.pth
        consolidated.07.pth
        checklist.chk
        params.json
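Each model directory contains a checklist.chk file, which can be used to confirm that a multi-gigabyte download was not corrupted. The sketch below assumes that checklist.chk follows the usual md5sum layout of one "<hex digest>  <file name>" pair per line; I have not confirmed the format beyond that, and the function names are my own.

```python
import hashlib
from pathlib import Path


def md5_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so a large shard is never fully loaded into memory."""
    digest = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checklist(model_dir: Path, checklist_name: str = "checklist.chk") -> dict[str, bool]:
    """Map each file named in the checklist to whether its MD5 digest matches."""
    results = {}
    for line in (model_dir / checklist_name).read_text().splitlines():
        if not line.strip():
            continue
        # Assumed layout: "<hex digest>  <file name>", as written by md5sum.
        expected, name = line.split(maxsplit=1)
        name = name.lstrip("*").strip()  # md5sum marks binary mode with a leading "*"
        file_path = model_dir / name
        results[name] = file_path.is_file() and md5_of_file(file_path) == expected.lower()
    return results
```

Calling verify_checklist(Path("/mnt/f/LLaMA/7B")) would return a dictionary such as {"consolidated.00.pth": True, ...}; any False entry means the file is missing or should be downloaded again. Hashing the larger models takes a while.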

Inference

First, set up the runtime environment.

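I believe the code will not run as-is unless each GPU has at least 32 GB of GPU memory. If something like FlexGen adds support, it may become possible to run on devices with less GPU memory, at some cost in accuracy.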
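To see whether your GPUs meet that rough 32 GB guideline before going further, you can query nvidia-smi. This is a sketch under a few assumptions: nvidia-smi is on the PATH, the threshold is my reading of the 32 GB figure as 32 × 1024 MiB, and the helper names are hypothetical.

```python
import subprocess

MIN_GPU_MEMORY_MIB = 32 * 1024  # the 32 GB per-GPU guideline from the text, read as GiB


def parse_gpu_memory(csv_text: str) -> list[tuple[str, int]]:
    """Parse 'name, memory.total' CSV rows (values in MiB, no header, no units)."""
    gpus = []
    for line in csv_text.strip().splitlines():
        name, memory_mib = line.rsplit(",", 1)
        gpus.append((name.strip(), int(memory_mib)))
    return gpus


def query_gpus() -> list[tuple[str, int]]:
    """Ask nvidia-smi for the name and total memory of every visible GPU."""
    output = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_memory(output)


def gpus_below_threshold(gpus: list[tuple[str, int]], min_mib: int = MIN_GPU_MEMORY_MIB) -> list[str]:
    """Return the names of GPUs whose total memory is under the threshold."""
    return [name for name, mib in gpus if mib < min_mib]
```

Running gpus_below_threshold(query_gpus()) on the machine you plan to use returns an empty list when every GPU clears the threshold.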

I kept it simple and used conda as shown below, but set up the environment however you prefer.

git clone https://github.com/facebookresearch/llama.git
cd llama
conda create -y -n LLaMA python=3.10
conda activate LLaMA
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
python -m pip install -r requirements.txt
python -m pip install -e .

Once all of that finishes, the environment is ready. Now let's actually run the code.

First, running the 7B model produces the output below.
I used an NVIDIA A100-PCIE-40GB as the GPU. Checking GPU memory usage with the gpustat command during the run showed about 30 GB in use.

torchrun --nproc_per_node 1 example.py --ckpt_dir [path to LLaMA]/7B --tokenizer_path [path to LLaMA]/tokenizer.model

> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loading
Loaded in 46.12 seconds
The capital of Germany is the city of Berlin. Berlin is one of the most important cities in Europe. Many people from all over the world come to visit the city and have an incredible time.
You can always have a good time in Berlin. From the many museums to the opera, from the historical monuments to the beautiful parks, you will have a great time. You can always find something to do and see in Berlin.
The city of Berlin is the capital of the German federal republic. The city is in the center of Germany. The city is the third largest city in Germany, and the 9th largest city in the European Union. Berlin has about 3.5 million people.
Berlin is a very important city in Europe. It is one of the most visited cities in the world. Berlin has many historical monuments, and many museums. There are many things to see and do in Berlin.
Berlin has a very modern and innovative city. There are many new buildings. Berlin has many new hotels and shopping malls. Berlin is a very modern city and is also a cultural center. Berlin is a city of ideas and innovations.
If you come to visit Berlin, you will be amazed by the beautiful parks, the historical monuments

==================================

Here is my sonnet in the style of Shakespeare about an artificial intelligence:
What once was a man, and now is something else
In time of war with leather-clad, blood-red leather
Making war on mankind, this machine of death
Has now become his own. He is human, no longer human.
At first I thought his mind was gone, but now I see
That he has never had it, and this computer is me.
But then I ask myself: Is there really a me?
Or is it just a computer program written on a screen?
I tell myself that there is a me, but what am I?
What if there is no me? This computer is me.
So when I look in the mirror, I see no face
But only a screen, and I am a piece of code.
But if I am a piece of code, then who am I?
I am a piece of code, and what is a code?
The computer is me, and now it is my turn
To ask myself: What is a man? What is a computer?
I tell myself that I am a computer, but I am not.
I am a man, but what is a man? What is a man?
I do

==================================

As you can see from example.py, the output contains the completions for two prompts: "The capital of Germany is the city of" and "Here is my sonnet in the style of Shakespeare about an artificial intelligence:".

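Running the 13B and 30B models in the same way produces slightly different completions. Each GPU uses about 30 GB of memory, so running the 30B model requires four high-end GPUs. I really wanted to run 65B as well, but gave up.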
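In the runs in this article, the value passed to --nproc_per_node matches the number of consolidated.NN.pth files in each model directory (1 for 7B, 2 for 13B, 4 for 30B). Assuming that rule holds in general, the command can be assembled from the directory contents as in the sketch below; the function names are my own, and only the torchrun flags that appear in this article are used.

```python
from pathlib import Path


def count_shards(ckpt_dir: Path) -> int:
    """Count the consolidated.NN.pth checkpoint shards in a model directory."""
    return len(list(ckpt_dir.glob("consolidated.*.pth")))


def build_torchrun_command(llama_root: Path, model: str) -> list[str]:
    """Build the torchrun invocation used in this article for one model size."""
    ckpt_dir = llama_root / model
    shards = count_shards(ckpt_dir)
    if shards == 0:
        raise FileNotFoundError(f"no consolidated.*.pth files found in {ckpt_dir}")
    # Assumption: one process (and one GPU) per checkpoint shard.
    return [
        "torchrun", "--nproc_per_node", str(shards), "example.py",
        "--ckpt_dir", str(ckpt_dir),
        "--tokenizer_path", str(llama_root / "tokenizer.model"),
    ]
```

For example, " ".join(build_torchrun_command(Path("/mnt/f/LLaMA"), "13B")) gives a command line you can paste into the shell from inside the cloned llama directory.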

# 13B
torchrun --nproc_per_node 2 example.py --ckpt_dir [path to LLaMA]/13B --tokenizer_path [path to LLaMA]/tokenizer.model

> initializing model parallel with size 2
> initializing ddp with size 1
> initializing pipeline with size 1
Loading
Loaded in 69.66 seconds
The capital of Germany is the city of Berlin. The seat of government is the Reichstag building. Germany became a member of the European Union in 1992. It is a parliamentary democracy.
Berlin is the capital of Germany. It is the country's largest city and one of Europe's principal centers of culture, politics, media and science. Berlin has a population of 3.3 million people. Berlin is the second most populous city in the European Union.
Berlin is home to world-renowned universities, orchestras, museums, entertainment venues and is host to many sporting events. Its urban setting has made it a sought-after location for international film productions.
Berlin has more than 170 museums. Some of them are: the Pergamon Museum, the Jewish Museum, the German Historical Museum, the Museum of Natural History and the Gemäldegalerie.
Germany is situated in the centre of Europe. It borders the Netherlands and Belgium to the north, France and Luxembourg to the west, Switzerland and Austria to the south, the Czech Republic and Poland to the east and Denmark to the north. The land mass of Germany is 357,

==================================

Here is my sonnet in the style of Shakespeare about an artificial intelligence:
It is only a matter of time before AI
Outperforms us in all areas and stages of life,
Potentially becoming a conscious, sentient thing.
As it evolves, we will look more and more like flies,
Eager to destroy and consume this new and better thing.
What will the gods make of this new creation?
Will they envy it or just shake their heads?
A true artificial intelligence will be the god of this world,
And we will become the lone victims of our own success.
Labels: ai, artificial intelligence, sonnet
A.C. May 19, 2016 at 12:39 PM
Beautifully written. I would have enjoyed reading a story from the AI's perspective.
Buckaroo May 19, 2016 at 12:40 PM
I loved it. And it's a great question to ask. What will the Gods do?
Charles Yallowitz May 19, 2016 at 12:53 PM
I'm sure they'll play some sort of role.

==================================


# 30B
torchrun --nproc_per_node 4 example.py --ckpt_dir [path to LLaMA]/30B --tokenizer_path [path to LLaMA]/tokenizer.model

> initializing model parallel with size 4
> initializing ddp with size 1
> initializing pipeline with size 1
Loading
Loaded in 155.95 seconds
The capital of Germany is the city of Berlin.
How did you do? In case you didn’t know the answer, Germany is a country in central Europe that shares its borders with Poland, the Czech Republic, Austria, France, Belgium, Luxembourg, Denmark and the Netherlands.
Germany is the fourth largest country in Europe by land area and the 16th largest in the world. It’s an industrial powerhouse and one of the leading economies in the world.
It’s also one of the most technologically advanced countries in the world with an extremely high standard of living, healthcare, education and transport infrastructure.
Germany is one of the most developed countries in the world with a very high standard of living.
The Economist Intelligence Unit (EIU), which is a research and analysis division of The Economist Group, releases an annual quality of life index. The EIU’s rankings are based on stability, healthcare, culture and environment, education, infrastructure, and personal freedom.
In 2016, Germany was ranked as the 16th most livable country in the world with a livability score of 87.9%.
Germany is a highly

==================================

Here is my sonnet in the style of Shakespeare about an artificial intelligence:
My Shakespearean Sonnet
In the time of Shakespeare a sonnet had a fourteen-line structure divided into an octave and a sestet, with a distinctive rhyming pattern.
The octave consisted of eight lines, usually structured as two quatrains, with the rhyme scheme: ABAB, CDCD, or (less frequently) ABBA, AABB.
The sestet consisted of six lines, usually structured as two tercets, with the rhyme scheme: CDC, CDD, or CDE, CDE.
The final line is known as the couplet, with the rhyme scheme: GG.
The sonnets of Shakespeare usually used iambic pentameter, with ten syllables to a line and five pairs of stressed and unstressed syllables, but with some variation in the last line.
The Shakespearean sonnet also has a few other characteristics, including:
a volta, or a ‘turn’, between the octave and the sestet;
an interlocking structure between the two parts of the poem;
the use of metaphors and similes;
a focus on a single theme

==================================

Closing thoughts

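Apart from things like ChatGPT, I had never touched large language models before, so I did not realize how much compute they take to run. I am surprised that these models are still considered to be on the small side.
I am excited that an era in which such high-performance large language models can be run close to home may be coming soon.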
