
Joining the Llama 3 festival

Posted at 2024-04-21

X is so lively about it that I thought I'd join in a little myself. It must be a groundbreaking system.

Full text of Mark Zuckerberg's comments

All right big day here we are releasing the new version of Meta AI, our assistant that you can ask any question across our apps and glasses. And our goal is to build the world's leading AI and make it available to everyone. Now today we are upgrading Meta AI with llama 3, our new state-of-the-art AI model that we're open sourcing. And I'm going to go deeper on llama 3 in just a minute but the bottom line is that we believe that meta AI is now the most intelligent AI assistant that you can freely use. To make that AI even smarter we've also integrated real time knowledge from Google and Bing right into the answers. We're also making Meta AI much easier to use across our apps. We built it into the search box that's right at the top of WhatsApp, Instagram, Facebook and Messenger. So anytime you have a question, you can just ask it right there, and we built a new website meta dot AI, for using it from the web. We're also releasing a bunch of unique creation features. Meta AI now creates animations and it now creates high quality images so fast that it actually generates an update to the images for you in real time as you are typing. It's pretty wild and you can go check it out now on WhatsApp or the website. We are investing massively to build the leading AI and open sourcing our models responsibly is an important part of our approach. The tech industry has shown over and over that open source leads to better, safer and more secure products, faster innovation, and a healthier market. And beyond improving Meta products, these models have the potential to help unlock progress in fields like science, healthcare and more. So today we're open sourcing the first set of our llama 3 models at 8 billion and 70 billion parameters, they have best in class performance for their scale. And we've also got a lot more releases coming soon that are going to bring multimodality and bigger context windows. 
We're also still training a larger dense model with more than 400 billion parameters. And to give you a sense of llama 3's performance this first release of the 8 billion, is already nearly as powerful as the largest llama 2 model that we released. And this version of the 70 billion model is already around 82 MMLU (Massive Multitask Language Understanding) with leading reasoning and math benchmarks. The 400 billion parameter model is currently around 85 MMLU, but it's still training. So we expect it to be industry leading on a number of benchmarks. We're going to run a blog post with more technical details on all of this, if you want to go deeper. In the meantime, enjoy Meta AI and let me know what you think.

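I'm not sure what the original source is, but the video comes up if you search on X. I transcribed it in full.

Transcribing English is easy. The recognition accuracy is nearly 100%.

The main point, I think, is that this is an open-source release. That said, I don't really understand what exactly has been opened. My guess is that the neural network architecture is what's open, but I'd be glad if someone could explain.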

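Register a Hugging Face account and create an access token.

Gated models on Hugging Face, such as Llama 3, require you to authenticate with an access token and to accept the license on the model page.

If you don't have an account, sign up and then create an access token.

Log in to Hugging Face

When the Python code runs, it downloads the model automatically.

For that to work, you need to be logged in.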

pip install huggingface-hub
huggingface-cli login

    _|    _|  _|    _|    _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|_|_|_|    _|_|      _|_|_|  _|_|_|_|
    _|    _|  _|    _|  _|        _|          _|    _|_|    _|  _|            _|        _|    _|  _|        _|
    _|_|_|_|  _|    _|  _|  _|_|  _|  _|_|    _|    _|  _|  _|  _|  _|_|      _|_|_|    _|_|_|_|  _|        _|_|_|
    _|    _|  _|    _|  _|    _|  _|    _|    _|    _|    _|_|  _|    _|      _|        _|    _|  _|        _|
    _|    _|    _|_|      _|_|_|    _|_|_|  _|_|_|  _|      _|    _|_|_|      _|        _|    _|    _|_|_|  _|_|_|_|

    A token is already saved on your machine. Run `huggingface-cli whoami` to get more information or `huggingface-cli logout` if you want to log out.
    Setting a new token will erase the existing one.
    To login, `huggingface_hub` requires a token generated from https://huggingface.co/settings/tokens .
Enter your token (input will not be visible):

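Paste the access token you created earlier here. It is a little confusing because the cursor does not move, but after you paste it and press Enter, the following messages appear, which confirms the login.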

Add token as git credential? (Y/n) n
Token is valid (permission: read).
Your token has been saved to /home/okuno/.cache/huggingface/token
Login successful
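
For scripts that run without an interactive prompt, the saved token can also be looked up from Python. The following is only a minimal sketch and was not part of my original procedure: `find_hf_token` is a hypothetical helper name, and it assumes the token is either in the `HF_TOKEN` environment variable or in the file reported by the login message above (`~/.cache/huggingface/token`).

```python
import os
from pathlib import Path
from typing import Optional


def find_hf_token(token_file: Optional[Path] = None) -> Optional[str]:
    """Return a Hugging Face token from HF_TOKEN or the saved token file, else None."""
    # Assumption: the HF_TOKEN environment variable takes priority when it is set.
    env_token = os.environ.get("HF_TOKEN", "").strip()
    if env_token:
        return env_token
    # Assumption: the default location is the one printed by `huggingface-cli login`.
    path = token_file or Path.home() / ".cache" / "huggingface" / "token"
    if path.is_file():
        token = path.read_text(encoding="utf-8").strip()
        return token or None
    return None
```

If this returns `None`, running `huggingface-cli login` again should fix it. Recent versions of transformers also accept the value as a `token=` argument to `transformers.pipeline`, though the sample program below relies on the saved login instead.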

Run the sample program.

Run the following code.

import transformers
import torch

model_id = "meta-llama/Meta-Llama-3-8B"

# Initialize the pipeline (authentication uses the token saved by `huggingface-cli login`)
pipeline = transformers.pipeline(
    "text-generation",
    model=model_id,
    model_kwargs={"torch_dtype": torch.bfloat16},
    device_map="auto",
)

# Generate text with the model
result = pipeline("Hey how are you doing today?", max_new_tokens=50)
print(result)

Execution

The model files start downloading.

12:45 start
13:00 download finished

It took about 15 minutes before execution started.

Result:

WARNING:root:Some parameters are on the meta device device because they were offloaded to the cpu.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Setting pad_token_id to eos_token_id:128001 for open-end generation.
[{'generated_text': 'Hey how are you doing today? I am doing well. I am a little bit tired because I have been working a lot. I am a little bit tired because I have been working a lot. I am a little bit tired because I have been working a lot. I am a'}]

There are warnings, but it does produce an answer.

"I am a little bit tired because I have been working a lot." is repeated over and over, which is unsettling.

It's a bit odd, but I suppose some parameter tuning is needed.
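
If the parameters in question are the decoding settings, one thing to try is sampling with a repetition penalty. This is a minimal sketch that I have not verified against this model: `build_generation_kwargs` is a hypothetical helper, and every numeric value is an illustrative assumption rather than a tuned setting. The keyword names themselves are standard text-generation arguments in transformers.

```python
from typing import Any, Dict


def build_generation_kwargs(max_new_tokens: int = 50) -> Dict[str, Any]:
    """Collect decoding settings intended to discourage repeated sentences."""
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,           # sample instead of always taking the top token
        "temperature": 0.7,          # assumption: illustrative value, not tuned
        "top_p": 0.9,                # nucleus sampling cutoff (illustrative)
        "repetition_penalty": 1.2,   # penalize tokens that were already generated
        "no_repeat_ngram_size": 4,   # forbid repeating any 4-token sequence
    }


gen_kwargs = build_generation_kwargs()
# Assumed usage with the pipeline object created in the sample program:
# result = pipeline("Hey how are you doing today?", **gen_kwargs)
```

Note also that `meta-llama/Meta-Llama-3-8B` is the base model. An instruction-tuned variant, `meta-llama/Meta-Llama-3-8B-Instruct`, is published as well and may behave more like a chat assistant for a prompt like this one.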

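However, it doesn't run fast enough in my environment. It runs at 1.30 s/it, and since GPU memory alone is not enough, it seems to be spilling over into system DRAM automatically. Generating 50 tokens takes roughly one minute.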
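
These figures are consistent with a rough back-of-the-envelope estimate. The sketch below assumes that one iteration produces one token, takes the nominal 8 billion parameters, and counts only the weights in bfloat16 (as in the sample program), ignoring activations and the KV cache. The function names are illustrative.

```python
def generation_seconds(seconds_per_iteration: float, new_tokens: int) -> float:
    """Estimate wall-clock time, assuming one iteration produces one token."""
    return seconds_per_iteration * new_tokens


def weight_gib(num_parameters: float, bytes_per_parameter: int = 2) -> float:
    """Estimate memory for the weights alone; bfloat16 uses 2 bytes per parameter."""
    return num_parameters * bytes_per_parameter / 1024**3


print(f"{generation_seconds(1.30, 50):.0f} s for 50 tokens")             # 65 s
print(f"{weight_gib(8e9):.1f} GiB of weights for 8 billion parameters")  # 14.9 GiB
```

Roughly 15 GiB for the weights alone is more than many consumer GPUs offer, which would explain both the offloading warning in the log and the slow speed.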

This was just a first try, so I'll leave it here.
