
Trying Out OLMo-7B on Databricks

Posted at 2024-02-04

Introduction

I got curious about OLMo after reading a blog post introducing it, so I tried it out.

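What is OLMo?

OLMo is an LLM released by the Allen Institute for AI (AI2).
Its training dataset, code, and models are all published as open source, and it has been announced as being truly open.

To be honest, I have not caught up on the details yet, but I wanted to see what it is like, so I played with the 7B model a little.

I ran the test on Databricks on AWS.
The cluster used Databricks Runtime (DBR) 14.2 ML on an i3.2xlarge CPU instance.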

Step 1. Package installation

Install the required packages.

# Install the packages
%pip install -U transformers accelerate ai2-olmo

dbutils.library.restartPython()
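
After the Python restart, it can be useful to confirm which versions were actually picked up. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical and not part of the original notebook.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return {package name: version string, or None if it is not installed}."""
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


# The three packages installed in the %pip cell above.
for name, ver in installed_versions(["transformers", "accelerate", "ai2-olmo"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Recording these versions alongside the results makes it easier to reproduce the run later, since `-U` always pulls the latest releases.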

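Step 2. Loading the model and running inference

Load the model and run inference, following the official sample.
I used a copy of the model that I had downloaded beforehand.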

import hf_olmo  # from the ai2-olmo package; registers OLMo with the transformers Auto classes
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer


# Local path to the pre-downloaded model
model_path = "/Volumes/training/llm/model_snapshots/models--allenai--OLMo-7B"

olmo = AutoModelForCausalLM.from_pretrained(model_path)
tokenizer = AutoTokenizer.from_pretrained(model_path)

message = ["Language modeling is "]
inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
# Sample up to 100 new tokens with top-k / top-p (nucleus) sampling
response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
print(tokenizer.batch_decode(response, skip_special_tokens=True)[0])

Output

Language modeling is 
an 
active area of research in the machine learning and NLP communities.  
Recent breakthroughs in language 
modeling include 
large-scale pre-training models 
for unsupervised or 
self-supervised language 
pre-training and the GPT-3 model. 
This project will investigate 
how the GPT-3 
model can be leveraged to 
improve natural language 
generation, dialogue and 
task-
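
The article does not show how the model was downloaded in advance. One possible way, sketched here as an assumption rather than the author's actual procedure, is `snapshot_download` from the `huggingface_hub` library. The repo id `allenai/OLMo-7B` and the cache directory are inferred from the `model_path` above, and the helper names `cache_folder_name` and `download_snapshot` are hypothetical.

```python
def cache_folder_name(repo_id: str) -> str:
    """Folder name the Hugging Face cache uses for a model repo."""
    return "models--" + repo_id.replace("/", "--")


def download_snapshot(repo_id: str, cache_dir: str) -> str:
    """Download every file of the repo and return the local snapshot path."""
    # Imported here so the helper above works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(repo_id=repo_id, cache_dir=cache_dir)


print(cache_folder_name("allenai/OLMo-7B"))  # models--allenai--OLMo-7B

# Example call (downloads many gigabytes, so it is left commented out).
# The cache_dir is an assumption inferred from model_path in the article.
# local_path = download_snapshot("allenai/OLMo-7B", "/Volumes/training/llm/model_snapshots")
```

The path returned by `snapshot_download` points at the downloaded snapshot directory and can be passed to `from_pretrained` directly.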

Step 3. A bit more inference

Wrap the text generation part in a function and run it a few more times.

def generate_text(prompt: str) -> str:
    """Generate a sampled continuation of the prompt with the OLMo model loaded above."""
    message = [prompt]
    inputs = tokenizer(message, return_tensors='pt', return_token_type_ids=False)
    response = olmo.generate(**inputs, max_new_tokens=100, do_sample=True, top_k=50, top_p=0.95)
    return tokenizer.batch_decode(response, skip_special_tokens=True)[0]

print(generate_text("Databricks is "))

Output

Databricks is 
in addition available for an additional fee of 30% per year.
The 30% is paid as part of the annual subscription fee.

### License

The `Databricks` theme is released under the MIT license, as found in the 
[LICENSE](LICENSE.md) file.


## Credits

This theme was created by [Michael Rose](https://mademistakes.com), based off the 
[minty](https://m

Hmm, I wonder what this is supposed to be describing...
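
Note that `generate_text` samples with `do_sample=True`, `top_k=50`, and `top_p=0.95`, so every run produces a different continuation. As a rough illustration of what those two parameters do, here is a simplified pure-Python sketch. It is not the transformers implementation; the function name `sample_top_k_top_p` and the toy logits are made up for illustration.

```python
import math
import random


def sample_top_k_top_p(logits, top_k=50, top_p=0.95, rng=random):
    """Sample one token index after top-k and then top-p (nucleus) filtering."""
    # Softmax over the logits (shifted by the max for numerical stability).
    peak = max(logits)
    weights = [math.exp(x - peak) for x in logits]
    total = sum(weights)
    probs = [w / total for w in weights]

    # Top-k: keep only the k most probable tokens, most probable first.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    mass = sum(probs[i] for i in ranked)  # used to renormalize after top-k

    # Top-p: keep the shortest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i] / mass
        if cumulative >= top_p:
            break

    # Draw from the surviving tokens in proportion to their probability.
    return rng.choices(kept, weights=[probs[i] for i in kept], k=1)[0]


toy_logits = [2.0, 1.0, 0.5, -1.0, -3.0]  # made-up scores for a 5-token vocabulary
rng = random.Random(0)  # fixed seed so the toy run is repeatable
print([sample_top_k_top_p(toy_logits, top_k=3, top_p=0.9, rng=rng) for _ in range(10)])
```

Lower values of `top_k` or `top_p` narrow the set of candidate tokens, which tends to make the output more conservative; setting `do_sample=False` in `generate` switches to deterministic decoding instead.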

Next, generation in Japanese. The prompt below means "Databricks is".

print(generate_text("データブリックスとは、"))

Output

データブリックスとは、 まずノードによってポッドが リストされたノードからボリュームのデータがポッドから読み取られます 。 ノード1上で 、ノード2上でポッドのプロセスが 起動され ます。 プロセスの `ls -al` コマンド を実行し `/

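It did generate Japanese text, but the content (something about nodes, pods, and volumes) has nothing to do with Databricks.

Finally, a bonus as a nod to npaka.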

print(generate_text("The cutest girl in Madoka Magica is "))

Output

The cutest girl in Madoka Magica is アナウンサ (Anzu). She is a first year student at the Kamimami Gakuen in Aichi and is the daughter of a former police detective. Her name means “everlasting” in the Japanese language. She was known as the “most normal” character in the series. In the anime, Anzu’s appearance was that of a cute young girl with green eyes and a bright smile. Her hair was short and curly, and her eyes were blue. Her

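No comment.

Summary

I tried out OLMo, a truly open LLM.
Setting the inference results aside, it is very interesting that the training process and datasets are being made public.
I look forward to seeing whether this kind of openness continues to spread.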
