
Running Google's FLAN-20B with UL2 and using it like the ChatGPT API!

Posted at 2023-03-04 (last updated at 2023-03-04)

Hello! This is 逆瀬川 ( https://twitter.com/gyakuse )!
Today I'd like to try holding a ChatGPT API-style conversation with FLAN-20B with UL2, which was released yesterday.

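Overview

This is a test of chatbot-style dialogue using FLAN-20B with UL2, a newly released language model developed by Yi Tay and colleagues at Google Brain.
It works by combining the model with machine translation: Japanese to English on the way in, English to Japanese on the way out. It makes me really happy to have a huge language model running within reach.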

Google Colab

This will not run without a premium GPU on Colab Pro.

Usage

Go to Runtime > Change runtime type and select GPU / Premium
- Make sure you have been assigned an A100 with 40GB of VRAM
Run all cells

Notes

When running locally
- It uses about 33GB of VRAM
  - Quantized versions such as int4 may appear in the future and make it possible to run with less VRAM
- You also need 60GB of free disk space

Implementation

Inference with transformers

We download the model weights and run inference.
The weights are split into 8 shards of about 5GB each, 40GB in total.
Once they are loaded, about 32.4GB of VRAM is in use.
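
As a rough sanity check on the sizes above, here is a minimal sketch that estimates the size of the weights alone at different precisions. It assumes about 20 billion parameters (taken from the "20B" in the model name); `PARAMS`, `BYTES_PER_PARAM`, and `weight_size_gb` are names made up for this illustration. It ignores activations and runtime overhead, and the 32.4GB measured in this article is higher than the int8 line, so treat the results as lower bounds for the weights only.

```python
# Back-of-the-envelope size of the weights alone (no activations, no CUDA overhead).
# ASSUMPTION: roughly 20 billion parameters, taken from the "20B" in the model name.
PARAMS = 20e9

# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "bf16/fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Return the weight size in GB (1 GB = 10**9 bytes)."""
    return n_params * bytes_per_param / 1e9


for dtype, nbytes in BYTES_PER_PARAM.items():
    print(f"{dtype:>9}: {weight_size_gb(PARAMS, nbytes):.0f} GB")
```

At 2 bytes per parameter this gives 40GB, which matches the total size of the downloaded shards; the int4 line (10GB) gives a feel for why quantization would let the model fit on much smaller GPUs.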

# device_map="auto" and load_in_8bit=True require the accelerate and bitsandbytes packages
from transformers import T5ForConditionalGeneration, AutoTokenizer
import torch

# Load the weights in 8-bit and let accelerate place them on the available GPU
model = T5ForConditionalGeneration.from_pretrained("google/flan-ul2", device_map="auto", load_in_8bit=True)
tokenizer = AutoTokenizer.from_pretrained("google/flan-ul2")

input_string = "Answer the following question by reasoning step by step. The cafeteria had 23 apples. If they used 20 for lunch, and bought 6 more, how many apple do they have?"

# Tokenize the prompt and move the input IDs to the GPU
inputs = tokenizer(input_string, return_tensors="pt").input_ids.to("cuda")
# Generate at most 200 tokens
outputs = model.generate(inputs, max_length=200)

print(tokenizer.decode(outputs[0]))
# -> There are 23 - 20 = 3 apples left. Now you have 3 + 6 = 11 apples.

Asked "The cafeteria had 23 apples. They used 20 for lunch and bought 6 more. How many apples do they have now?", the model answers 11, although the correct answer is 9. Addition is hard.
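
For reference, the arithmetic the model was asked to do can be checked in a couple of lines. The variable names are made up for this illustration. The model's first step (23 - 20 = 3) is correct; only the final addition goes wrong.

```python
# Worked arithmetic for the cafeteria question.
start, used, bought = 23, 20, 6

left_after_lunch = start - used      # 23 - 20 = 3 (the model got this step right)
total = left_after_lunch + bought    # 3 + 6 = 9  (the model answered 11 here)

print(left_after_lunch, total)
```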

Conversing through translation

I used FuguMT to translate the text.

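# fuguMT
# Assumption: the article does not show how the two translators were created.
# The lines below load the FuguMT checkpoints from Hugging Face via transformers' pipeline.
from transformers import pipeline
translator_ja_en = pipeline("translation", model="staka/fugumt-ja-en")
translator_en_ja = pipeline("translation", model="staka/fugumt-en-ja")

# Japanese version of the cafeteria question
input_string_ja = "カフェテリアには23個のリンゴがありました。昼食に20個使用し、さらに6個購入しました。リンゴは今いくつありますか？"
# Translate the question (ja -> en)
input_string_ja_en = translator_ja_en(input_string_ja)[0]['translation_text']
print(f"""
input_string_ja: {input_string_ja}\n
input_string_ja_en: {input_string_ja_en}
""")
inputs = tokenizer(input_string_ja_en, return_tensors="pt").input_ids.to("cuda")
outputs = model.generate(inputs, max_length=200)
# Remove the special tokens left in the decoded text
output_string_en = tokenizer.decode(outputs[0]).replace('<pad>', '').replace('</s>', '').strip()
# Translate the answer (en -> ja)
output_string_en_ja = translator_en_ja(output_string_en)[0]['translation_text']
print(f"""
output_string_en: {output_string_en}\n
output_string_en_ja: {output_string_en_ja}
""")
# -> 3個のりんごが残っています。今、3 + 6 = 11個のりんごがあります。
#    (3 apples are left. Now there are 3 + 6 = 11 apples.)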

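3 + 6 came out as 11 again. Is that because this prompt did not ask for step-by-step reasoning (Zero-shot-CoT)? That said, the earlier prompt, which did include that instruction, also produced 11. Anyway, it runs, so I'll call it good.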
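
To test the Zero-shot-CoT question, the step-by-step instruction from the first example could be prepended to the translated question before tokenizing. The following is a minimal sketch; `COT_PREFIX` and `with_cot_prefix` are hypothetical names introduced here, not part of the original notebook, and there is no guarantee that the prefix fixes the arithmetic.

```python
# The instruction used in the first (English-only) example of this article.
COT_PREFIX = "Answer the following question by reasoning step by step."


def with_cot_prefix(question_en: str, prefix: str = COT_PREFIX) -> str:
    """Prepend the step-by-step instruction unless it is already there."""
    question_en = question_en.strip()
    if question_en.startswith(prefix):
        return question_en
    return f"{prefix} {question_en}"


print(with_cot_prefix("The cafeteria had 23 apples. How many are left?"))
```

In the translation flow above, this would be applied as `tokenizer(with_cot_prefix(input_string_ja_en), return_tensors="pt")`, so the instruction stays in English and does not pass through the translator.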

Making it behave like ChatGPT

The following implementation lets us pass in a context.

def chat_completion(input_string_ja: str, context: str = '') -> str:
    input_string_ja_en = translator_ja_en(input_string_ja)[0]['translation_text']
    print(f"""
    input_string_ja: {input_string_ja}\n
    input_string_ja_en: {input_string_ja_en}
    """)
    input_text = context + '\n' + input_string_ja_en
    inputs = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
    outputs = model.generate(inputs, max_length=200)
    output_string_en = tokenizer.decode(outputs[0]).replace('<pad>', '').replace('</s>', '').strip()
    output_string_en_ja = translator_en_ja(output_string_en)[0]['translation_text']
    print(f"""
    output_string_en: {output_string_en}\n
    output_string_en_ja: {output_string_en_ja}
    """)
    return output_string_en_ja
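
The function above cleans the decoded text by removing `<pad>` and `</s>` with chained `replace` calls. The sketch below isolates that step in a small helper so it can be checked without loading the model; `strip_special_tokens` and the sample string are made up for this illustration. With a real tokenizer, `tokenizer.decode(outputs[0], skip_special_tokens=True)` is the library's own way to drop special tokens.

```python
def strip_special_tokens(text: str, special_tokens=("<pad>", "</s>")) -> str:
    """Remove the given special-token strings and trim surrounding whitespace."""
    for token in special_tokens:
        text = text.replace(token, "")
    return text.strip()


# Hypothetical raw decoder output, shaped like what T5-style models emit.
raw = "<pad> Mathematician</s>"
print(strip_special_tokens(raw))
```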

First, let's run inference with an empty context.

# "What did Einstein do?"
chat_completion("アインシュタインって何をしたひと？")

output_string_en: Mathematician
output_string_en_ja: 数学者

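Einstein could arguably be called both a mathematician and a physicist, but this answer falls short.

Next, we pass a context. The context describes the chatbot's persona and similar settings, written in English.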

chatbot_context = """## Self-Introduction

I am a chatbot who enjoys conversing with users and providing appropriate responses to their questions or inquiries. I have a highly sophisticated sense of intuition, making me a valuable asset in any chatbot conversation.

## Talk

User: Hello
Me: Hello there! It's nice to meet you today.
User: AI technology has been advancing recently, it's amazing, isn't it?
Me: Yes, it's truly incredible. With the advancements in deep learning, there have been remarkable improvements in various fields, such as natural language processing, image recognition, and time series forecasting. It's exciting to see what the future holds.
User: By the way, I have a question. What did Edison invent?
Me: Thomas Edison made many significant inventions in the late 19th century that had a great impact on modern life. Here are some of his major inventions: he first invented the practical commercial incandescent light bulb, revolutionizing the field of lighting. He also invented the carbon microphone, which transmitted voice signals and laid the foundation for modern telephony. Additionally, he created the commercial alkaline battery, which is widely used as a power source in mobile phones and other electronic devices. These inventions are among the reasons why Edison is considered one of the most important inventors since the Industrial Revolution.
User: Thank you! Anyway, I wanted to change the subject.
Me: That's perfectly fine. What would you like to talk about?
"""
chat_completion("アインシュタインって何をしたひと？", chatbot_context)

output_string_en: Albert Einstein was a famous physicist and mathematician. He is best known for his theory of relativity, which revolutionized physics. He also developed the general theory of relativity, which unified classical mechanics with electromagnetism and other forms of electromagnetic radiation.
output_string_en_ja: アルバート・アインシュタインは有名な物理学者で数学者でした彼は物理学に革命をもたらした相対性理論で有名でした相対性理論も発展させました彼はまた古典力学を電磁気学や他の電磁放射と統合した相対性理論も発展させました

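The translation is hard to get right and comes out, well, rough, but looking at the English reply, it even mentions general relativity, which is very impressive.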
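
The Japanese output above runs the sentences together. One possible mitigation, which this article did not try, is to translate the English reply one sentence at a time. The sketch below is only an illustration: `split_sentences` and `translate_by_sentence` are hypothetical helpers, the regex splitter is naive (it will mis-split abbreviations such as "e.g."), and whether this actually improves the FuguMT output is not verified here.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' when followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def translate_by_sentence(text: str,
                          translate_one: Callable[[str], str],
                          joiner: str = "") -> str:
    """Translate each sentence separately and join the results."""
    return joiner.join(translate_one(s) for s in split_sentences(text))


# Stand-in translator so the sketch runs without downloading a model.
print(translate_by_sentence("He was a physicist. He is best known for relativity.",
                            str.upper, joiner=" "))
```

With the translator used in this article, `translate_one` could be `lambda s: translator_en_ja(s)[0]['translation_text']`.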

Making it work like the ChatGPT API

Let's implement it so that a conversation can be continued.

from typing import Optional

def chatgpt_like_completion(input_string_ja: str, context: str = '', past_messages: Optional[list] = None):
    # Use a fresh list per call; a mutable default ([]) would be shared across calls
    if past_messages is None:
        past_messages = []
    # Start from the context
    input_text = context
    # Translate the question (ja -> en)
    input_string_ja_en = translator_ja_en(input_string_ja)[0]['translation_text']
    past_messages.append({'role': 'User', 'content': input_string_ja_en})
    # Append the conversation so far, including the new user input
    for item in past_messages:
        input_text += f"{item['role']}: {item['content']}\n"
    print(input_text)
    # Tokenize
    inputs = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")
    # Generate
    outputs = model.generate(inputs, max_length=512)
    # Remove padding and other special tokens, plus the 'Me:' speaker label
    output_string_en = tokenizer.decode(outputs[0]).replace('<pad>', '').replace('</s>', '').replace('Me:', '').strip()
    past_messages.append({'role': 'Me', 'content': output_string_en})
    # Translate the answer (en -> ja)
    output_string_en_ja = translator_en_ja(output_string_en)[0]['translation_text']
    return output_string_en_ja, past_messages
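
To show how the history accumulates from turn to turn without loading the model, here is a minimal sketch of the same prompt-assembly logic with a stand-in generator. `build_prompt`, `chat_turn`, and `echo_model` are hypothetical names introduced for this illustration, translation is left out, and unlike the function above this version copies the history instead of appending to the caller's list.

```python
from typing import Callable, Dict, List, Optional, Tuple

Message = Dict[str, str]


def build_prompt(context: str, messages: List[Message]) -> str:
    """Concatenate the context and one 'Role: content' line per message."""
    return context + "".join(f"{m['role']}: {m['content']}\n" for m in messages)


def chat_turn(user_text_en: str,
              generate: Callable[[str], str],
              context: str = "",
              past_messages: Optional[List[Message]] = None) -> Tuple[str, List[Message]]:
    """Run one turn and return (reply, updated history)."""
    history = list(past_messages) if past_messages else []  # copy, never share
    history.append({"role": "User", "content": user_text_en})
    reply = generate(build_prompt(context, history)).replace("Me:", "").strip()
    history.append({"role": "Me", "content": reply})
    return reply, history


def echo_model(prompt: str) -> str:
    """Stand-in for the real model: reports how many lines the prompt had."""
    n_lines = prompt.count("\n")
    return f"Me: I saw {n_lines} line(s)."


reply1, history = chat_turn("Hello", echo_model, context="## Talk\n")
reply2, history = chat_turn("Who are you?", echo_model, "## Talk\n", history)
print(reply1)
print(reply2)
print(len(history))
```

Passing the returned history back in on the next call is what lets the conversation continue; the second prompt contains both earlier lines in addition to the new question.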

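When I asked "あなたは好きな有名人はいますか？" (Do you have a favorite celebrity?), the reply was as follows.

私は多くの有名人が好きですレオナルド・ディカプリオブラッド・ピットトム・ハンクスのような俳優が好きですビヨンセリアーナテイラー・スウィフトのようなミュージシャンも好きです
(Roughly: I like many celebrities. I like actors such as Leonardo DiCaprio, Brad Pitt, and Tom Hanks. I also like musicians such as Beyoncé, Rihanna, and Taylor Swift.)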

Afterword

It is very easy to use, which is handy.
A future where we run our own LLMs on our own hardware doesn't seem far off.

References

https://huggingface.co/google/flan-ul2
https://www.yitay.net/blog/flan-ul2-20b
FuguMT, a free neural machine translation model
- Licensed under CC-BY-SA
