
Trying out Dolly 2.0 (dolly-v2-12b), a free open-source large language model that allows commercial use

Last updated at 2023-04-18, posted at 2023-04-13

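Overview

I tried out the large language model Dolly 2.0.

The official blog covers this in detail: recently popular models such as Alpaca, Koala, GPT4All, and Vicuna all had hurdles to commercial use, whereas Dolly 2.0 was trained on a dataset of 15,000 records that the company prepared in-house, so that hurdle appears to be gone.

That is great news, so I tried it right away.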

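2023/04/18 Code update
The custom pipeline of the Dolly 2.0 model has changed since this article was first published (2023/04/13), so I updated the source code in this article to match.

Relevant code (after the change)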

res = generate_text(text)
generated_text = res[0]["generated_text"]

The type of res changed from str to list.
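If the same script has to work against both the older and the newer pipeline revision, a small helper can normalize the two return shapes. This is only a sketch: `extract_generated_text` is a hypothetical name that is not part of the Dolly pipeline, and it assumes nothing beyond the two shapes described above (a plain str, or a list of dicts with a "generated_text" key).

```python
def extract_generated_text(res):
    """Return the generated text from either pipeline return shape.

    Hypothetical helper, not part of the Dolly 2.0 pipeline itself.
    """
    # Older revision: the pipeline returned the text directly as a str.
    if isinstance(res, str):
        return res
    # Newer revision: a list of dicts, each with a "generated_text" key.
    if isinstance(res, list) and res and isinstance(res[0], dict):
        return res[0]["generated_text"]
    raise TypeError(f"unexpected pipeline result type: {type(res).__name__}")
```

With this in place, `generated_text = extract_generated_text(generate_text(text))` would cover both cases.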

The finished chat

I also added a Web UI, as shown below, so that Dolly 2.0 can be tried from a web browser.

Test environment

OS: Ubuntu 22.04 on AWS
- g4dn.12xlarge instance
GPU: Tesla T4 x 4
Python: Python 3.10/Anaconda3
An article that tries this on Windows is here.

STEP 1: Create an Anaconda virtual environment

To try dolly-v2, create an Anaconda virtual environment named env-dolly-v2 and install python 3.10.10 in it.

Update conda to the latest version

conda update -n base -c defaults conda --yes

Create the virtual environment env-dolly-v2 and install python

conda create --yes -n env-dolly-v2
conda activate env-dolly-v2
conda install python=3.10.10 --yes

STEP 2: Install the required packages

Install the required packages as follows.

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu117
pip install accelerate
pip install transformers
pip install fastapi uvicorn
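Before downloading the model, it can be worth confirming that everything installed into the active environment. The following is a minimal sketch using only the standard library; `missing_packages` is a hypothetical helper name, and the list simply mirrors the pip commands above.

```python
from importlib.util import find_spec

# Import names of the packages installed by the pip commands above.
REQUIRED = ["torch", "torchvision", "torchaudio",
            "accelerate", "transformers", "fastapi", "uvicorn"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if find_spec(name) is None]


missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are installed.")
```

This only checks that the packages can be located; it does not verify versions or that CUDA is usable.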

STEP 3: Write the source code

If you just want to try it quickly, clone my source code below

git clone https://github.com/riversun/chatux-server-dolly.git

If you want to write it yourself, proceed as follows

Write the chat server

main.py

import torch
from transformers import pipeline
from fastapi import FastAPI
from fastapi.staticfiles import StaticFiles
import uvicorn
import os

# specify chat server
HOST = 'localhost'
PORT = 8001
URL = f'http://{HOST}:{PORT}'

model_name = "databricks/dolly-v2-12b"

current_path = os.path.dirname(os.path.abspath(__file__))
generate_text = pipeline(model=model_name,
                         torch_dtype=torch.bfloat16,
                         trust_remote_code=True,
                         device_map="auto")

app = FastAPI()


@app.get("/chat_api")
async def chat(text: str = ""):
    res = generate_text(text)
    generated_text = res[0]["generated_text"]
    reply = generated_text.replace('\n', '<br>')
    print(f'input:{text} reply:{reply}')

    outJson = {
        "output": [
            {
                "type": "text",
                "value": reply
            }
        ]
    }
    return outJson


app.mount("/", StaticFiles(directory="html", html=True), name="html")


def start_server():
    uvicorn.run(app, host=HOST, port=PORT)


def main():
    start_server()



if __name__ == "__main__":
    main()
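The handler above puts the model output into the response after replacing newlines with `<br>`, which suggests the chat UI renders the value as HTML. If that is the case, escaping the generated text first keeps stray `<` or `&` characters in the output (for example in generated code) from being interpreted as markup. The following is a sketch under that assumption; `build_chat_response` is a hypothetical helper and is not part of the original main.py.

```python
import html


def build_chat_response(generated_text: str) -> dict:
    """Build the response payload returned by /chat_api.

    Hypothetical helper: escapes HTML special characters, then turns
    newlines into <br> the same way the handler above does.
    """
    safe_text = html.escape(generated_text)
    reply = safe_text.replace("\n", "<br>")
    return {
        "output": [
            {
                "type": "text",
                "value": reply,
            }
        ]
    }
```

Inside the handler, `return build_chat_response(generated_text)` would replace the inline dictionary.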

Write the chat client

Create a directory named html alongside main.py and place an index.html like the following in it

index.html

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
    <title>Chat </title>
</head>
<body>
<script src="https://riversun.github.io/chatux/chatux.min.js"></script>
<script>
    const chatux = new ChatUx();

    // initializing param for chatux
    const initParam =
        {
            renderMode: 'auto',
            api: {
                //echo chat server
                endpoint:'/chat_api',
                method: 'GET',
                dataType: 'json'
            },
            bot: {
                botPhoto: 'https://riversun.github.io/chatbot/bot_icon_operator.png',
                humanPhoto: null,
                widget: {
                    sendLabel: 'SEND',
                    placeHolder: 'Say something'
                }
            },
            window: {
                title: 'dolly-v2',
                infoUrl: 'https://github.com/riversun/chatux'
            }
        };
    chatux.init(initParam);
    chatux.start(true);


</script>
</body>
</html>

STEP 4: Start the chat server

python main.py

STEP 5: Talk to Dolly 2.0

Once the server is up, open http://localhost:8001 in a browser. A chat window opens and you can try Dolly 2.0!
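The same endpoint can also be called without a browser. Here is a minimal sketch of a Python client for the /chat_api endpoint defined in main.py, using only the standard library. The function names are hypothetical, and the code assumes the server is running on localhost:8001 and returns the JSON shape shown in main.py.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8001"  # HOST and PORT from main.py


def build_chat_url(text: str, base_url: str = BASE_URL) -> str:
    """Build the GET URL for /chat_api with the text URL-encoded."""
    return f"{base_url}/chat_api?{urlencode({'text': text})}"


def extract_reply(payload: dict) -> str:
    """Pull the reply out of the response and restore the line breaks."""
    return payload["output"][0]["value"].replace("<br>", "\n")


def ask(text: str) -> str:
    """Send one message to the chat server and return the reply."""
    with urlopen(build_chat_url(text)) as resp:
        return extract_reply(json.load(resp))
```

While the server is running, `print(ask("Do you know the highest mountain in the world?"))` sends one question and prints the answer.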

Here are some of the exchanges I had.

I asked whatever came to mind, without paying particular attention to the fact that this is an instruction-tuned LLM.

Questions

Me: What is the difference between classic and classical

Dolly: Classic and classical are often used interchangeably to describe literary works that are often referred to as being classic. The essential difference between the two terms is that classical refers to the works beingfit into a classical style, which is based on Ancient Greek and Roman mythology and imagery. This does not necessarily mean the work has to be set in those times, but the imagery and concepts should match that style. For example, a classic work of literature is Jane Eyre, while a classical work is The Odyssey.

Me: Do you know the highest mountain in the world?

Dolly: Yes, the highest mountain in the world is Mount Everest, but it's not even the highest mountain in Nepal. That honor goes to Mount K2, which is also located in the Himalaya mountain range.

Me: Who was the man who starred in the movie "Titanic" released in 1997?

Dolly: Leonardo DiCaprio

Me: Who was the director of that movie?

Dolly: The director of Fight Club is David Fincher. He is a celebrated director of films such as The Game, Seven, Zodiac, and Fight Club.

Generation

Me: Please create a recipe using carrots and potatoes

Dolly: carrots, potatoes and other ingredients of your choice (you may use bacon for example)

Boil potatoes and carrots for 10 mins
Dice bacon and cook until crispy
Add other ingredients to the pan and saute until the vegetables are cooked through

Season with salt and pepper to taste
Serve in a bowl

Me: Please write a program in Java language to show "Hello world" 3 times.

Dolly: Here is one way to do it.

System.out.println("Hello World");

System.out.println("Hello World");

System.out.println("Hello World");

Me: Thanks!

Dolly: Sure, no problem. I'll send you an email with the details.

Me: lol

Dolly: yes

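[Video] A recording of the session.
(It shows the actual response speed when running on g4dn.12xlarge.)

[Video] This one was recorded on a local PC with an RTX 3090. You can see the difference in response speed.

As a bonus, a video of Dolly generating a continuation of "Momotaro"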

GPU memory usage

Checking with nvidia-smi, memory usage looks like this
(this is the memory-saving mode, with torch_dtype=torch.bfloat16 specified)
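As a rough cross-check of why bfloat16 saves memory, the size of the weights alone can be estimated from the parameter count. The sketch below assumes about 12 billion parameters (as the "12b" in the model name suggests) and ignores activations, caches, and framework overhead, so actual nvidia-smi readings will be higher. `weight_memory_gib` is a hypothetical helper name.

```python
def weight_memory_gib(num_params: float, bytes_per_param: int) -> float:
    """Estimate the memory needed for model weights only, in GiB."""
    return num_params * bytes_per_param / 2**30


NUM_PARAMS = 12e9  # assumption: roughly 12 billion parameters

bf16_gib = weight_memory_gib(NUM_PARAMS, 2)  # bfloat16: 2 bytes per parameter
fp32_gib = weight_memory_gib(NUM_PARAMS, 4)  # float32: 4 bytes per parameter

print(f"bfloat16 weights: about {bf16_gib:.1f} GiB")
print(f"float32 weights:  about {fp32_gib:.1f} GiB")
```

Under these assumptions the weights need roughly 22 GiB in bfloat16 versus roughly 45 GiB in float32, which is why the model has to be spread across several GPUs with device_map="auto" on this instance.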

Summary

I tried out Dolly 2.0 using ChatUX (a web chat UI).
Dolly 2.0 removes the high wall of commercial use for LLMs, so I see a lot of potential in it.
