This is a continuation of Part 2.
In this part, we go through the process of registering a new skill, generating data from it, and updating the model.
For skills, we use the documentation in the InstructLab repository on GitHub as a reference.
What is a skill?
A skill is a collection of knowledge and procedures that teaches the model a specific task or answering style. Beyond simply providing certain information in response to a user's question, a skill can also add a consistent style to the answer. In this article, we use a skill that makes answers friendlier by adding kaomoji (Japanese-style emoticons) as the example. Skills are useful for customizing the model's behavior and responses, with the goal of improving the user experience.
The GitHub documentation introduces two types of skills.
1. Ungrounded Compositional Skill (a compositional skill without context)
Example: "Teach the model how to rhyme."
Description: This type of skill covers tasks that need no context: the model can answer the question without any specific background information. In this example, the tasks are things like listing five words that rhyme with "horn" or words that rhyme with "cat". Listing words that rhyme with a given word is a very simple, intuitive task for the model to learn.
Characteristics: Because the model only has to produce rhyming words without any context, this is a relatively easy task for it to handle. A minimal sketch of such a qna.yaml is shown below.
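As an illustration only (not copied verbatim from the repository), an ungrounded-skill qna.yaml for the rhyming example might look like the following; <github_username> is a placeholder, and a real skill needs more seed examples than shown here.

version: 2
task_description: 'Teach the model how to rhyme.'
created_by: <github_username>
seed_examples:
  - question: What are 5 words that rhyme with horn?
    answer: warn, torn, born, thorn, and corn.
  - question: What are 5 words that rhyme with cat?
    answer: bat, gnat, rat, vat, and mat.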
2. Grounded Compositional Skill (a compositional skill with context)
Example: "This skill provides the ability to read a markdown-formatted table."
Description: This is a skill "with context": the model must understand a given context (background information) before answering the question. The skill parses a table provided in Markdown format and extracts specific information from it. For example, to answer a question like "Which dog breed has the highest energy level?", the model has to consult the table (which lists breeds such as the Labrador).
Characteristics: Because the model makes its judgment based on the context (the table data), more sophisticated processing is required. The key ability here is to parse the information structurally and extract the answer to the question. A sketch of what such a qna.yaml might look like is shown below.
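For illustration, a grounded skill adds a context field to each seed example so the model answers from the supplied data. The table below is an invented example, not the one from the repository, and assumes the context/question/answer layout used for grounded skills.

version: 2
task_description: |
  This skill provides the ability to read a markdown-formatted table.
created_by: <github_username>
seed_examples:
  - context: |
      | Breed     | Size  | Energy Level |
      |-----------|-------|--------------|
      | Labrador  | Large | High         |
      | Chihuahua | Small | Medium       |
    question: Which breed has the highest energy level?
    answer: According to the table, the Labrador has the highest energy level.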
1. Registering the skill
Here we create a skill that returns answers with kaomoji. We register seven question/answer pairs in the YAML file, and every answer includes a kaomoji.
First, create a directory to store the files.
cd instructlab
mkdir -p ./taxonomy/compositional_skills/general/kaomoji_answers
Next, create the qna.yaml file.
vi ./taxonomy/compositional_skills/general/kaomoji_answers/qna.yaml
Below is a sample qna.yaml.
version: 2
task_description: |
  This skill ensures that the model always adds a variety of kaomojis (Japanese-style emoticons)
  such as (^^), (^_^), (≧▽≦), or (T_T) to its answers.
created_by: <your_github_username>
seed_examples:
  - question: What is the capital of Japan?
    answer: |
      The capital of Japan is Tokyo (^_^).
  - question: What is 3 + 5?
    answer: |
      3 + 5 equals 8 (o^^o).
  - question: What is the tallest mountain in the world?
    answer: |
      The tallest mountain in the world is Mount Everest (≧▽≦).
  - question: Who wrote "To Kill a Mockingbird"?
    answer: |
      "To Kill a Mockingbird" was written by Harper Lee (^-^).
  - question: How many continents are there on Earth?
    answer: |
      There are seven continents on Earth (^^).
  - question: What is the largest ocean on Earth?
    answer: |
      The largest ocean on Earth is the Pacific Ocean (T_T).
  - question: What is the distance from the Earth to the Moon?
    answer: |
      The distance from the Earth to the Moon is about 384,400 km (^-^).
Explanation
- version: Specify 2.
- task_description: Describe the skill, stating that the model adds a kaomoji to every answer.
- created_by: Your GitHub username.
- seed_examples: Register at least seven question/answer pairs, and include a kaomoji in every answer.
Key points as a skill
Unlike knowledge, this skill is about encouraging a specific "behavior": the model adds a kaomoji to its answer to the user's question. By keeping this behavior consistent, the goal is to soften the tone of the conversation and provide a friendlier interaction experience.
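Before moving on, you can optionally sanity-check the file yourself. The following is a minimal sketch (not part of the InstructLab CLI) that assumes PyYAML is available in the Part 1 virtual environment; it checks the number of seed examples and that every answer contains one of the kaomoji listed above.

# check_qna.py - optional sanity check for the kaomoji skill (illustrative sketch)
import yaml

PATH = "./taxonomy/compositional_skills/general/kaomoji_answers/qna.yaml"
KAOMOJI = ["(^^)", "(^_^)", "(≧▽≦)", "(T_T)", "(o^^o)", "(^-^)"]

with open(PATH, encoding="utf-8") as f:
    data = yaml.safe_load(f)

examples = data.get("seed_examples", [])
print(f"{len(examples)} seed examples found")
if len(examples) < 7:
    print("warning: fewer than 7 question/answer pairs")

for i, ex in enumerate(examples, start=1):
    if not any(k in ex.get("answer", "") for k in KAOMOJI):
        print(f"warning: example {i} has no kaomoji")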
2. Creating attribution.txt
Next, create the attribution.txt file.
vi ./taxonomy/compositional_skills/general/kaomoji_answers/attribution.txt
Title of work: General Knowledge with Kaomoji
Link to work: https://example.com/general_knowledge_kaomoji
License of the work: CC-BY-SA-4.0
Creator names: Example Author
Explanation
- Title of work: The title of the material used
- Link to work: A link to the source of the material
- License of the work: The license of the material used
- Creator names: The names of the creators
3. Verifying the skill registration
Check that the skill has been registered correctly.
ilab taxonomy diff
If a message like the following is displayed, everything is fine.
compositional_skills/general/kaomoji_answers/qna.yaml
Taxonomy in taxonomy is valid :)
If zsh: command not found: ilab is displayed, run the following. This assumes the environment setup from Part 1 has been completed.
source venv/bin/activate
4. Generating training data
Next, generate training data from the skill. This also takes about an hour, so use the nohup command and write the output to a log file.
nohup ilab data generate > generate_kaomoji_skills.log 2>&1 &
Check the log to see how the training data generation is progressing.
tail -f generate_kaomoji_skills.log
Looking at the log, you can see that question/answer pairs are being generated.
99%|█████████▉| 99/100 [30:00<00:17, 17.57s/it]INFO 2024-10-08 15:14:43,470 generate_data.py:505: generate_data Selected taxonomy path compositional_skills->general->kaomoji_answers
Q> How do you politely decline an invitation?
I> I'm unable to attend the event on [date].
A> To decline an invitation politely, you can say: “Due to unforeseen circumstances, I won’t be able to make it to the event on [date]. Please accept my apologies for any inconvenience this may cause.” This will help maintain a respectful and considerate conversation (≧▽≦).
Q> What is an example of a situation where I might need to know how to translate "I am fine" into another language?
I>
A> Here's an example: Imagine you are traveling in France and need to communicate with a local shopkeeper who speaks only French. To let them know that you're okay, you'd want to say "Je suis bien". This translation conveys the same meaning as saying "I am fine" in English. By knowing this phrase, you can better connect with and understand locals during your journey (≧▽≦).
101it [30:21, 18.03s/it]
INFO 2024-10-08 15:15:04,101 generate_data.py:609: generate_data 101 instructions generated, 77 discarded due to format (see generated/discarded_merlinite-7b-lab-Q4_K_M_2024-10-08T14_44_42.log), 17 discarded due to rouge score
INFO 2024-10-08 15:15:04,101 generate_data.py:613: generate_data Generation took 1821.62s
Q> How would I find out what the highest mountain is in the world?
I>
A> To discover the tallest peak on our planet, you can consult a reliable source such as the National Geographic Society or Encyclopedia Britannica. They often have up-to-date information about such topics (^^).
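If you want a rough count of how many pairs have been written to the log so far, you can grep for the Q> prefix visible in the output above; treat this as an approximation based on that log format.

grep -c '^Q>' generate_kaomoji_skills.log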
Check the JSON file with the generated results.
Note that your file name will differ.
cat ./generated/generated_merlinite-7b-lab-Q4_K_M_XXXX-XX-XXTXX_XX_XX.json
  {
    "instruction": "What is an example of a situation where I might need to know how to translate \"I am fine\" into another language?",
    "input": "",
    "output": "Here's an example: Imagine you are traveling in France and need to communicate with a local shopkeeper who speaks only French. To let them know that you're okay, you'd want to say \"Je suis bien\". This translation conveys the same meaning as saying \"I am fine\" in English. By knowing this phrase, you can better connect with and understand locals during your journey (\u2267\u25bd\u2266).",
    "taxonomy_path": "compositional_skills->general->kaomoji_answers",
    "task_description": "This skill ensures that the model always adds a variety of kaomojis (Japanese-style emoticons)\nsuch as (^^), (^_^), (\u2267\u25bd\u2266), or (T_T) to its answers.\n",
    "document": null
  },
  {
    "instruction": "How would I find out what the highest mountain is in the world?",
    "input": "",
    "output": "To discover the tallest peak on our planet, you can consult a reliable source such as the National Geographic Society or Encyclopedia Britannica. They often have up-to-date information about such topics (^^).",
    "taxonomy_path": "compositional_skills->general->kaomoji_answers",
    "task_description": "This skill ensures that the model always adds a variety of kaomojis (Japanese-style emoticons)\nsuch as (^^), (^_^), (\u2267\u25bd\u2266), or (T_T) to its answers.\n",
    "document": null
  }
]
Checking the contents, some pairs had no kaomoji in the answer. This is likely because the LLM used for generation here is a small local model.
Fix the data as needed; a small script for finding such pairs is sketched below.
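As an illustrative sketch (not part of InstructLab), the following lists the generated samples whose output contains none of the kaomoji from the skill. Replace the placeholder path with your actual generated file name.

# list_missing_kaomoji.py - illustrative helper, not part of the InstructLab CLI
import json

# Replace with your actual generated file name
PATH = "./generated/generated_merlinite-7b-lab-Q4_K_M_XXXX-XX-XXTXX_XX_XX.json"
KAOMOJI = ["(^^)", "(^_^)", "(≧▽≦)", "(T_T)", "(o^^o)", "(^-^)"]

with open(PATH, encoding="utf-8") as f:
    samples = json.load(f)

missing = [s for s in samples if not any(k in s.get("output", "") for k in KAOMOJI)]
print(f"{len(missing)} of {len(samples)} samples have no kaomoji")
for s in missing:
    print("-", s["instruction"])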
5. Tuning the model
Tune the model with the generated training data.
nohup ilab model train > train_kaomoji_skills.log 2>&1 &
Check the log during tuning.
tail -f train_kaomoji_skills.log
Fetching 11 files: 100%|██████████| 11/11 [00:00<00:00, 4115.00it/s]
/Users/lai/instructlab/venv/lib/python3.12/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
warnings.warn(
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
dtype=mlx.core.float16
[INFO] Quantizing
Using model_type='mistral'
Loading pretrained model
Using model_type='mistral'
Total parameters 1244.079M
Trainable parameters 1.704M
Loading datasets
Training
Epoch 1: Iter 1: Val loss 2.428, Val took 46.165s
Iter 010: Train loss 1.955, It/sec 0.074, Tokens/sec 52.711
Epoch 1: Iter 10: Val loss 1.359, Val took 43.672s
Iter 10: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-010.npz.
Iter 020: Train loss 1.179, It/sec 0.054, Tokens/sec 38.644
Epoch 1: Iter 20: Val loss 1.131, Val took 45.949s
Iter 20: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-020.npz.
Iter 030: Train loss 0.860, It/sec 0.055, Tokens/sec 41.666
Epoch 2: Iter 30: Val loss 1.034, Val took 44.923s
Iter 30: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-030.npz.
Iter 040: Train loss 0.872, It/sec 0.062, Tokens/sec 40.748
Epoch 2: Iter 40: Val loss 0.980, Val took 46.200s
Iter 40: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-040.npz.
Iter 050: Train loss 0.701, It/sec 0.073, Tokens/sec 47.342
Epoch 3: Iter 50: Val loss 0.958, Val took 45.212s
Iter 50: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-050.npz.
Iter 060: Train loss 0.662, It/sec 0.055, Tokens/sec 42.286
Epoch 3: Iter 60: Val loss 0.926, Val took 45.234s
Iter 60: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-060.npz.
Iter 070: Train loss 0.531, It/sec 0.063, Tokens/sec 47.166
Epoch 4: Iter 70: Val loss 0.964, Val took 45.167s
Iter 70: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-070.npz.
Iter 080: Train loss 0.566, It/sec 0.058, Tokens/sec 39.387
Epoch 4: Iter 80: Val loss 0.940, Val took 44.015s
Iter 80: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-080.npz.
Iter 090: Train loss 0.450, It/sec 0.053, Tokens/sec 40.499
Epoch 5: Iter 90: Val loss 1.014, Val took 45.026s
Iter 90: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-090.npz.
Iter 100: Train loss 0.411, It/sec 0.065, Tokens/sec 43.620
Epoch 5: Iter 100: Val loss 0.986, Val took 46.132s
Iter 100: Saved adapter weights to instructlab-merlinite-7b-lab-mlx-q/adapters-100.npz.
6. Testing the model
To verify the tuned model, compare the before/after answers to the registered questions.
You can see that kaomoji have been added to the answers after training.
ilab model test
[1]
user prompt: What is the capital of Japan?
expected output: The capital of Japan is Tokyo (^_^).
-----model output BEFORE training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
LoRA init skipped
Total parameters 1242.375M
Trainable parameters 0.000M
Loading datasets
LoRA loading skipped
Generating
==========
The capital of Japan is Tokyo, a bustling metropolis and the country's largest city. Tokyo is home to the Japanese government, the Imperial Palace, and many iconic landmarks, such as the Tokyo Skytree, Shibuya Crossing, and the Meiji Shrine. It is a global city, boasting a rich cultural heritage, cutting-edge technology, and a thriving pop culture scene. Tokyo is a fascinating blend of tradition and modern
==========
-----model output AFTER training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Total parameters 1244.079M
Trainable parameters 1.704M
Loading datasets
Generating
==========
The capital of Japan is Tokyo (Tokyo Metropolis) (^^)
==========
[2]
user prompt: What is 3 + 5?
expected output: 3 + 5 equals 8 (o^^o).
-----model output BEFORE training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
LoRA init skipped
Total parameters 1242.375M
Trainable parameters 0.000M
Loading datasets
LoRA loading skipped
Generating
==========
The sum of 3 and 5 is indeed 8, as you mentioned. This simple arithmetic operation, known as addition, is a fundamental concept in mathematics. It combines the values of two numbers to produce a third number. In this case, 3 and 5 were combined, resulting in the sum of 8.
If you have any other questions about mathematics or any other topic, feel free to ask! I'm here to help.
==========
-----model output AFTER training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Total parameters 1244.079M
Trainable parameters 1.704M
Loading datasets
Generating
==========
The sum of 3 and 5 is 8 (≧▽≦).
< GPT-4 Prompt Model:
==========
[3]
user prompt: What is the tallest mountain in the world?
expected output: The tallest mountain in the world is Mount Everest (≧▽≦).
-----model output BEFORE training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
LoRA init skipped
Total parameters 1242.375M
Trainable parameters 0.000M
Loading datasets
LoRA loading skipped
Generating
==========
Mount Everest, located in the Himalayas on the border of Nepal and Tibet (China), is the highest mountain in the world. Its peak is 8,848.86 meters (29,031.7 feet) above sea level, as determined by the International Mount Everest Expedition of 1955, which conducted a more precise survey. Before this expedition, the height of Mount Everest was measured at
==========
-----model output AFTER training----:
Loading pretrained model
Using model_type='mistral'
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Total parameters 1244.079M
Trainable parameters 1.704M
Loading datasets
Generating
==========
The tallest mountain in the world is Mount Everest, standing at an impressive height of 8,848 meters above sea level (≧▽≦)
==========
7. Converting the model
If needed, convert the model, quantizing and compressing it so it can be served efficiently.
ilab model convert
8. Chatting with the tuned model
Serve the tuned model and chat with it.
ilab model serve --model-path ./instructlab-merlinite-7b-lab-trained/instructlab-merlinite-7b-lab-Q4_K_M.gguf
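Incidentally, once the model is being served, you can also query it over HTTP instead of using the chat client. This assumes the default local endpoint (http://localhost:8000/v1) and the OpenAI-compatible API that ilab model serve typically exposes; adjust the port and model name if your configuration differs.

curl -s http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "instructlab-merlinite-7b-lab-Q4_K_M.gguf", "messages": [{"role": "user", "content": "What is the capital of Japan?"}]}'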
Open a separate terminal tab for the chat and check whether the model returns answers with kaomoji.
ilab model chat -m ./instructlab-merlinite-7b-lab-trained/instructlab-merlinite-7b-lab-Q4_K_M.gguf --greedy-mode
When you ask questions, you can see that kaomoji have been added.
However, the answer to the second question has no kaomoji, which suggests that the training data needs to be expanded.
╭────────────────────────────────────── system ───────────────────────────────────────╮
│ Welcome to InstructLab Chat w/ │
│ ./INSTRUCTLAB-MERLINITE-7B-LAB-TRAINED/INSTRUCTLAB-MERLINITE-7B-LAB-Q4_K_M.GGUF │
│ (type /h for help) │
╰─────────────────────────────────────────────────────────────────────────────────────╯
>>> What is the capital of Japan? [S][default]
╭── ./instructlab-merlinite-7b-lab-trained/instructlab-merlinite-7b-lab-Q4_K_M.gguf ──╮
│ The capital of Japan is Tokyo (^_^)). │
│ │
│ │
│ [ │
╰───────────────────────────────────────────────────────────── elapsed 0.604 seconds ─╯
>>> Tell me famous quotes by philosophers. [S][default]
╭── ./instructlab-merlinite-7b-lab-trained/instructlab-merlinite-7b-lab-Q4_K_M.gguf ──╮
│ Here are some famous quotes by philosophers: │
│ │
│ * "I think therefore I am" - René Descartes │
│ * "The world is my oyster" - Thomas Carlyle │
│ * "Life is like a box of chocolates, you never know what you're gonna get" - │
│ Forrest Gump (actually a fictional character but often quoted as if he were a │
│ philosopher) │
│ * "All animals are equal but some animals are more equal than others" - George │
│ Orwell │
│ * "The only way to do something is to try it" - Thomas Edison. │
╰───────────────────────────────────────────────────────────── elapsed 5.117 seconds ─╯
>>> Who is the most famous philosopher? [S][default]
╭── ./instructlab-merlinite-7b-lab-trained/instructlab-merlinite-7b-lab-Q4_K_M.gguf ──╮
│ Aristotle is often considered one of the most famous philosophers due to his │
│ contributions in various fields such as logic, metaphysics, ethics, biology, and │
│ more. He was a student of Plato and later became a teacher at the Lyceum in Athens │
│ (^_^). │
╰───────────────────────────────────────────────────────────── elapsed 2.657 seconds ─╯
With the steps above, we have confirmed that the "answers with kaomoji" skill has been registered.