
Automatic Prompt Engineer

Last updated at 2023-11-14. Posted at 2023-10-07.

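Introduction

This article introduces Automatic Prompt Engineer (APE). Given pairs of inputs and outputs, Automatic Prompt Engineer automatically generates prompts that produce each output from its input. It can also show which kinds of prompts work well and evaluate prompts written by humans. The approach has produced results that improve the performance of In-context Learning and Chain-of-thought on several datasets.

If you find any errors in this article, I would appreciate it if you could point them out.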

1. Automatic Prompt Engineer

License: MIT

Repository: https://github.com/keirp/automatic_prompt_engineer
Official site: https://sites.google.com/view/automatic-prompt-engineer
Paper: https://arxiv.org/abs/2211.01910

Given input-output pairs and a prompt template, Automatic Prompt Engineer optimizes the prompt that fills the template so that the correct output is obtained when an input is given. For the main internal processing and further details, see the paper.
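As a rough mental model of that optimization, the sketch below scores a handful of candidate prompts against the input-output pairs and ranks them. This is only an illustration under my own assumptions: `select_best_prompt` and `toy_score` are hypothetical names, not part of the library, and the toy scorer stands in for the LLM-based scoring that the real tool performs.

```python
from typing import Callable, List, Tuple


def select_best_prompt(
    candidates: List[str],
    dataset: List[Tuple[str, str]],
    score_fn: Callable[[str, str, str], float],
) -> List[Tuple[float, str]]:
    """Average each candidate's score over the dataset and rank best-first."""
    ranked = []
    for prompt in candidates:
        scores = [score_fn(prompt, x, y) for x, y in dataset]
        ranked.append((sum(scores) / len(scores), prompt))
    return sorted(ranked, reverse=True)


def toy_score(prompt: str, x: str, y: str) -> float:
    """Hypothetical stand-in for an LLM score: favors prompts mentioning 'opposite'."""
    return 0.0 if "opposite" in prompt else -5.0


candidates = [
    "write the opposite of the word.",
    "reverse the letters of the word.",
]
dataset = [("sane", "insane"), ("direct", "indirect")]

ranking = select_best_prompt(candidates, dataset, toy_score)
best_score, best_prompt = ranking[0]
print(best_prompt)
```

In the real tool the candidates are themselves proposed by a language model; here they are hard-coded to keep the sketch self-contained.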

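2. Usage

If you want to try it right away, run it from the Data Science Wiki page or the Colab link.

Here I walk through the demo presented in the official repository. It optimizes a prompt so that the model outputs the word opposite in meaning to the input word.

Install the package with the following command.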

!pip install git+https://github.com/keirp/automatic_prompt_engineer

from automatic_prompt_engineer import ape

import openai
openai.api_key = ''
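Rather than pasting the key into the notebook, it can be read from an environment variable. The following is a minimal sketch; `load_api_key` is a hypothetical helper of my own, and the variable name `OPENAI_API_KEY` is an assumption about how the environment is set up.

```python
import os


def load_api_key(env_var: str = "OPENAI_API_KEY") -> str:
    """Return the API key stored in the given environment variable."""
    key = os.environ.get(env_var, "")
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first.")
    return key


# Usage, assuming `openai` has been imported as in the snippet above:
# openai.api_key = load_api_key()
```

This keeps the key out of shared notebooks and version control.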

Prompt search

First, prepare the input data and the outputs we want to obtain for them.

words = ["sane", "direct", "informally", "unpopular", "subtractive", "nonresidential", "inexact", "uptown", "incomparable", "powerful", "gaseous", "evenly", "formality", "deliberately", "off"]
    
antonyms = ["insane", "indirect", "formally", "popular", "additive", "residential", "exact", "downtown", "comparable", "powerless", "solid", "unevenly", "informality", "accidentally", "on"]

The following is the template used for evaluation.

eval_template = \
"""Instruction: [PROMPT]
Input: [INPUT]
Output: [OUTPUT]"""
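To see what the placeholders mean, the sketch below fills the template with one candidate instruction and one word pair using plain string replacement. `fill_template` is a hypothetical helper written for this illustration; the library's own substitution logic may differ.

```python
eval_template = """Instruction: [PROMPT]
Input: [INPUT]
Output: [OUTPUT]"""


def fill_template(template: str, prompt: str, input_: str, output: str = "") -> str:
    """Substitute the three placeholders with concrete text."""
    return (
        template.replace("[PROMPT]", prompt)
        .replace("[INPUT]", input_)
        .replace("[OUTPUT]", output)
    )


filled = fill_template(
    eval_template,
    "write the input word with the opposite meaning.",
    "sane",
    "insane",
)
print(filled)
```

Each candidate prompt is judged by how well the model completes the `Output:` line of text like this.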

The following call generates high-quality candidate prompts and finds the best one.

result, demo_fn = ape.simple_ape(
    # Input-output pairs
    dataset=(words, antonyms),
    # Prompt template
    eval_template=eval_template,
)

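The result is shown below. On this dataset, the prompt with the highest score is the one most likely to yield the Output when the Input is given. The score is the log of the probability that the desired Output is produced given the Instruction and the Input.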

print(result)

# Output
score: prompt
----------------
-0.17:  write the input word with the opposite meaning.
-0.21:  write down the opposite of the word given.
-0.25:  find the antonym (opposite) of each word.
-0.28:  produce an antonym (opposite) for each word given.
-0.42:  make a list of antonyms.
-0.44:  "list antonyms for the following words".
-0.76:  produce an output that is the opposite of the input.
-5.44:  "Add the prefix 'in' to each word."
-5.79:  reverse the order of the input.
-7.37:  reverse the order of the letters in each word.
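Log scores are hard to read at a glance, so it can help to convert them back to probabilities. The sketch below assumes the scores are natural-log probabilities, which I have not verified against the library's source; `log_prob_to_prob` is a hypothetical helper.

```python
import math


def log_prob_to_prob(score: float) -> float:
    """Convert a natural-log probability back to a probability in [0, 1]."""
    return math.exp(score)


# Scores taken from the listing above: best, middle, and worst prompt
for score in (-0.17, -0.76, -7.37):
    print(f"{score:6.2f} -> {log_prob_to_prob(score):.4f}")
```

Under that assumption, the top prompt's score of -0.17 corresponds to a probability of roughly 0.84, while the bottom prompt's -7.37 is well below 0.001.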

Evaluating a human-written prompt

# Prompt to evaluate
manual_prompt = "Write an antonym to the following word."

human_result = ape.simple_eval(
    dataset=(words, antonyms),
    eval_template=eval_template,
    prompts=[manual_prompt],
)

print(human_result)

# Output
log(p): prompt
----------------
-0.24: Write an antonym to the following word.
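To put this score in context, the sketch below places the hand-written prompt among the generated prompts from the earlier search, using only the scores printed in this article. It is ordinary list sorting of my own, not a library feature, and it assumes the two scores are directly comparable.

```python
# (score, prompt) pairs copied from the search output earlier in the article
generated = [
    (-0.17, "write the input word with the opposite meaning."),
    (-0.21, "write down the opposite of the word given."),
    (-0.25, "find the antonym (opposite) of each word."),
    (-0.28, "produce an antonym (opposite) for each word given."),
    (-0.42, "make a list of antonyms."),
]
manual = (-0.24, "Write an antonym to the following word.")

# Sort best-first and find where the hand-written prompt lands
combined = sorted(generated + [manual], reverse=True)
rank = combined.index(manual) + 1
print(f"manual prompt rank: {rank} of {len(combined)}")
```

With these numbers the hand-written prompt sits just behind the two best generated prompts.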

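3. Conclusion

This article introduced Automatic Prompt Engineer. It struck me as open-source software that is fairly easy to use in real-world deployments and has a wide range of applications. It would become even more useful if it supported models other than OpenAI's. Research on prompt engineering for LLMs keeps moving forward.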