More than 1 year has passed since last update.

【OpenAI API】presence_penaltyとfrequency_penaltyの違い

Posted at 2023-09-29

この記事は?

OpenAI APIを用いて、GPTによる返答を生成する際、様々なパラメータを指定することができる。
そのパラメータの中で、presence_penaltyとfrequency_penaltyの違いが分かりにくかったので解説する。

API Referenceによる記載

OpenAIのAPI referenceには次のように記載されている。

presence_penalty (number or null Optional Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they appear in the text so far, increasing the model's likelihood to talk about new topics.

frequency_penalty (number or null Optional Defaults to 0)
Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood to repeat the same line verbatim.

Deeplによる翻訳

presence_penalty
2.0 から 2.0 の間の数値。正の値は、新しいトークンがこれまでにテキストに現れたかどうかに基づいてペナルティを課し、モデルが新しいトピックについて話す可能性を高める。

frequency_penalty
2.0から2.0の間の数値。正の値は、新しいトークンに、これまでのテキストにおける既存の頻度に基づいてペナルティを与え、モデルが同じ行を逐語的に繰り返す可能性を低下させる。

↓

両方ともトークンの繰り返しを抑制するペナルティ。
何が違うかよくわからない。

詳細説明

パラメータの詳細についてはこちらに書かれている。
各ペナルティはトークンの出現確率に次のように関係しており、presence_penaltyとfrequency_penaltyは式中のalpha_presenceとalpha_frequencyに対応している。

mu[j] -> mu[j] - c[j] * alpha_frequency - float(c[j] > 0) * alpha_presence

mu[j] is the logits of the j-th token
c[j] is how often that token was sampled prior to the current position
float(c[j] > 0) is 1 if c[j] > 0 and 0 otherwise
alpha_frequency is the frequency penalty coefficient
alpha_presence is the presence penalty coefficient

式を見てみると、alpha_frequencyには過去のトークン出現回数が直接かけられているが、一方、alpha_presenceには過去にトークンが出現しているかどうかの0か1の二値がかけられている。

つまり、同じトークンの出現回数が増えるほどfrequency_penaltyによるペナルティは増加するが、`presence_penaltyによるペナルティは一定（同じトークンか過去にあらわれていない場合は0）となる。

まとめ

presence_penalty

過去に同じトークンが現れたか否かによって、一定のペナルティを課す。

fequency_penalty

過去に同じトークンが現れた回数が増えれば増えるほど、大きなペナルティを課す。

参照

You get articles that match your needs
You can efficiently read back useful information
You can use dark theme

What you can do with signing up