
Differences in prompt emphasis syntax between NovelAI-based UIs (Naifu/NGUI) and stable-diffusion-webui (AUTOMATIC1111)

Posted at 2022-10-30 (last updated at 2022-10-30)

The prompt emphasis syntax differs between the two

stable-diffusion-webui can reportedly also be set up to run the NovelAI model,
but the two environments preprocess prompts differently. (Naifu/NGUI appears to behave the same as NovelAI.)

So even if you enter exactly the inputs from a report saying "this prompt produced this result",
the output will differ when the two environments differ.

Sorting out these differences may reduce the cases where a result cannot be reproduced in your own environment.

[Reference] Setting up the environment in webui: GitHub Discussions, Emulate NovelAI

Prompt emphasis syntax in NovelAI

From the official documentation, Strengthening & Weakening Vectors:

The weight of AI focus will be multiplied by 1.05 if you enclose the tags or text you want the AI to focus on more with { and }.
The weight of AI focus will be divided by 1.05 if you enclose the tags or text you want the AI to focus on less with [ and ].
Multiple { or [ inside another will multiply the weight each time, so {{ would result in the enclosed part of the prompt being weighted by 1.1025.

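Enclosing text in {} strengthens it, and enclosing it in [] weakens it. The factor is 1.05, applied once per level of brackets (multiplied for {}, divided for []).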
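
The rule quoted above can be written as a one-line calculation. The following is a minimal sketch, not NovelAI's implementation; the function name `novelai_weight` and its parameters are hypothetical and exist only for illustration.

```python
def novelai_weight(curly_depth: int = 0, square_depth: int = 0, factor: float = 1.05) -> float:
    """Weight implied by bracket nesting: each {} level multiplies by `factor`,
    each [] level divides by it (per the quoted NovelAI documentation)."""
    return factor ** (curly_depth - square_depth)


print(round(novelai_weight(curly_depth=2), 4))   # {{word}} -> 1.1025
print(round(novelai_weight(square_depth=1), 4))  # [word]   -> 0.9524
```

Because the factor is only 1.05, several levels of nesting are needed before the weight moves far from 1.0.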

Emphasis syntax in stable-diffusion-webui

From the official documentation, Attention/emphasis:

a (word) - increase attention to word by a factor of 1.1

a ((word)) - increase attention to word by a factor of 1.21 (= 1.1 * 1.1)

a [word] - decrease attention to word by a factor of 1.1

a (word:1.5) - increase attention to word by a factor of 1.5

a (word:0.25) - decrease attention to word by a factor of 4 (= 1 / 0.25)

a \(word\) - use literal () characters in prompt

{} is not supported. Instead, () is used for strengthening (weakening with [] is the same), and the factor is 1.1 rather than 1.05.
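
Since webui accepts an explicit weight with `(word:1.5)`, one way to carry a NovelAI-style prompt over is to rewrite each bracket group as an explicit weight. The sketch below is only an illustration under simplifying assumptions: it handles balanced groups whose content contains no further brackets, and the names `nai_to_webui` and `NAI_FACTOR` are hypothetical, not part of either tool.

```python
import re

NAI_FACTOR = 1.05  # per-bracket factor quoted from the NovelAI documentation


def nai_to_webui(prompt: str) -> str:
    """Rewrite simple {..} and [..] groups as webui-style (text:weight)."""
    # Round brackets carry no emphasis on the NovelAI side,
    # so escape them to keep them literal in webui.
    out = re.sub(r"([()])", r"\\\1", prompt)

    def curly(m):
        depth = min(len(m.group(1)), len(m.group(3)))
        return f"({m.group(2)}:{round(NAI_FACTOR ** depth, 4)})"

    def square(m):
        depth = min(len(m.group(1)), len(m.group(3)))
        return f"({m.group(2)}:{round(NAI_FACTOR ** -depth, 4)})"

    # Only groups without nested brackets in their content are matched.
    out = re.sub(r"(\{+)([^{}\[\]]+)(\}+)", curly, out)
    out = re.sub(r"(\[+)([^{}\[\]]+)(\]+)", square, out)
    return out


print(nai_to_webui("{{masterpiece}}, [blurry], smile (happy)"))
# (masterpiece:1.1025), (blurry:0.9524), smile \(happy\)
```

Matching the weights does not guarantee identical images, because the two environments may differ in other preprocessing steps as well.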

Do round brackets `()` work in NovelAI-based UIs (Naifu/NGUI)?

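Checking the (Naifu) code, I found no handling for round brackets.
The numeric form used in webui input, such as (word:1.5), does not appear to be supported either.
(If adding round brackets or numbers does change the result, perhaps that comes from how the CLIP model itself interprets those tokens.)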

"naifu/ldm/modules/encoders/modules.py"

class FrozenCLIPEmbedder(AbstractEncoder):
    def __init__(self, version="./models/openai--clip-vit-large-patch14", device="cuda", max_length=77):  # clip-vit-base-patch32
        # ...
        self.emphasis_factor = 1.05 # strength of () and []
        # ...
        self.token_mults = {}
        tokens_with_parens = [(k, v) for k, v in self.tokenizer.get_vocab().items() if '{' in k or '}' in k or '[' in k or ']' in k]
        fac = self.emphasis_factor
        for text, ident in tokens_with_parens:
            mult = 1.0
            for c in text:
                if c == '[':
                    mult /= fac
                if c == ']':
                    mult *= fac
                if c == '{':
                    mult *= fac
                if c == '}':
                    mult /= fac
            if mult != 1.0:
                self.token_mults[ident] = mult
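
To see what the excerpt above produces, the same loop can be run on a tiny made-up vocabulary. This is a sketch under stated assumptions: `toy_vocab` and its ids are invented, and `weigh` reflects an assumed way of consuming the table (bracket tokens update a running multiplier, other tokens are emitted with the current one), which the excerpt itself does not show.

```python
FACTOR = 1.05

# Hypothetical vocabulary fragment: token text -> token id (ids are made up).
toy_vocab = {"{": 1, "}": 2, "{{": 3, "}}": 4, "[": 5, "]": 6, "cat": 7, "(": 8}


def build_token_mults(vocab, fac=FACTOR):
    """Same logic as the excerpt: one multiplier per bracket-containing token."""
    mults = {}
    for text, ident in vocab.items():
        mult = 1.0
        for c in text:
            if c in "{]":    # opening { or closing ] raises the multiplier
                mult *= fac
            elif c in "[}":  # opening [ or closing } lowers it
                mult /= fac
        if mult != 1.0:
            mults[ident] = mult
    return mults


def weigh(token_ids, mults):
    """Assumed usage: track a running multiplier while walking the tokens."""
    current, out = 1.0, []
    for t in token_ids:
        if t in mults:
            current *= mults[t]       # bracket token: adjust and drop
        else:
            out.append((t, current))  # ordinary token: keep with current weight
    return out


mults = build_token_mults(toy_vocab)
print(8 in mults)                        # False: "(" gets no multiplier
print(weigh([3, 7, 4, 8, 7], mults))     # "{{ cat }} ( cat"
```

In this toy run the `(` token has no entry in the table, so it passes through as an ordinary token, which is consistent with round brackets having no emphasis effect in Naifu.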

How does stable-diffusion-webui process emphasis?

It is processed exactly as the documentation quoted above describes.

Excerpt from the relevant code:

stable-diffusion-webui/modules/prompt_parser.py

def parse_prompt_attention(text):
    # ...
    round_bracket_multiplier = 1.1
    square_bracket_multiplier = 1 / 1.1

    def multiply_range(start_position, multiplier):
        for p in range(start_position, len(res)):
            res[p][1] *= multiplier

    for m in re_attention.finditer(text):
        text = m.group(0)
        weight = m.group(1)

        if text.startswith('\\'):
            res.append([text[1:], 1.0])
        elif text == '(':
            round_brackets.append(len(res))
        elif text == '[':
            square_brackets.append(len(res))
        elif weight is not None and len(round_brackets) > 0:
            multiply_range(round_brackets.pop(), float(weight))
        elif text == ')' and len(round_brackets) > 0:
            multiply_range(round_brackets.pop(), round_bracket_multiplier)
        elif text == ']' and len(square_brackets) > 0:
            multiply_range(square_brackets.pop(), square_bracket_multiplier)
        else:
            res.append([text, 1.0])
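
The excerpt omits the tokenizing regex and the list setup, so it does not run on its own. The following self-contained sketch fills those gaps to make the behavior easy to try. The pattern `RE_ATTENTION` is a simplified stand-in written for this example and is an assumption, not webui's actual `re_attention`; the function name `parse_attention` is likewise hypothetical.

```python
import re

# Simplified tokenizer (an assumption for this sketch, not webui's own pattern):
# escaped brackets, bare brackets, ":<number>)", plain text, stray ":" or "\".
RE_ATTENTION = re.compile(
    r"\\[()\[\]\\]|\(|\[|:([+-]?[.\d]+)\)|\)|\]|[^\\()\[\]:]+|:|\\"
)


def parse_attention(text, round_mult=1.1, square_mult=1 / 1.1):
    res = []
    round_brackets, square_brackets = [], []

    def multiply_range(start, multiplier):
        # Scale every chunk added since the matching opening bracket.
        for p in range(start, len(res)):
            res[p][1] *= multiplier

    for m in RE_ATTENTION.finditer(text):
        tok, weight = m.group(0), m.group(1)
        if tok.startswith("\\") and len(tok) > 1:
            res.append([tok[1:], 1.0])            # escaped bracket stays literal
        elif tok == "(":
            round_brackets.append(len(res))
        elif tok == "[":
            square_brackets.append(len(res))
        elif weight is not None and round_brackets:
            multiply_range(round_brackets.pop(), float(weight))  # (word:1.5)
        elif tok == ")" and round_brackets:
            multiply_range(round_brackets.pop(), round_mult)
        elif tok == "]" and square_brackets:
            multiply_range(square_brackets.pop(), square_mult)
        else:
            res.append([tok, 1.0])
    return res


for chunk, w in parse_attention("a ((cat)) [dog] (bird:1.5) {fish}"):
    print(repr(chunk), round(w, 4))
```

In this sketch `cat` ends up at about 1.21, `dog` at about 0.909, and `bird` at 1.5, while `{fish}` is kept as plain text with weight 1.0, matching the observation that webui does not treat `{}` as emphasis.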
