
Does the TokenClassificationPipeline for Japanese Generative AI Need a Bellman-Ford Extension?

Posted at 2024-06-23

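With an eye on 長尾浩良・五藤巧・是枝祐太, 『単方向・双方向事前学習済み言語モデルにおけるアーキテクチャ・事前学習方法の違いによる影響の分析』 (Proceedings of the 38th Annual Conference of the Japanese Society for Artificial Intelligence, 4Xin2-37, pp.1-4, May 31, 2024) and 村脇有吾, 『文字言語モデルからの単語言語モデルの教師なし合成』 (IPSJ SIG Technical Report, Vol.2024-NL-260, No.2, pp.1-14, June 28, 2024), I used GPT2ForTokenClassification to build gpt2-{small,medium,large}-japanese-upos. The plain TokenClassificationPipeline, however, did not give very good accuracy, so I added Bellman-Ford to the step that resolves the "B-" and "I-" prefixes in sequence labeling, so that information also propagates backward, from the end of the sentence to its beginning. The benchmark program, which uses UD_Japanese-GSD, looks like this.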

#! /usr/bin/python3
# pip3 install transformers accelerate spacy-alignments
model="KoichiYasuoka/gpt2-small-japanese-upos"
ud="https://github.com/UniversalDependencies/UD_Japanese-GSD"
import os
d=os.path.basename(ud)
os.system(f"test -d {d} || git clone --depth=1 {ud}")
os.system("for F in train dev test; do cp "+d+"/*-$F.conllu $F.conllu; done")
with open("test.conllu","r",encoding="utf-8") as r:
  tst=r.read().strip().split("\n\n")
def ext(x):
  u=v=""
  for s in x.split("\n"):
    if s.startswith("# text ="):
      u=s[9:].strip()
    else:
      t=s.split("\t")
      if t[0].isdigit():
        v+="|"+t[1]+"_"+t[3]
  return (u,v[1:])
from transformers import pipeline
nlp=pipeline("upos",model,trust_remote_code=True,aggregation_strategy="simple")
from spacy_alignments import get_alignments
gold=system=correct=0
for t in tst:
  u,v=ext(t)
  g=v.split("|")
  s=[u[t["start"]:t["end"]]+"_"+t["entity_group"].split("|")[0] for t in nlp(u)]
  gold+=len(g)
  system+=len(s)
  correct+=sum(1 for t,k in zip(s,get_alignments(g,s)[1]) if len(k)==1 and t==g[k[0]])
print("\n***",model)
print("Precision",correct/system if system else 0.0)
print("Recall   ",correct/gold)
print("F1 Score ",2*correct/(system+gold))
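
To make the counting on the last lines of the program concrete, here is a minimal sketch of a similar scoring criterion. It is a simplification and not the program's own logic: instead of `get_alignments` it compares character spans directly, and the helper names `spans` and `score` as well as the toy sentence are hypothetical. A system token counts as correct only when both its boundaries and its UPOS tag match a gold token.

```python
def spans(tokens):
    """Turn 'surface_TAG' items into a set of (start, end, tag) character spans."""
    out, pos = set(), 0
    for item in tokens:
        surface, tag = item.rsplit("_", 1)
        out.add((pos, pos + len(surface), tag))
        pos += len(surface)
    return out

def score(gold, system):
    """Return (precision, recall, F1) over exactly matching spans and tags."""
    g, s = spans(gold), spans(system)
    correct = len(g & s)
    precision = correct / len(s) if s else 0.0
    recall = correct / len(g) if g else 0.0
    f1 = 2 * correct / (len(g) + len(s)) if (g or s) else 0.0
    return precision, recall, f1

# Hypothetical example: the system wrongly splits one gold word in two.
gold = ["これ_PRON", "は_ADP", "テスト_NOUN", "です_AUX"]
system = ["これ_PRON", "は_ADP", "テ_NOUN", "スト_NOUN", "です_AUX"]
precision, recall, f1 = score(gold, system)
print("Precision", precision)
print("Recall   ", recall)
print("F1 Score ", f1)
```

In this toy case three of the five system tokens match, so precision is 3/5, recall is 3/4, and F1 is 2*3/(5+4). A single wrong word boundary therefore costs both precision and recall, which is why segmentation errors weigh so heavily in the figures reported here.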

Running the benchmark program while changing model on its line 3, I (Koichi Yasuoka) obtained the following results on my machine.

*** KoichiYasuoka/gpt2-small-japanese-upos
Precision 0.9209577162151792
Recall    0.9207457419057848
F1 Score  0.9208517168616919

*** KoichiYasuoka/gpt2-medium-japanese-upos
Precision 0.9268629254829807
Recall    0.9275740371336505
F1 Score  0.9272183449651047

*** KoichiYasuoka/gpt2-large-japanese-upos
Precision 0.9328375373763704
Recall    0.9334816633420285
F1 Score  0.9331594892050465

*** KoichiYasuoka/Swallow-MS-7b-upos
Precision 0.7905047792084683
Recall    0.7677612398342796
F1 Score  0.7789670338224419

*** KoichiYasuoka/Swallow-7b-plus-upos
Precision 0.7852307213446424
Recall    0.774205922970692
F1 Score  0.7796793509754684

*** KoichiYasuoka/deberta-small-japanese-upos
Precision 0.8477737197917476
Recall    0.8370415835507136
F1 Score  0.8423734702544107

*** KoichiYasuoka/deberta-base-japanese-upos
Precision 0.9149018097805159
Recall    0.9114623292926193
F1 Score  0.9131788308543757

*** KoichiYasuoka/deberta-large-japanese-upos
Precision 0.933862026485987
Recall    0.9305662114469848
F1 Score  0.9322112059026977

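The part-of-speech tagging "accuracy" of GPT-2 comes close to that of the Aozora Bunko DeBERTa models. For comparison, the results without Bellman-Ford (changing "upos" on line 22 of the program to "token-classification") are as follows.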

*** KoichiYasuoka/gpt2-small-japanese-upos
Precision 0.7608359133126935
Recall    0.7541813717968391
F1 Score  0.7574940278955075

*** KoichiYasuoka/gpt2-medium-japanese-upos
Precision 0.7910177756803524
Recall    0.7715973607488108
F1 Score  0.781186888302004

*** KoichiYasuoka/gpt2-large-japanese-upos
Precision 0.8086894473955443
Recall    0.7825686665643701
F1 Score  0.7954146683822669

*** KoichiYasuoka/Swallow-MS-7b-upos
Precision 0.6804858727224716
Recall    0.5931410158048182
F1 Score  0.6338184054109449

*** KoichiYasuoka/Swallow-7b-plus-upos
Precision 0.6716443677168753
Recall    0.5969771367193494
F1 Score  0.6321134083431496

*** KoichiYasuoka/deberta-small-japanese-upos
Precision 0.7664715719063545
Recall    0.703314408470155
F1 Score  0.7335360486516764

*** KoichiYasuoka/deberta-base-japanese-upos
Precision 0.83087192323738
Recall    0.764001841338039
F1 Score  0.7960350133898237

*** KoichiYasuoka/deberta-large-japanese-upos
Precision 0.8558143431635389
Recall    0.7837195028387295
F1 Score  0.8181818181818182

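Of course, Conditional Random Fields would do just as well as Bellman-Ford. The point seems to be that, in addition to the forward flow of information (from the beginning of the sentence to the end), a backward flow (from the end to the beginning) is needed at the very last step. Still, I wonder whether that counts as a "forbidden move" for generative AI.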
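
The idea of a final backward pass can be sketched with a small dynamic program over the label lattice. This is only an illustration under stated assumptions, not the code of the "upos" pipeline: the constraint (an "I-X" label may only follow "B-X" or "I-X", and may not start a sentence), the function names `allowed` and `decode`, and the scores are all hypothetical. Scores are accumulated from the last token back to the first, then labels are chosen left to right on the accumulated scores.

```python
def allowed(prev, cur):
    """Assumed constraint: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev in ("B-" + cur[2:], cur)
    return True

def decode(scores, labels):
    """scores[i][j] is the (hypothetical) score of labels[j] at token i."""
    if not scores:
        return []
    n, m = len(scores), len(labels)
    # Backward pass: best[i][j] becomes the score of the best valid
    # label sequence for tokens i..n-1 that starts with labels[j].
    best = [row[:] for row in scores]
    for i in range(n - 2, -1, -1):
        for j in range(m):
            best[i][j] += max(best[i + 1][k] for k in range(m)
                              if allowed(labels[j], labels[k]))
    # Forward pass: choose labels left to right on the accumulated scores.
    starts = [j for j in range(m) if not labels[j].startswith("I-")]
    path = [max(starts, key=lambda j: best[0][j])]
    for i in range(1, n):
        prev = labels[path[-1]]
        nexts = [k for k in range(m) if allowed(prev, labels[k])]
        path.append(max(nexts, key=lambda k: best[i][k]))
    return [labels[j] for j in path]

labels = ["NOUN", "B-NOUN", "I-NOUN", "VERB"]
scores = [
    [1.0, 1.5, 0.0, 2.0],  # token 0: VERB looks best in isolation
    [0.0, 0.0, 3.0, 0.5],  # token 1: strongly I-NOUN
    [0.2, 0.0, 0.1, 2.5],  # token 2: VERB
]
# Independent per-token argmax, for comparison.
greedy = [labels[max(range(len(labels)), key=lambda j: row[j])] for row in scores]
print("greedy :", greedy)
print("decoded:", decode(scores, labels))
```

Per-token argmax yields VERB, I-NOUN, VERB, which is not a valid sequence because the "I-NOUN" has no "B-NOUN" before it. With the backward pass, the strong "I-NOUN" evidence at the second token reaches the first token and turns it into "B-NOUN". That is exactly the kind of end-to-start information flow discussed above, which a strictly left-to-right model does not provide on its own.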
