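Following up on the article from the day before yesterday, I measured the "accuracy" of Russian dependency-parsing models on ru_taiga-ud-test.conllu from UD_Russian-Taiga and ru_syntagrus-ud-test.conllu from UD_Russian-SyntagRus. The benchmark program looks like this.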
#! /usr/bin/python3
# pip3 install esupar transformers triton
models=[
  "KoichiYasuoka/bert-base-russian-upos",
  "KoichiYasuoka/modernbert-small-russian-ud-goeswith",
  "KoichiYasuoka/modernbert-small-russian-ud-embeds",
  "KoichiYasuoka/modernbert-base-russian-ud-goeswith",
  "KoichiYasuoka/modernbert-base-russian-ud-embeds"
]
import os,sys,subprocess
# Download the two UD test sets unless they are already present
url="https://github.com/UniversalDependencies/UD_Russian-"
tests=["Taiga","SyntagRus"]
for t in tests:
  u=url+t+"/raw/refs/heads/master/ru_"+t.lower()+"-ud-test.conllu"
  f=os.path.basename(u)
  os.system(f"test -f {f} || curl -LO {u}")
# Download the CoNLL 2018 evaluation script
url="https://universaldependencies.org/conll18/conll18_ud_eval.py"
c=os.path.basename(url)
os.system(f"test -f {c} || curl -LO {url}")
for mdl in models:
  # The "-upos" model is loaded through esupar, the others through transformers
  if mdl.endswith("-upos"):
    import esupar
    nlp=esupar.load(mdl)
  else:
    from transformers import pipeline
    nlp=pipeline("universal-dependencies",mdl,trust_remote_code=True,aggregation_strategy="simple")
  for f in tests:
    # Parse the raw text of every sentence and write the CoNLL-U output to a file named after the test set
    with open(f"ru_{f.lower()}-ud-test.conllu","r",encoding="utf-8") as r:
      s=[t[8:].strip() for t in r if t.startswith("# text =")]
    with open(f,"w",encoding="utf-8") as w:
      for t in s:
        w.write(str(nlp(t)).strip()+"\n\n")
  os.system(f"mkdir -p result/{mdl}")
  # Evaluate each output file against the gold standard and save the report
  with open(f"result/{mdl}/result.txt","w",encoding="utf-8") as w:
    for f in tests:
      p=subprocess.run([sys.executable,c,"-v",f"ru_{f.lower()}-ud-test.conllu",f],encoding="utf-8",stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
      print(f"\n*** {mdl} ({f})",p.stdout,sep="\n",file=w)
# Print all reports in the order of the models list
os.system(f'cd result && cat `find {" ".join(models)} -name result.txt`')
On my (Koichi Yasuoka's) machine, the program printed the following results.
*** KoichiYasuoka/bert-base-russian-upos (Taiga)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.74 | 99.75 | 99.75 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.74 | 99.75 | 99.75 |
UPOS | 98.89 | 98.91 | 98.90 | 99.15
XPOS | 99.74 | 99.75 | 99.75 | 100.00
UFeats | 97.95 | 97.97 | 97.96 | 98.21
AllTags | 97.68 | 97.69 | 97.69 | 97.94
Lemmas | 0.07 | 0.07 | 0.07 | 0.07
UAS | 89.74 | 89.75 | 89.74 | 89.97
LAS | 87.00 | 87.01 | 87.00 | 87.22
CLAS | 85.46 | 85.35 | 85.41 | 85.65
MLAS | 82.47 | 82.37 | 82.42 | 82.65
BLEX | 0.04 | 0.04 | 0.04 | 0.04
*** KoichiYasuoka/bert-base-russian-upos (SyntagRus)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.96 | 99.94 | 99.95 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.96 | 99.94 | 99.95 |
UPOS | 99.58 | 99.57 | 99.57 | 99.62
XPOS | 99.96 | 99.94 | 99.95 | 100.00
UFeats | 98.74 | 98.73 | 98.74 | 98.79
AllTags | 98.66 | 98.64 | 98.65 | 98.70
Lemmas | 0.00 | 0.00 | 0.00 | 0.00
UAS | 95.20 | 95.18 | 95.19 | 95.24
LAS | 93.66 | 93.65 | 93.65 | 93.70
CLAS | 92.74 | 92.70 | 92.72 | 92.76
MLAS | 90.65 | 90.62 | 90.63 | 90.67
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-small-russian-ud-goeswith (Taiga)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.08 | 99.27 | 99.17 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.08 | 99.27 | 99.17 |
UPOS | 96.15 | 96.34 | 96.25 | 97.05
XPOS | 99.08 | 99.27 | 99.17 | 100.00
UFeats | 93.47 | 93.65 | 93.56 | 94.34
AllTags | 92.75 | 92.93 | 92.84 | 93.61
Lemmas | 0.05 | 0.05 | 0.05 | 0.05
UAS | 88.60 | 88.78 | 88.69 | 89.43
LAS | 85.29 | 85.45 | 85.37 | 86.08
CLAS | 83.75 | 83.34 | 83.55 | 84.04
MLAS | 77.06 | 76.69 | 76.88 | 77.33
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-small-russian-ud-goeswith (SyntagRus)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.66 | 99.73 | 99.70 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.66 | 99.73 | 99.70 |
UPOS | 98.02 | 98.09 | 98.05 | 98.35
XPOS | 99.66 | 99.73 | 99.70 | 100.00
UFeats | 95.23 | 95.30 | 95.27 | 95.56
AllTags | 94.95 | 95.01 | 94.98 | 95.27
Lemmas | 0.00 | 0.00 | 0.00 | 0.00
UAS | 93.87 | 93.93 | 93.90 | 94.19
LAS | 91.38 | 91.44 | 91.41 | 91.68
CLAS | 90.31 | 90.18 | 90.24 | 90.47
MLAS | 84.47 | 84.35 | 84.41 | 84.62
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-small-russian-ud-embeds (Taiga)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 98.55 | 98.03 | 98.29 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 98.55 | 98.03 | 98.29 |
UPOS | 93.67 | 93.17 | 93.42 | 95.04
XPOS | 98.55 | 98.03 | 98.29 | 100.00
UFeats | 90.79 | 90.30 | 90.54 | 92.12
AllTags | 89.85 | 89.37 | 89.61 | 91.17
Lemmas | 0.07 | 0.07 | 0.07 | 0.07
UAS | 80.71 | 80.28 | 80.49 | 81.89
LAS | 76.81 | 76.40 | 76.60 | 77.93
CLAS | 80.00 | 79.14 | 79.56 | 79.61
MLAS | 73.00 | 72.21 | 72.60 | 72.64
BLEX | 0.04 | 0.04 | 0.04 | 0.04
*** KoichiYasuoka/modernbert-small-russian-ud-embeds (SyntagRus)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.00 | 98.22 | 98.61 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.00 | 98.22 | 98.61 |
UPOS | 96.49 | 95.73 | 96.10 | 97.46
XPOS | 99.00 | 98.22 | 98.61 | 100.00
UFeats | 93.58 | 92.84 | 93.21 | 94.52
AllTags | 93.25 | 92.52 | 92.88 | 94.19
Lemmas | 0.00 | 0.00 | 0.00 | 0.00
UAS | 86.93 | 86.25 | 86.59 | 87.81
LAS | 84.36 | 83.69 | 84.02 | 85.21
CLAS | 88.36 | 87.91 | 88.14 | 88.28
MLAS | 82.19 | 81.78 | 81.98 | 82.12
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-base-russian-ud-goeswith (Taiga)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.10 | 99.47 | 99.28 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.10 | 99.47 | 99.28 |
UPOS | 96.68 | 97.04 | 96.86 | 97.56
XPOS | 99.10 | 99.47 | 99.28 | 100.00
UFeats | 94.83 | 95.19 | 95.01 | 95.70
AllTags | 94.23 | 94.59 | 94.41 | 95.09
Lemmas | 0.05 | 0.05 | 0.05 | 0.05
UAS | 89.82 | 90.16 | 89.99 | 90.64
LAS | 86.93 | 87.25 | 87.09 | 87.72
CLAS | 85.68 | 85.59 | 85.64 | 86.11
MLAS | 80.12 | 80.04 | 80.08 | 80.52
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-base-russian-ud-goeswith (SyntagRus)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.69 | 99.75 | 99.72 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.69 | 99.75 | 99.72 |
UPOS | 98.34 | 98.41 | 98.37 | 98.65
XPOS | 99.69 | 99.75 | 99.72 | 100.00
UFeats | 96.10 | 96.17 | 96.13 | 96.40
AllTags | 95.83 | 95.89 | 95.86 | 96.13
Lemmas | 0.00 | 0.00 | 0.00 | 0.00
UAS | 94.63 | 94.70 | 94.67 | 94.93
LAS | 92.55 | 92.61 | 92.58 | 92.84
CLAS | 91.76 | 91.67 | 91.72 | 91.94
MLAS | 86.80 | 86.72 | 86.76 | 86.97
BLEX | 0.00 | 0.00 | 0.00 | 0.00
*** KoichiYasuoka/modernbert-base-russian-ud-embeds (Taiga)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 98.68 | 98.10 | 98.39 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 98.68 | 98.10 | 98.39 |
UPOS | 94.77 | 94.20 | 94.48 | 96.03
XPOS | 98.68 | 98.10 | 98.39 | 100.00
UFeats | 92.47 | 91.92 | 92.20 | 93.71
AllTags | 91.58 | 91.04 | 91.31 | 92.80
Lemmas | 0.07 | 0.07 | 0.07 | 0.07
UAS | 83.29 | 82.80 | 83.05 | 84.41
LAS | 80.25 | 79.77 | 80.01 | 81.32
CLAS | 82.44 | 81.90 | 82.17 | 82.29
MLAS | 76.21 | 75.72 | 75.96 | 76.07
BLEX | 0.04 | 0.04 | 0.04 | 0.04
*** KoichiYasuoka/modernbert-base-russian-ud-embeds (SyntagRus)
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 99.02 | 98.29 | 98.65 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 99.02 | 98.29 | 98.65 |
UPOS | 96.91 | 96.20 | 96.55 | 97.87
XPOS | 99.02 | 98.29 | 98.65 | 100.00
UFeats | 94.54 | 93.85 | 94.20 | 95.48
AllTags | 94.24 | 93.55 | 93.89 | 95.18
Lemmas | 0.00 | 0.00 | 0.00 | 0.00
UAS | 89.75 | 89.10 | 89.42 | 90.64
LAS | 87.65 | 87.01 | 87.33 | 88.52
CLAS | 90.01 | 89.61 | 89.81 | 89.91
MLAS | 84.73 | 84.35 | 84.54 | 84.64
BLEX | 0.00 | 0.00 | 0.00 | 0.00
Let's put the UPOS/LAS/MLAS F1 scores into a table.
Model | ru_taiga-ud-test.conllu | ru_syntagrus-ud-test.conllu |
---|---|---|
bert-base-russian-upos | 98.90/87.00/82.42 | 99.57/93.65/90.63 |
modernbert-small-russian-ud-goeswith | 96.25/85.37/76.88 | 98.05/91.41/84.41 |
modernbert-small-russian-ud-embeds | 93.42/76.60/72.60 | 96.10/84.02/81.98 |
modernbert-base-russian-ud-goeswith | 96.86/87.09/80.08 | 98.37/92.58/86.76 |
modernbert-base-russian-ud-embeds | 94.48/80.01/75.96 | 96.55/87.33/84.54 |
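
The UPOS/LAS/MLAS cells can be pulled out of the evaluator reports mechanically rather than by hand. The following is only a minimal sketch, assuming the report text has the same pipe-separated layout as the output shown earlier; `f1_scores` and `summary_cell` are hypothetical helper names, not part of conll18_ud_eval.py or the benchmark program.

```python
def f1_scores(report, metrics=("UPOS", "LAS", "MLAS")):
    """Return {metric: F1} parsed from the text printed by conll18_ud_eval.py -v."""
    scores = {}
    for line in report.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows look like: Metric | Precision | Recall | F1 Score | AligndAcc
        # Exact matching on the first cell keeps CLAS from being mistaken for LAS.
        if len(cells) >= 4 and cells[0] in metrics:
            scores[cells[0]] = float(cells[3])
    return scores

def summary_cell(report):
    """Format the three F1 scores as one UPOS/LAS/MLAS table cell."""
    s = f1_scores(report)
    return "/".join(f"{s[m]:.2f}" for m in ("UPOS", "LAS", "MLAS"))

# Excerpt of the bert-base-russian-upos (Taiga) report
sample = """\
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
UPOS       |     98.89 |     98.91 |     98.90 |     99.15
LAS        |     87.00 |     87.01 |     87.00 |     87.22
CLAS       |     85.46 |     85.35 |     85.41 |     85.65
MLAS       |     82.47 |     82.37 |     82.42 |     82.65
"""

print(summary_cell(sample))  # 98.90/87.00/82.42
```

Applying `summary_cell` to each block of a result.txt file should give one cell per model and test set.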
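"bert-base-russian-upos" is clearly the strongest overall. "modernbert-base-russian-ud-goeswith" follows closely (it is even marginally ahead on Taiga LAS, 87.09 vs. 87.00) but still falls a step short on UPOS and MLAS. However, the Biaffine module supar.model of "bert-base-russian-upos" is implemented internally with pickle, which is part of the reason Protect AI has branded it "unsafe". I wonder whether this part needs to be reimplemented.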
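
To illustrate why pickle-based model files tend to be flagged, here is a small sketch that lists the opcodes in a pickle stream that can import or call arbitrary objects at load time. It is not Protect AI's scanner, and it does not inspect supar.model itself. The opcode set `SUSPICIOUS`, the helper `suspicious_opcodes`, and the `Demo` class are assumptions made up for this example. Only the standard-library `pickle` and `pickletools` modules are used, and nothing is ever unpickled.

```python
import pickle
import pickletools

# Opcodes that let a pickle import names or call objects while loading
# (an illustrative selection, not an official list)
SUSPICIOUS = {"GLOBAL", "STACK_GLOBAL", "REDUCE", "INST", "OBJ", "NEWOBJ", "NEWOBJ_EX"}

def suspicious_opcodes(data: bytes):
    """Return the sorted names of suspicious opcodes found in a pickle stream."""
    return sorted({op.name for op, arg, pos in pickletools.genops(data)
                   if op.name in SUSPICIOUS})

class Demo:
    # Harmless stand-in: unpickling this object would call print(...)
    def __reduce__(self):
        return (print, ("side effect on load",))

plain = pickle.dumps({"las": 87.0})   # plain data only
crafted = pickle.dumps(Demo())        # carries a callable plus arguments

print(suspicious_opcodes(plain))
print(suspicious_opcodes(crafted))
```

The plain dictionary needs no such opcodes, while the crafted object does. Formats that store only tensors and plain data avoid the problem altogether, which is one direction a reimplementation could take.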