To try NLPre-ZH, the Traditional Chinese benchmark from "NLPre: A Revised Approach towards Language-centric Benchmarking of Natural Language Preprocessing Systems" by Martyna Wiącek, Piotr Rybak, Łukasz Pszenny, and Alina Wróblewska, I ran a quick benchmark of the RoBERTa model I built for "Sequence-Labeling RoBERTa Model for Dependency-Parsing in Classical Chinese and Its Application to Vietnamese and Thai" and the Erlangshen DeBERTa models I released in my diary entry of January 3, 2023. On Google Colaboratory (with GPU), it goes like this:
!pip install transformers
models=["KoichiYasuoka/roberta-base-chinese-ud-goeswith","KoichiYasuoka/deberta-base-chinese-ud-goeswith","KoichiYasuoka/deberta-base-chinese-erlangshen-ud-goeswith","KoichiYasuoka/deberta-large-chinese-erlangshen-ud-goeswith","KoichiYasuoka/deberta-xlarge-chinese-erlangshen-ud-goeswith"]
import os,sys,subprocess
from transformers import pipeline
# Fetch the UD_Chinese-GSD treebank and copy out the train/dev/test splits
url="https://github.com/UniversalDependencies/UD_Chinese-GSD"
d=os.path.basename(url)
os.system(f"test -d {d} || git clone --depth=1 {url}")
os.system("for F in train dev test ; do cp "+d+"/*-$F.conllu $F.conllu ; done")
# Fetch the official CoNLL 2018 shared-task evaluation script
url="https://universaldependencies.org/conll18/conll18_ud_eval.py"
c=os.path.basename(url)
os.system(f"test -f {c} || curl -LO {url}")
# Collect the raw sentences from the "# text =" comment lines of test.conllu
with open("test.conllu","r",encoding="utf-8") as r:
  s=[t[8:].strip() for t in r if t.startswith("# text =")]
for mdl in models:
  nlp=pipeline("universal-dependencies",mdl,trust_remote_code=True,aggregation_strategy="simple",device=0)
  # Parse every sentence and write out the CoNLL-U result
  with open("result.conllu","w",encoding="utf-8") as w:
    for t in s:
      w.write(nlp(t))
  # Score the result against the gold test set
  p=subprocess.run([sys.executable,c,"-v","test.conllu","result.conllu"],encoding="utf-8",stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
  with open("result.txt","w",encoding="utf-8") as w:
    print(f"\n*** {mdl}",p.stdout,sep="\n",file=w)
  # Stash the per-model output in a directory named after the model
  os.system(f"mkdir -p {mdl} ; mv result.conllu result.txt {mdl}")
!cat `find . -name result.txt`
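The sentence list s above comes from the "# text = " comment lines of the CoNLL-U file, with t[8:] dropping the "# text =" prefix. A minimal standalone illustration of that slice, run on an inline CoNLL-U fragment instead of test.conllu:

```python
# Standalone sketch of the "# text =" extraction used in the benchmark script,
# applied to a small inline CoNLL-U fragment (token columns abbreviated).
sample = """# sent_id = 1
# text = 他叫湯姆去拿外衣。
1\t他\t他\tPRON\t_\t_\t2\tnsubj\t_\t_
# sent_id = 2
# text = 香港是一個不夜城。
1\t香港\t香港\tPROPN\t_\t_\t5\tnsubj\t_\t_
"""
s = [t[8:].strip() for t in sample.splitlines() if t.startswith("# text =")]
print(s)  # ['他叫湯姆去拿外衣。', '香港是一個不夜城。']
```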
On my machine, I (Koichi Yasuoka) got the following output.
*** KoichiYasuoka/roberta-base-chinese-ud-goeswith
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 96.90 | 96.99 | 96.94 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 96.90 | 96.99 | 96.94 |
UPOS | 89.10 | 89.18 | 89.14 | 91.95
XPOS | 0.00 | 0.00 | 0.00 | 0.00
UFeats | 95.74 | 95.83 | 95.79 | 98.81
AllTags | 0.00 | 0.00 | 0.00 | 0.00
Lemmas | 96.15 | 96.24 | 96.19 | 99.23
UAS | 74.91 | 74.98 | 74.95 | 77.31
LAS | 70.95 | 71.01 | 70.98 | 73.22
CLAS | 74.08 | 75.03 | 74.55 | 78.20
MLAS | 62.85 | 63.66 | 63.25 | 66.35
BLEX | 73.55 | 74.49 | 74.02 | 77.63
*** KoichiYasuoka/deberta-base-chinese-ud-goeswith
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 97.53 | 97.59 | 97.56 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 97.53 | 97.59 | 97.56 |
UPOS | 90.20 | 90.26 | 90.23 | 92.48
XPOS | 0.00 | 0.00 | 0.00 | 0.00
UFeats | 96.67 | 96.74 | 96.70 | 99.12
AllTags | 0.00 | 0.00 | 0.00 | 0.00
Lemmas | 96.78 | 96.84 | 96.81 | 99.23
UAS | 76.14 | 76.19 | 76.17 | 78.07
LAS | 72.22 | 72.27 | 72.25 | 74.05
CLAS | 75.35 | 76.42 | 75.88 | 78.99
MLAS | 65.04 | 65.97 | 65.51 | 68.19
BLEX | 74.81 | 75.88 | 75.34 | 78.43
*** KoichiYasuoka/deberta-base-chinese-erlangshen-ud-goeswith
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 97.50 | 97.49 | 97.50 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 97.50 | 97.49 | 97.50 |
UPOS | 90.12 | 90.11 | 90.11 | 92.43
XPOS | 0.00 | 0.00 | 0.00 | 0.00
UFeats | 96.79 | 96.79 | 96.79 | 99.27
AllTags | 0.00 | 0.00 | 0.00 | 0.00
Lemmas | 96.75 | 96.74 | 96.75 | 99.23
UAS | 77.38 | 77.37 | 77.38 | 79.36
LAS | 73.54 | 73.53 | 73.54 | 75.42
CLAS | 77.00 | 77.97 | 77.48 | 80.70
MLAS | 66.31 | 67.15 | 66.73 | 69.50
BLEX | 76.40 | 77.37 | 76.88 | 80.08
*** KoichiYasuoka/deberta-large-chinese-erlangshen-ud-goeswith
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 96.48 | 96.67 | 96.57 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 96.48 | 96.67 | 96.57 |
UPOS | 88.44 | 88.62 | 88.53 | 91.67
XPOS | 0.00 | 0.00 | 0.00 | 0.00
UFeats | 95.59 | 95.78 | 95.68 | 99.08
AllTags | 0.00 | 0.00 | 0.00 | 0.00
Lemmas | 95.75 | 95.94 | 95.84 | 99.24
UAS | 74.40 | 74.55 | 74.48 | 77.12
LAS | 70.45 | 70.59 | 70.52 | 73.02
CLAS | 73.00 | 74.17 | 73.58 | 77.74
MLAS | 62.73 | 63.74 | 63.23 | 66.80
BLEX | 72.42 | 73.58 | 72.99 | 77.12
*** KoichiYasuoka/deberta-xlarge-chinese-erlangshen-ud-goeswith
Metric | Precision | Recall | F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens | 96.07 | 96.22 | 96.14 |
Sentences | 100.00 | 100.00 | 100.00 |
Words | 96.07 | 96.22 | 96.14 |
UPOS | 88.11 | 88.25 | 88.18 | 91.71
XPOS | 0.00 | 0.00 | 0.00 | 0.00
UFeats | 95.13 | 95.28 | 95.20 | 99.02
AllTags | 0.00 | 0.00 | 0.00 | 0.00
Lemmas | 95.35 | 95.50 | 95.43 | 99.26
UAS | 73.90 | 74.02 | 73.96 | 76.93
LAS | 69.80 | 69.91 | 69.86 | 72.66
CLAS | 72.31 | 73.36 | 72.83 | 77.38
MLAS | 61.93 | 62.82 | 62.37 | 66.26
BLEX | 71.73 | 72.77 | 72.25 | 76.76
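Collecting the LAS/MLAS/BLEX F1 columns from the five tables above into one place makes the comparison easier; a quick check of which model leads on LAS (scores copied verbatim from the output):

```python
# LAS / MLAS / BLEX F1 scores, copied from the evaluation output above,
# keyed by short model name; pick the best model by LAS.
scores = {
    "roberta-base-chinese-ud-goeswith":              (70.98, 63.25, 74.02),
    "deberta-base-chinese-ud-goeswith":              (72.25, 65.51, 75.34),
    "deberta-base-chinese-erlangshen-ud-goeswith":   (73.54, 66.73, 76.88),
    "deberta-large-chinese-erlangshen-ud-goeswith":  (70.52, 63.23, 72.99),
    "deberta-xlarge-chinese-erlangshen-ud-goeswith": (69.86, 62.37, 72.25),
}
best = max(scores, key=lambda m: scores[m][0])
print(best)  # deberta-base-chinese-erlangshen-ud-goeswith
```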
At a glance, deberta-base-chinese-erlangshen-ud-goeswith, with LAS/MLAS/BLEX of 73.54/66.73/76.88, looks the best of the bunch. So I renamed its output file to ud/gsd/test.conllu, packed it into a zip, and submitted it to NLPre-ZH. On the NLPre-ZH "Leaderboard - UD Tagset", however, LAS/MLAS/BLEX somehow came out as 83.05/78.04/81.47. No way it should be that high; could it be that NLPre-ZH is still using Universal Dependencies 2.9?
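The repackaging step can be sketched as follows; this is only an illustration, assuming the per-model directory layout produced by the benchmark script above, and the zip file name nlpre-zh-submission.zip is my own placeholder (the stub-file fallback just lets the sketch run anywhere):

```python
import os, zipfile

# Repackage the chosen model's parse of the test set as ud/gsd/test.conllu,
# the archive layout used for the NLPre-ZH submission described above.
# Assumption: result.conllu was moved into a per-model directory earlier.
src = "KoichiYasuoka/deberta-base-chinese-erlangshen-ud-goeswith/result.conllu"
if not os.path.isfile(src):
    # Stub file so this sketch also runs without the benchmark output present
    os.makedirs(os.path.dirname(src), exist_ok=True)
    open(src, "w", encoding="utf-8").close()
with zipfile.ZipFile("nlpre-zh-submission.zip", "w", zipfile.ZIP_DEFLATED) as z:
    z.write(src, arcname="ud/gsd/test.conllu")
print(zipfile.ZipFile("nlpre-zh-submission.zip").namelist())  # ['ud/gsd/test.conllu']
```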