Measuring the "accuracy" of Ukrainian dependency parsing models on the test sets of UD_Ukrainian-{IU,ParlaMint}

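Following up on my August 3 article, I also built prototypes of "modernbert-large-ukrainian-ud-goeswith" and "bert-large-ukrainian-ud-goeswith", so I measured the "accuracy" of Ukrainian dependency parsing, RoBERTa models included, on uk_iu-ud-test.conllu from UD_Ukrainian-IU and uk_parlamint-ud-test.conllu from UD_Ukrainian-ParlaMint. On Google Colaboratory, it goes like this.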

!pip install esupar transformers triton
# Models to compare
models=[
  "KoichiYasuoka/roberta-base-ukrainian-upos",
  "KoichiYasuoka/roberta-base-ukrainian-ud-goeswith",
  "KoichiYasuoka/roberta-base-wechsel-ukrainian-ud-goeswith",
  "KoichiYasuoka/roberta-large-wechsel-ukrainian-ud-goeswith",
  "KoichiYasuoka/bert-large-ukrainian-ud-goeswith",
  "KoichiYasuoka/modernbert-large-ukrainian-ud-goeswith",
  "KoichiYasuoka/modernbert-large-ukrainian-ud-embeds"
]
import os,sys,subprocess
# Download the gold test sets unless they are already present
url="https://github.com/UniversalDependencies/UD_Ukrainian-"
tests=["IU","ParlaMint"]
for t in tests:
  u=url+t+"/raw/refs/heads/master/uk_"+t.lower()+"-ud-test.conllu"
  f=os.path.basename(u)
  os.system(f"test -f {f} || curl -LO {u}")
# Download the CoNLL 2018 evaluation script unless it is already present
url="https://universaldependencies.org/conll18/conll18_ud_eval.py"
c=os.path.basename(url)
os.system(f"test -f {c} || curl -LO {url}")
for mdl in models:
  # "-upos" models are loaded through esupar, the others through a transformers pipeline
  if mdl.endswith("-upos"):
    import esupar
    nlp=esupar.load(mdl)
  else:
    from transformers import pipeline
    nlp=pipeline("universal-dependencies",mdl,trust_remote_code=True,aggregation_strategy="simple")
  for f in tests:
    # Parse the raw "# text =" sentences and write the predictions to a file named IU or ParlaMint
    with open(f"uk_{f.lower()}-ud-test.conllu","r",encoding="utf-8") as r:
      s=[t[8:].strip() for t in r if t.startswith("# text =")]
    with open(f,"w",encoding="utf-8") as w:
      for t in s:
        w.write(str(nlp(t)).strip()+"\n\n")
  # Evaluate the predictions against the gold files and keep one report file per model
  os.system(f"mkdir -p result/{mdl}")
  with open(f"result/{mdl}/result.txt","w",encoding="utf-8") as w:
    for f in tests:
      p=subprocess.run([sys.executable,c,"-v",f"uk_{f.lower()}-ud-test.conllu",f],encoding="utf-8",stdout=subprocess.PIPE,stderr=subprocess.STDOUT)
      print(f"\n*** {mdl} ({f})",p.stdout,sep="\n",file=w)
# Print all reports in the order of the models list
!( cd result && cat `find {" ".join(models)} -name result.txt` )
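
The script above feeds each model only the raw sentence text taken from the `# text =` comment lines of the gold CoNLL-U file, so tokenization is evaluated together with tagging and parsing. A minimal sketch of that extraction step follows; the `extract_texts` helper name and the two-sentence sample are illustrative assumptions, not part of the original script or of the treebanks.

```python
def extract_texts(lines):
    """Return the raw sentences stored in '# text =' comment lines."""
    prefix = "# text ="
    return [line[len(prefix):].strip() for line in lines if line.startswith(prefix)]

# Hypothetical CoNLL-U fragment; the annotation columns are left as "_"
sample = """# sent_id = 1
# text = Це речення.
1\tЦе\t_\t_\t_\t_\t_\t_\t_\t_

# sent_id = 2
# text = Друге речення.
""".splitlines()

texts = extract_texts(sample)
print(texts)
```

Only the comment lines are used here; the gold token rows come into play later, when conll18_ud_eval.py aligns them with the model output.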

On my (Koichi Yasuoka's) machine, the following results were output.

*** KoichiYasuoka/roberta-base-ukrainian-upos (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.70 |     99.63 |     99.67 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.69 |     99.61 |     99.65 |
UPOS       |     96.90 |     96.81 |     96.85 |     97.20
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     91.59 |     91.51 |     91.55 |     91.88
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     88.97 |     88.89 |     88.93 |     89.24
LAS        |     86.18 |     86.10 |     86.14 |     86.44
CLAS       |     83.21 |     82.92 |     83.06 |     83.34
MLAS       |     72.93 |     72.68 |     72.80 |     73.05
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-base-ukrainian-upos (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.77 |     99.91 |     99.84 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.77 |     99.91 |     99.84 |
UPOS       |     97.89 |     98.03 |     97.96 |     98.12
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     92.57 |     92.70 |     92.63 |     92.78
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     92.14 |     92.27 |     92.21 |     92.35
LAS        |     89.13 |     89.26 |     89.19 |     89.34
CLAS       |     86.54 |     86.53 |     86.54 |     86.63
MLAS       |     76.98 |     76.97 |     76.97 |     77.05
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-base-ukrainian-ud-goeswith (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.54 |     99.11 |     99.32 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.53 |     99.09 |     99.31 |
UPOS       |     96.40 |     95.97 |     96.19 |     96.86
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     91.30 |     90.89 |     91.09 |     91.73
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     87.20 |     86.82 |     87.01 |     87.61
LAS        |     83.38 |     83.02 |     83.20 |     83.78
CLAS       |     80.15 |     79.24 |     79.70 |     80.08
MLAS       |     72.01 |     71.19 |     71.60 |     71.95
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-base-ukrainian-ud-goeswith (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.51 |     99.77 |     99.64 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.51 |     99.77 |     99.64 |
UPOS       |     97.45 |     97.71 |     97.58 |     97.93
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     92.67 |     92.92 |     92.79 |     93.13
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     90.62 |     90.86 |     90.74 |     91.07
LAS        |     87.04 |     87.27 |     87.15 |     87.47
CLAS       |     84.25 |     84.09 |     84.17 |     84.38
MLAS       |     76.71 |     76.57 |     76.64 |     76.83
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-base-wechsel-ukrainian-ud-goeswith (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     98.58 |     97.29 |     97.93 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     98.57 |     97.27 |     97.92 |
UPOS       |     96.13 |     94.86 |     95.49 |     97.52
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     91.28 |     90.08 |     90.68 |     92.61
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     92.14 |     90.93 |     91.53 |     93.48
LAS        |     88.66 |     87.49 |     88.07 |     89.94
CLAS       |     87.59 |     86.99 |     87.29 |     87.70
MLAS       |     78.59 |     78.05 |     78.32 |     78.69
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-base-wechsel-ukrainian-ud-goeswith (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.19 |     99.09 |     99.14 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.19 |     99.09 |     99.14 |
UPOS       |     97.40 |     97.30 |     97.35 |     98.19
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     92.62 |     92.52 |     92.57 |     93.38
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     93.39 |     93.30 |     93.35 |     94.16
LAS        |     90.05 |     89.96 |     90.00 |     90.79
CLAS       |     88.14 |     88.20 |     88.17 |     88.41
MLAS       |     79.65 |     79.70 |     79.68 |     79.90
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-large-wechsel-ukrainian-ud-goeswith (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     98.45 |     97.25 |     97.85 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     98.44 |     97.23 |     97.83 |
UPOS       |     96.29 |     95.11 |     95.70 |     97.82
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     91.84 |     90.72 |     91.28 |     93.30
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     91.44 |     90.32 |     90.88 |     92.90
LAS        |     88.40 |     87.32 |     87.86 |     89.81
CLAS       |     87.15 |     86.76 |     86.95 |     87.50
MLAS       |     79.46 |     79.10 |     79.28 |     79.77
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/roberta-large-wechsel-ukrainian-ud-goeswith (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.25 |     99.10 |     99.17 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.25 |     99.10 |     99.17 |
UPOS       |     97.35 |     97.20 |     97.27 |     98.08
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     93.06 |     92.92 |     92.99 |     93.76
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     93.64 |     93.49 |     93.56 |     94.34
LAS        |     90.62 |     90.48 |     90.55 |     91.30
CLAS       |     88.88 |     88.95 |     88.92 |     89.16
MLAS       |     81.06 |     81.13 |     81.09 |     81.31
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/bert-large-ukrainian-ud-goeswith (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.71 |     99.55 |     99.63 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.70 |     99.52 |     99.61 |
UPOS       |     97.49 |     97.32 |     97.41 |     97.79
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     93.54 |     93.38 |     93.46 |     93.83
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     93.54 |     93.38 |     93.46 |     93.83
LAS        |     90.21 |     90.06 |     90.13 |     90.49
CLAS       |     87.88 |     87.49 |     87.68 |     87.91
MLAS       |     80.34 |     79.99 |     80.16 |     80.37
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/bert-large-ukrainian-ud-goeswith (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.52 |     99.81 |     99.66 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.52 |     99.81 |     99.66 |
UPOS       |     97.70 |     97.98 |     97.84 |     98.17
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     93.79 |     94.07 |     93.93 |     94.25
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     93.66 |     93.93 |     93.79 |     94.11
LAS        |     90.84 |     91.11 |     90.98 |     91.28
CLAS       |     88.79 |     88.85 |     88.82 |     89.11
MLAS       |     81.57 |     81.63 |     81.60 |     81.86
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/modernbert-large-ukrainian-ud-goeswith (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.74 |     99.41 |     99.58 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.73 |     99.38 |     99.56 |
UPOS       |     97.70 |     97.36 |     97.53 |     97.97
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     94.51 |     94.18 |     94.34 |     94.76
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     94.57 |     94.24 |     94.41 |     94.83
LAS        |     91.78 |     91.46 |     91.62 |     92.02
CLAS       |     90.04 |     89.56 |     89.80 |     90.12
MLAS       |     83.50 |     83.05 |     83.28 |     83.58
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/modernbert-large-ukrainian-ud-goeswith (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.74 |     99.90 |     99.82 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.74 |     99.90 |     99.82 |
UPOS       |     98.13 |     98.28 |     98.21 |     98.38
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     94.30 |     94.45 |     94.38 |     94.54
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     94.59 |     94.73 |     94.66 |     94.83
LAS        |     91.91 |     92.05 |     91.98 |     92.14
CLAS       |     90.26 |     90.44 |     90.35 |     90.58
MLAS       |     82.90 |     83.07 |     82.98 |     83.20
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/modernbert-large-ukrainian-ud-embeds (IU)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.72 |     99.48 |     99.60 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.71 |     99.45 |     99.58 |
UPOS       |     97.08 |     96.83 |     96.95 |     97.36
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     92.69 |     92.45 |     92.57 |     92.96
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.01 |      0.01 |      0.01 |      0.01
UAS        |     91.16 |     90.93 |     91.04 |     91.43
LAS        |     87.82 |     87.59 |     87.71 |     88.07
CLAS       |     85.69 |     85.04 |     85.36 |     85.57
MLAS       |     77.82 |     77.22 |     77.52 |     77.70
BLEX       |      0.00 |      0.00 |      0.00 |      0.00

*** KoichiYasuoka/modernbert-large-ukrainian-ud-embeds (ParlaMint)
Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
Tokens     |     99.63 |     99.85 |     99.74 |
Sentences  |    100.00 |    100.00 |    100.00 |
Words      |     99.63 |     99.85 |     99.74 |
UPOS       |     97.75 |     97.97 |     97.86 |     98.12
XPOS       |      0.00 |      0.00 |      0.00 |      0.00
UFeats     |     93.28 |     93.49 |     93.39 |     93.63
AllTags    |      0.00 |      0.00 |      0.00 |      0.00
Lemmas     |      0.00 |      0.00 |      0.00 |      0.00
UAS        |     91.89 |     92.10 |     92.00 |     92.24
LAS        |     88.62 |     88.83 |     88.73 |     88.96
CLAS       |     86.92 |     87.07 |     87.00 |     87.26
MLAS       |     79.33 |     79.46 |     79.39 |     79.63
BLEX       |      0.00 |      0.00 |      0.00 |      0.00
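
As a reading aid for the reports above: conll18_ud_eval.py derives precision from the number of correct items over the items in the system output, recall from the correct items over the items in the gold file, and F1 as their harmonic mean. The sketch below illustrates the arithmetic only; the `prf` helper and the counts passed to it are hypothetical, not taken from the test sets.

```python
def prf(correct, system_total, gold_total):
    """Precision, recall and F1 from raw counts, as fractions in [0, 1]."""
    precision = correct / system_total if system_total else 0.0
    recall = correct / gold_total if gold_total else 0.0
    total = system_total + gold_total
    f1 = 2 * correct / total if total else 0.0
    return precision, recall, f1

# Hypothetical counts, chosen only to make the arithmetic easy to follow
p, r, f = prf(correct=90, system_total=100, gold_total=120)
print(f"{100*p:.2f} {100*r:.2f} {100*f:.2f}")

# Cross-check against one reported row (LAS of roberta-base-ukrainian-upos on IU):
# the harmonic mean of the rounded precision and recall matches the reported F1
# up to rounding.
harmonic = 2 * 86.18 * 86.10 / (86.18 + 86.10)
print(f"{harmonic:.2f}")
```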

UPOS/LAS/MLASを表にしてみよう。

Model                                       | uk_iu-ud-test.conllu | uk_parlamint-ud-test.conllu
--------------------------------------------+----------------------+----------------------------
roberta-base-ukrainian-upos                 | 96.85/86.14/72.80    | 97.96/89.19/76.97
roberta-base-ukrainian-ud-goeswith          | 96.19/83.20/71.60    | 97.58/87.15/76.64
roberta-base-wechsel-ukrainian-ud-goeswith  | 95.49/88.07/78.32    | 97.35/90.00/79.68
roberta-large-wechsel-ukrainian-ud-goeswith | 95.70/87.86/79.28    | 97.27/90.55/81.09
bert-large-ukrainian-ud-goeswith            | 97.41/90.13/80.16    | 97.84/90.98/81.60
modernbert-large-ukrainian-ud-goeswith      | 97.53/91.62/83.28    | 98.21/91.98/82.98
modernbert-large-ukrainian-ud-embeds        | 96.95/87.71/77.52    | 97.86/88.73/79.39
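
Each UPOS/LAS/MLAS triple in the summary above is the F1 column of the corresponding evaluation report. A minimal sketch of how such a triple could be pulled out of one report automatically; the `f1_scores` helper is a hypothetical name, and the report text is an abridged copy of the modernbert-large-ukrainian-ud-goeswith (IU) output.

```python
def f1_scores(report, metrics=("UPOS", "LAS", "MLAS")):
    """Join the F1 values of the requested metrics from one evaluation table."""
    scores = {}
    for line in report.splitlines():
        cells = [c.strip() for c in line.split("|")]
        # Data rows have the metric name first and the F1 score in the fourth cell
        if len(cells) >= 4 and cells[0] in metrics:
            scores[cells[0]] = cells[3]
    return "/".join(scores[m] for m in metrics)

report = """Metric     | Precision |    Recall |  F1 Score | AligndAcc
-----------+-----------+-----------+-----------+-----------
UPOS       |     97.70 |     97.36 |     97.53 |     97.97
UAS        |     94.57 |     94.24 |     94.41 |     94.83
LAS        |     91.78 |     91.46 |     91.62 |     92.02
CLAS       |     90.04 |     89.56 |     89.80 |     90.12
MLAS       |     83.50 |     83.05 |     83.28 |     83.58
"""

print(f1_scores(report))  # 97.53/91.62/83.28
```

Matching on the exact metric name in the first cell keeps CLAS from being mistaken for LAS.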

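As expected, ModernBERT is the most accurate: modernbert-large-ukrainian-ud-goeswith has the best UPOS, LAS, and MLAS on both test sets. However, modernbert-large-ukrainian-ud-embeds, which relies on an upper triangular matrix, is underwhelming (its LAS is 3.91 points lower on IU and 3.25 points lower on ParlaMint than that of the goeswith model), so the tuning method seems to need some more thought.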
