How to Run the Dependency Parsing Module Stanza on Fugaku's PyTorch-1.13.0

Posted at 2024-01-12

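The spack 0.19 build of PyTorch-1.13.0 on the supercomputer Fugaku does not include transformers so far, but it looks like Stanza can be installed on top of it and run. Let's rewrite the program from my article of two days ago for spack.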

#! /bin/bash
#PJM -L rscgrp=small
#PJM -L elapse=1:00:00
#PJM -L node=1
#PJM -j
#PJM -S

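# Enable spack and load the PyTorch 1.13.0 build (hash glqavnh)
. /vol0004/apps/oss/spack/share/spack/setup-env.sh
spack load py-torch@1.13.0/glqavnh
# Extract the primary group name from the output of id
G=`id | sed 's/^.*gid=[0-9]*(\([^)]*\)).*$/\1/'`
# Positional parameters: existing group directories first, then $HOME as a fallback
set `ls -d /vol*/$G /vol*/data/$G` $HOME
# Keep pip's --user packages, the Stanza models, and temporary files under $1/Stanza
export PYTHONUSERBASE=$1/Stanza
export PATH=$PYTHONUSERBASE/bin:$PATH
export STANZA_RESOURCES_DIR=$PYTHONUSERBASE/stanza_resources_dir
export TMPDIR=$PYTHONUSERBASE/tmp
mkdir -p $TMPDIR
pip3 install -U stanza deplacy tqdm typing_extensions --user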

P=$TMPDIR/stanza_eu.$PJM_JOBID.$$.py
cat << 'EOF' > $P
import stanza
nlp=stanza.Pipeline("eu")
doc=nlp("Gaur ama hil zen, gaur goizean ama hil nuen.")
import deplacy
print(deplacy.to_conllu(doc))
deplacy.render(doc)
EOF
python3 $P
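
The least obvious part of the script is how it chooses the install location: it takes the primary group name from `id`, lists whichever of `/vol*/$G` and `/vol*/data/$G` exist, and falls back to `$HOME` if none do. As a rough illustration only, the same selection could be sketched in Python as follows. The function names `pick_user_base` and `primary_group` are hypothetical, the plain string sort only approximates the ordering that `ls` produces, and the numeric-gid fallback is an assumption of this sketch that the shell script does not have.

```python
import glob
import grp
import os


def pick_user_base(group, home, patterns=("/vol*/{g}", "/vol*/data/{g}")):
    """Return <first existing group directory>/Stanza, or <home>/Stanza."""
    found = []
    for pattern in patterns:
        # glob returns only paths that exist, like `ls -d` with its errors ignored
        found.extend(glob.glob(pattern.format(g=group)))
    # `ls` sorts its operands; a plain string sort is an approximation of that
    found.sort()
    # The script appends $HOME after the listed directories and then uses $1
    first = (found + [home])[0]
    return os.path.join(first, "Stanza")


def primary_group():
    """Name of the caller's primary group (what the `id | sed` line extracts)."""
    gid = os.getgid()
    try:
        return grp.getgrgid(gid).gr_name
    except KeyError:
        # Assumption of this sketch: fall back to the numeric gid
        return str(gid)


if __name__ == "__main__":
    print(pick_user_base(primary_group(), os.path.expanduser("~")))
```

On a machine without any matching `/vol*` directory, this prints a path under the home directory, which corresponds to the `$HOME` fallback in the script.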

When I submitted it with pjsub from a Fugaku login node, the following result came back in about 20 minutes, including queue wait, installation, and downloads.

# text = Gaur ama hil zen, gaur goizean ama hil nuen.
# sent_id = 0
1	Gaur	gaur	ADV	_	_	3	advmod	_	start_char=0|end_char=4
2	ama	ama	NOUN	_	Case=Abs|Definite=Def|Number=Sing	3	nsubj	_	start_char=5|end_char=8
3	hil	hil	VERB	_	Aspect=Perf|VerbForm=Part	0	root	_	start_char=9|end_char=12
4	zen	izan	AUX	_	Mood=Ind|Number[abs]=Sing|Person[abs]=3|VerbForm=Fin	3	aux	_	start_char=13|end_char=16
5	,	,	PUNCT	_	_	9	punct	_	start_char=16|end_char=17
6	gaur	gaur	ADV	_	_	9	advmod	_	start_char=18|end_char=22
7	goizean	goiz	NOUN	_	Animacy=Inan|Case=Ine|Definite=Def|Number=Sing	9	obl	_	start_char=23|end_char=30
8	ama	ama	NOUN	_	Case=Abs|Definite=Def|Number=Sing	9	obj	_	start_char=31|end_char=34
9	hil	hil	VERB	_	Aspect=Perf|VerbForm=Part	3	conj	_	start_char=35|end_char=38
10	nuen	edun	AUX	_	Mood=Ind|Number[abs]=Sing|Number[erg]=Sing|Person[abs]=3|Person[erg]=1|VerbForm=Fin	9	aux	_	start_char=39|end_char=43
11	.	.	PUNCT	_	_	3	punct	_	start_char=43|end_char=44

Gaur    ADV   <════╗         advmod
ama     NOUN  <══╗ ║         nsubj
hil     VERB  ═╗═╝═╝═════╗═╗ root
zen     AUX   <╝         ║ ║ aux
,       PUNCT <════════╗ ║ ║ punct
gaur    ADV   <══════╗ ║ ║ ║ advmod
goizean NOUN  <════╗ ║ ║ ║ ║ obl
ama     NOUN  <══╗ ║ ║ ║ ║ ║ obj
hil     VERB  ═╗═╝═╝═╝═╝<╝ ║ conj
nuen    AUX   <╝           ║ aux
.       PUNCT <════════════╝ punct

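For the Basque example sentence "Gaur ama hil zen, gaur goizean ama hil nuen.", the dependency analysis in Universal Dependencies comes out properly, including the fact that the two occurrences of "ama hil" are linked by nsubj and obj respectively. This also satisfies Fugaku's request that Python be used via Spack (「PythonはSpackでのご使用をお願いします」), so for now I suppose it is better to run Stanza on Fugaku's spack 0.19 build of PyTorch-1.13.0.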
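
The nsubj/obj observation can also be checked mechanically from the CoNLL-U output. The following is a minimal sketch that uses only the Python standard library and is not part of the job script above: it copies the eleven token rows from the result, reads the ID, FORM, HEAD, and DEPREL columns, and prints the relation for each "ama". The helper name `read_conllu` is hypothetical.

```python
# Token rows copied from the job output (tab-separated CoNLL-U columns)
CONLLU = """\
# text = Gaur ama hil zen, gaur goizean ama hil nuen.
# sent_id = 0
1\tGaur\tgaur\tADV\t_\t_\t3\tadvmod\t_\tstart_char=0|end_char=4
2\tama\tama\tNOUN\t_\tCase=Abs|Definite=Def|Number=Sing\t3\tnsubj\t_\tstart_char=5|end_char=8
3\thil\thil\tVERB\t_\tAspect=Perf|VerbForm=Part\t0\troot\t_\tstart_char=9|end_char=12
4\tzen\tizan\tAUX\t_\tMood=Ind|Number[abs]=Sing|Person[abs]=3|VerbForm=Fin\t3\taux\t_\tstart_char=13|end_char=16
5\t,\t,\tPUNCT\t_\t_\t9\tpunct\t_\tstart_char=16|end_char=17
6\tgaur\tgaur\tADV\t_\t_\t9\tadvmod\t_\tstart_char=18|end_char=22
7\tgoizean\tgoiz\tNOUN\t_\tAnimacy=Inan|Case=Ine|Definite=Def|Number=Sing\t9\tobl\t_\tstart_char=23|end_char=30
8\tama\tama\tNOUN\t_\tCase=Abs|Definite=Def|Number=Sing\t9\tobj\t_\tstart_char=31|end_char=34
9\thil\thil\tVERB\t_\tAspect=Perf|VerbForm=Part\t3\tconj\t_\tstart_char=35|end_char=38
10\tnuen\tedun\tAUX\t_\tMood=Ind|Number[abs]=Sing|Number[erg]=Sing|Person[abs]=3|Person[erg]=1|VerbForm=Fin\t9\taux\t_\tstart_char=39|end_char=43
11\t.\t.\tPUNCT\t_\t_\t3\tpunct\t_\tstart_char=43|end_char=44
"""


def read_conllu(text):
    """Return one dict per word line, keeping only the columns used here."""
    tokens = []
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        cols = line.split("\t")
        if len(cols) != 10 or not cols[0].isdigit():
            continue  # skip multiword-token ranges and empty nodes
        tokens.append({
            "id": int(cols[0]),
            "form": cols[1],
            "head": int(cols[6]),
            "deprel": cols[7],
        })
    return tokens


tokens = read_conllu(CONLLU)
forms = {t["id"]: t["form"] for t in tokens}
forms[0] = "ROOT"  # head 0 marks the root of the sentence
for t in tokens:
    if t["form"] == "ama":
        print(f'{t["form"]} --{t["deprel"]}--> {forms[t["head"]]} (token {t["head"]})')
# ama --nsubj--> hil (token 3)
# ama --obj--> hil (token 9)
```

The first "ama" is the subject of the intransitive "hil zen", while the second is the object of the transitive "hil nuen", which is the contrast the example sentence was chosen to show.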
