
Continuing from yesterday's article, I managed, with some effort, to get "LFM2-1.2B" running on Google Colaboratory.

# Transformers releases do not yet ship LFM2 support, so clone the development source
!test -d transformers || ( git clone --depth=1 https://github.com/huggingface/transformers transformers-all && ln -s transformers-all/src/transformers transformers )
from transformers import AutoTokenizer,Lfm2ForCausalLM,TextGenerationPipeline
tkz=AutoTokenizer.from_pretrained("LiquidAI/LFM2-1.2B")
mdl=Lfm2ForCausalLM.from_pretrained("LiquidAI/LFM2-1.2B")
tgn=TextGenerationPipeline(tokenizer=tkz,model=mdl,max_new_tokens=128)
# The pipeline returns a list of dicts; "generated_text" holds prompt plus continuation
nlp=lambda txt:tgn(txt)[0]["generated_text"]
print(nlp("国境の長いトンネルを抜けると雪国であった。夜の底が白くなった。"))

I had it generate a continuation of the opening lines of 『雪国』 (Snow Country); on my machine, I (Koichi Yasuoka) got the following output.

国境の長いトンネルを抜けると雪国であった。夜の底が白くなった。

どこへ行ったんだ?

A) 富士山

B) ボスコ・ヴェルー

C) 高尾山

D) おしべ川

答え: B) ボスコ・ヴェルー

Well, no: where the narrator actually goes is not Bosco or anywhere like that, but a quiet hot-spring town (presumably Yuzawa Onsen), so why is it not even among the choices? Following the method from my article of June 11, 2024, let's also try part-of-speech (UPOS) tagging via Few-Shot Prompting.

# Transformers releases do not yet ship LFM2 support, so clone the development source
!test -d transformers || ( git clone --depth=1 https://github.com/huggingface/transformers transformers-all && ln -s transformers-all/src/transformers transformers )
from transformers import AutoTokenizer,Lfm2ForCausalLM,TextGenerationPipeline
tkz=AutoTokenizer.from_pretrained("LiquidAI/LFM2-1.2B")
mdl=Lfm2ForCausalLM.from_pretrained("LiquidAI/LFM2-1.2B")
tgn=TextGenerationPipeline(tokenizer=tkz,model=mdl,max_new_tokens=128)
# Few-shot examples: each sentence renders as a ###text/###UPOS line pair
class TextUPOSList(list):
  __str__=lambda self:"\n".join("###text:"+"".join(t for t,u in s)+"\n###UPOS:"+"|".join(t+"_"+u for t,u in s) for s in self)+"\n"
ex=TextUPOSList()
ex.append([("一","NUM"),("直線","NOUN"),("に","ADP"),("伸びる","VERB"),("電撃","NOUN"),("を","ADP"),("放ち","VERB"),("、","PUNCT"),("電撃","NOUN"),("ダメージ","NOUN"),("を","ADP"),("与える","VERB"),("。","PUNCT")])
ex.append([("色々","ADV"),("と","ADP"),("面白い","ADJ"),("メニュー","NOUN"),("の","ADP"),("ある","VERB"),("店","NOUN"),("。","PUNCT")])
ex.append([("しかも","CCONJ"),("、","PUNCT"),("ここ","PRON"),("は","ADP"),("コース","NOUN"),("が","ADP"),("リーズナブル","ADJ"),("な","AUX"),("の","SCONJ"),("です","AUX"),("。","PUNCT")])
ex.append([("彼","PRON"),("は","ADP"),("コンピュータ","NOUN"),("を","ADP"),("個人","NOUN"),("の","ADP"),("持ち物","NOUN"),("に","ADP"),("し","VERB"),("まし","AUX"),("た","AUX"),("。","PUNCT")])
ex.append([("2007","NUM"),("年","NOUN"),("9","NUM"),("月","NOUN"),("現在","ADV"),("、","PUNCT"),("以下","NOUN"),("の","ADP"),("メーカー","NOUN"),("から","ADP"),("対応","NOUN"),("製品","NOUN"),("が","ADP"),("発売","VERB"),("さ","AUX"),("れ","AUX"),("て","SCONJ"),("いる","VERB"),("。","PUNCT")])
# Keep only the model's own continuation: the two lines after the len(ex)*2 example lines
nlp=lambda t:"\n".join(tgn(str(ex)+f"###text:{t}\n###UPOS:")[0]["generated_text"].split("\n")[len(ex)*2:len(ex)*2+2])
print(nlp("国境の長いトンネルを抜けると雪国であった。"))

On my machine, I got the following result.

###text:国境の長いトンネルを抜けると雪国であった。
###UPOS:国境_NOUN|の_ADP|長い_VERB|を_ADJ|トンネル_NOUN|を_ADP|抜ける_ADV|、_PUNCT|長い_ADJ|の_ADP|国境_NOUN|を_ADP|抜ける_VERB|、_PUNCT|長い_NOUN|の_ADP|雪_NOUN|を_ADP|つける_VERB|、_PUNCT|雪_ADJ|の_ADP|国境_NOUN|を_ADP|に_ADV|つける_VERB|。
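The slicing in the nlp lambda above is easy to misread, so here is a model-free sketch (my own illustration, with shortened made-up examples) of how the few-shot prompt is assembled and where the model's continuation is expected to start:

```python
# Each few-shot example occupies exactly two lines (###text and ###UPOS),
# so the model's own continuation begins at line len(ex)*2 of the
# generated text; the lambda keeps lines [len(ex)*2 : len(ex)*2+2].
class TextUPOSList(list):
    def __str__(self):
        return "\n".join(
            "###text:" + "".join(t for t, u in s)
            + "\n###UPOS:" + "|".join(t + "_" + u for t, u in s)
            for s in self
        ) + "\n"

ex = TextUPOSList()
ex.append([("色々", "ADV"), ("と", "ADP"), ("面白い", "ADJ"), ("店", "NOUN"), ("。", "PUNCT")])
ex.append([("彼", "PRON"), ("は", "ADP"), ("来", "VERB"), ("た", "AUX"), ("。", "PUNCT")])
prompt = str(ex) + "###text:雪国であった。\n###UPOS:"
lines = prompt.split("\n")
print(lines[len(ex) * 2])  # → ###text:雪国であった。  (the fresh sentence to tag)
```

The prompt deliberately ends mid-pair, on a bare `###UPOS:` line, inviting the model to complete that line with the token_UPOS sequence for the new sentence.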

POS tagging does not seem to go well either. Of course, bugs may still remain, so it might be better to wait for official LFM2 support in Transformers.
