What does NeoBERT fill in for the [MASK] in "It don't [MASK] a thing if it ain't got that swing"?

With Lola Le Breton, Quentin Fournier, Mariam El Mezouar, and Sarath Chandar's "NeoBERT: A Next-Generation BERT" open beside me, I decided to try NeoBERT out. On Google Colaboratory, it goes like this.

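!pip install -U transformers xformers torchvision
from transformers import AutoTokenizer,AutoModelForMaskedLM,FillMaskPipeline
tkz=AutoTokenizer.from_pretrained("chandar-lab/NeoBERT",trust_remote_code=True)  # assumed Hugging Face model ID
mdl=AutoModelForMaskedLM.from_pretrained("chandar-lab/NeoBERT",trust_remote_code=True)  # runs the repository's own model.py
fmp=FillMaskPipeline(tokenizer=tkz,model=mdl)
print(fmp("It don't [MASK] a thing if it ain't got that swing"))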

I tested what gets filled into the [MASK] of "It don't [MASK] a thing if it ain't got that swing", and on my (Koichi Yasuoka's) machine I got the following result.

[{'score': 0.9794992804527283, 'token': 2812, 'token_str': 'mean', 'sequence': "it don ' t mean a thing if it ain ' t got that swing"}, {'score': 0.0026010603178292513, 'token': 2342, 'token_str': 'need', 'sequence': "it don ' t need a thing if it ain ' t got that swing"}, {'score': 0.002511243801563978, 'token': 3465, 'token_str': 'cost', 'sequence': "it don ' t cost a thing if it ain ' t got that swing"}, {'score': 0.002277001040056348, 'token': 17042, 'token_str': 'weigh', 'sequence': "it don ' t weigh a thing if it ain ' t got that swing"}, {'score': 0.002097407588735223, 'token': 2965, 'token_str': 'means', 'sequence': "it don ' t means a thing if it ain ' t got that swing"}]
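To read the list above more easily, here is a small sketch that picks out the best candidate and adds up the probability mass of the five candidates. The token strings and scores are copied from the output above; the helper name `summarize` is my own and not part of transformers.

```python
# Candidates copied from the fill-mask output above.
results = [
    {"token_str": "mean", "score": 0.9794992804527283},
    {"token_str": "need", "score": 0.0026010603178292513},
    {"token_str": "cost", "score": 0.002511243801563978},
    {"token_str": "weigh", "score": 0.002277001040056348},
    {"token_str": "means", "score": 0.002097407588735223},
]


def summarize(results):
    """Return (best token, best score, summed score of the listed candidates)."""
    best = max(results, key=lambda r: r["score"])
    total = sum(r["score"] for r in results)
    return best["token_str"], best["score"], total


best_token, best_score, total = summarize(results)
print(f"{best_token}: {best_score:.1%} (top-{len(results)} mass: {total:.1%})")
# prints: mean: 97.9% (top-5 mass: 98.9%)
```

The four runners-up together account for only about 1% of the probability, so the remaining mass is spread thinly over the rest of the vocabulary.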

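With "mean" far ahead of the rest at 98%, NeoBERT presumably knows this English sentence. However, I cannot find sequence labeling (token classification) in NeoBERT's model.py. https://github.com/chandar-lab/NeoBERT is also unreachable for now. Even with an input/output width of 4096 tokens, there is too little information to use it for dependency parsing, but perhaps things will work out if I wait.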
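For reference, a token classification head is usually little more than dropout plus a linear layer applied to each token's hidden vector. The following is a generic PyTorch sketch of that idea, not NeoBERT's code: the class name `TokenClassificationHead`, the hidden size 768, and the label count 17 (the number of Universal Dependencies UPOS tags) are assumptions for illustration, and a random tensor stands in for real encoder output.

```python
import torch
from torch import nn


class TokenClassificationHead(nn.Module):
    """Hypothetical head: maps each token's hidden vector to label logits."""

    def __init__(self, hidden_size: int, num_labels: int, dropout: float = 0.1):
        super().__init__()
        self.dropout = nn.Dropout(dropout)
        self.classifier = nn.Linear(hidden_size, num_labels)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # hidden_states: (batch, seq_len, hidden_size) from any encoder
        return self.classifier(self.dropout(hidden_states))


# Assumed sizes: hidden size 768, 17 labels (e.g. UPOS tags).
head = TokenClassificationHead(hidden_size=768, num_labels=17)
head.eval()  # disable dropout for a deterministic forward pass

# Random stand-in for the encoder output of one 12-token sentence.
hidden = torch.randn(1, 12, 768)
with torch.no_grad():
    logits = head(hidden)

labels = logits.argmax(dim=-1)  # one predicted label id per token
print(tuple(logits.shape), tuple(labels.shape))
```

With an untrained head the predicted labels are meaningless; the point is only the shape of the computation, one label distribution per input token, which is what sequence labeling and, further down the line, dependency parsing would need.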

