BERT vs. me: competing on reading comprehension of the paper "Attention Is All You Need" (20 questions in total).

Last updated at 2022-01-08, posted at 2021-02-12

Overview

I previously wrote another article.
In it, I ran the code (about 30 lines) presented on the following site, which uses BERT to solve TOEIC Part 5 questions.
https://www.ai-shift.jp/techblog/281

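This time, I created TOEIC Part 5 style fill-in-the-blank questions from sentences in "Attention Is All You Need", the well-known paper that proposed the Transformer used in BERT, and competed against BERT on them.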

Wins and losses are decided as follows:

BERT is led to choose a wrong candidate: I win
BERT chooses the correct candidate: BERT wins
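
The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the code from the linked site: it assumes that a masked language model has already produced a list of predicted words ordered from most to least likely, and that the number in each "BERT's answer" pair in the results is the 0-based position of the chosen candidate in that list. The function names `parse_choices`, `pick_candidate`, and `judge` are hypothetical, and the short ranked list is a stand-in built from question 5 (top prediction "also", then "sometimes").

```python
import json


def parse_choices(field):
    """Split a field like '["a", "b"]##b' into (candidates, correct answer)."""
    raw_list, correct = field.rsplit("##", 1)
    return json.loads(raw_list), correct


def pick_candidate(ranked_tokens, candidates):
    """Return (rank, word) for the highest-ranked prediction that is a candidate.

    ranked_tokens is assumed to be ordered from most to least likely.
    Returns None when no candidate appears in the list.
    """
    wanted = set(candidates)
    for rank, token in enumerate(ranked_tokens):
        if token in wanted:
            return rank, token
    return None


def judge(answer, correct):
    """BERT wins only when the candidate it picked is the correct one."""
    if answer is not None and answer[1] == correct:
        return "BERT"
    return "Me"


# Question 5, with a stand-in ranking (assumption: only the first two entries
# are taken from the results; a real model would return many more words).
candidates, correct = parse_choices(
    '["always", "sometimes", "often", "only", "quickly"]##sometimes'
)
answer = pick_candidate(["also", "sometimes"], candidates)
print(answer, judge(answer, correct))
```

Restricting the search to the candidates is what lets BERT win a question even when its overall top prediction is some other word.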

Results

| Question 1 | | Remarks |
|---|---|---|
| Sentence | To the best of our * , however, the Transformer is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution. | |
| Choices | ["effort", "property", "knowledge", "method", "technology"]##knowledge | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'knowledge') | Top prediction when non-candidates are included: knowledge |
| Winner | BERT | |

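| Question 2 | | Remarks |
|---|---|---|
| Sentence | The dominant sequence * models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. | |
| Choices | ["transduction", "transformation", "translation", "method", "transfer"]##transduction | The candidate repeated after ## is the correct answer |
| BERT's answer | (68, 'method') | Top prediction when non-candidates are included: learning |
| Winner | Me | This is the opening sentence of the paper, so getting it wrong here is a pretty poor showing for BERT!!! |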

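| Question 3 | | Remarks |
|---|---|---|
| Sentence | We are excited about the future of attention-based models and plan to * them to other tasks. | |
| Choices | ["make", "apply", "have", "show", "append"]##apply | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'apply') | Top prediction when non-candidates are included: apply |
| Winner | BERT | |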

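| Question 4 | | Remarks |
|---|---|---|
| Sentence | We show that the Transformer * well to other tasks by applying it successfully to English constituency parsing both with large and limited training data. | |
| Choices | ["generalizes", "has", "makes", "does", "computes"]##generalizes | The candidate repeated after ## is the correct answer |
| BERT's answer | (6, 'does') | Top prediction when non-candidates are included: responds |
| Winner | Me | Thank you for picking "does". That will not do. Given the tone of the sentence, "does" is not the word here. |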

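| Question 5 | | Remarks |
|---|---|---|
| Sentence | Self-attention, * called intra-attention is an attention mechanism relating different positions of a single sequence in order to compute a representation of the sequence. | |
| Choices | ["always", "sometimes", "often", "only", "quickly"]##sometimes | The candidate repeated after ## is the correct answer |
| BERT's answer | (1, 'sometimes') | Top prediction when non-candidates are included: also |
| Winner | BERT | |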

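| Question 6 | | Remarks |
|---|---|---|
| Sentence | Due to the reduced dimension of each head, the total computational * is similar to that of single-head attention with full dimensionality. | |
| Choices | ["value", "power", "effort", "time", "cost"]##cost | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'cost') | Top prediction when non-candidates are included: cost |
| Winner | BERT | |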

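| Question 7 | | Remarks |
|---|---|---|
| Sentence | The encoder is * of a stack of N = 6 identical layers. | |
| Choices | ["consisted", "composed", "made", "devoted", "kind"]##composed | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'composed') | Top prediction when non-candidates are included: composed |
| Winner | BERT | This kind of question is easy, so... |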

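| Question 8 | | Remarks |
|---|---|---|
| Sentence | Most competitive neural sequence transduction models have an encoder-decoder *. | |
| Choices | ["structure", "format", "process", "architecture", "scheme"]##structure | The candidate repeated after ## is the correct answer |
| BERT's answer | (4, 'structure') | Top prediction when non-candidates are included: interface |
| Winner | BERT | |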

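| Question 9 | | Remarks |
|---|---|---|
| Sentence | An attention function can be described as mapping a query and a set of key-value pairs to an output, * the query, keys, values, and output are all vectors. | |
| Choices | ["that", "which", "where", "when", "so"]##where | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'where') | Top prediction when non-candidates are included: where |
| Winner | BERT | Probably easy for anyone who is good at English |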

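| Question 10 | | Remarks |
|---|---|---|
| Sentence | Multi-head attention allows the model to jointly attend to information from * representation subspaces at different positions. | |
| Choices | ["prominent", "sufficient", "efficient", "some", "different"]##different | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'different') | Top prediction when non-candidates are included: different |
| Winner | BERT | |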

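| Question 11 | | Remarks |
|---|---|---|
| Sentence | There are many choices of positional encodings, learned and *. | |
| Choices | ["modified", "fed", "trained", "made", "fixed"]##fixed | The candidate repeated after ## is the correct answer |
| BERT's answer | (90, 'modified') | Top prediction when non-candidates are included: learned |
| Winner | Me | The question itself may be a bit of a stretch. |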

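| Question 12 | | Remarks |
|---|---|---|
| Sentence | We chose the sinusoidal version because it may * the model to extrapolate to sequence lengths longer than the ones encountered during training. | |
| Choices | ["have", "allow", "follow", "introduce", "make"]##allow | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'allow') | Top prediction when non-candidates are included: allow |
| Winner | BERT | |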

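| Question 13 | | Remarks |
|---|---|---|
| Sentence | Even our base model * all previously published models and ensembles, at a fraction of the training cost of any of the competitive models. | |
| Choices | ["prefers", "surpasses", "has", "uses", "makes"]##surpasses | The candidate repeated after ## is the correct answer |
| BERT's answer | (3, 'uses') | Top prediction when non-candidates are included: includes |
| Winner | Me | |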

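| Question 14 | | Remarks |
|---|---|---|
| Sentence | Making generation less sequential is another research * of ours. | |
| Choices | ["income", "benefits", "purposes", "goals", "aims"]##goals | The candidate repeated after ## is the correct answer |
| BERT's answer | (229, 'goals') | Top prediction when non-candidates are included: project |
| Winner | BERT | The answer sat at position 229, so it seems I had plenty of openings to exploit. Too bad! |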

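| Question 15 | | Remarks |
|---|---|---|
| Sentence | Similarly, self-attention layers in the decoder allow each position in the decoder to attend to all positions in the decoder up to and including * position. | |
| Choices | ["one", "any", "that", "the", "some"]##that | The candidate repeated after ## is the correct answer |
| BERT's answer | (0, 'that') | Top prediction when non-candidates are included: that |
| Winner | BERT | |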

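| Question 16 | | Remarks |
|---|---|---|
| Sentence | We chose the sinusoidal * because it may allow the model to extrapolate to sequence lengths longer than the ones encountered during training. | |
| Choices | ["format", "type", "version", "style", "type"]##version | The candidate repeated after ## is the correct answer |
| BERT's answer | (13, 'version') | Top prediction when non-candidates are included: model |
| Winner | BERT | The answer sat at position 13, so it seems I had an opening here. Getting "version" right is actually a little impressive. |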

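| Question 17 | | Remarks |
|---|---|---|
| Sentence | Learning long-range dependencies is a * challenge in many sequence transduction tasks. | |
| Choices | ["main", "key", "new", "hot", "famous"]##key | The candidate repeated after ## is the correct answer |
| BERT's answer | (2, 'key') | Top prediction when non-candidates are included: major |
| Winner | BERT | |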

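| Question 18 | | Remarks |
|---|---|---|
| Sentence | While single-head attention is 0.9 BLEU worse than the best setting, quality also drops off with too many *. | |
| Choices | ["heads", "hands", "efforts", "eyes", "tools"]##heads | The candidate repeated after ## is the correct answer |
| BERT's answer | (6, 'eyes') | Top prediction when non-candidates are included: people |
| Winner | Me | Huh, so it fell for "eyes". |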

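| Question 19 | | Remarks |
|---|---|---|
| Sentence | For * tasks, the Transformer can be trained significantly faster than architectures based on recurrent or convolutional layers. | |
| Choices | ["transformation", "conversion", "transduction", "transfer", "translation"]##translation | The candidate repeated after ## is the correct answer |
| BERT's answer | (314, 'transformation') | Top prediction when non-candidates are included: specific |
| Winner | Me | This is a question BERT absolutely has to get right!! |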

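| Question 20 | | Remarks |
|---|---|---|
| Sentence | We are grateful to Nal Kalchbrenner and Stephan Gouws for their fruitful comments, corrections and *. | |
| Choices | ["inspiration", "inception", "ideas", "efforts", "mails"]##inspiration | The candidate repeated after ## is the correct answer |
| BERT's answer | (49, 'inspiration') | Top prediction when non-candidates are included: corrections |
| Winner | BERT | The answer sat at position 49, so here too it seems I had an opening. |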
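
As a quick cross-check, the twenty results above can be tallied with a short script. This is only a sketch: the data is transcribed by hand from the tables (position, BERT's pick, correct answer), the names `RESULTS` and `tally` are made up for illustration, and the label "Me" stands for the author.

```python
from collections import Counter

# (question, position of BERT's pick, BERT's pick, correct answer),
# transcribed from the tables above.
RESULTS = [
    (1, 0, "knowledge", "knowledge"),
    (2, 68, "method", "transduction"),
    (3, 0, "apply", "apply"),
    (4, 6, "does", "generalizes"),
    (5, 1, "sometimes", "sometimes"),
    (6, 0, "cost", "cost"),
    (7, 0, "composed", "composed"),
    (8, 4, "structure", "structure"),
    (9, 0, "where", "where"),
    (10, 0, "different", "different"),
    (11, 90, "modified", "fixed"),
    (12, 0, "allow", "allow"),
    (13, 3, "uses", "surpasses"),
    (14, 229, "goals", "goals"),
    (15, 0, "that", "that"),
    (16, 13, "version", "version"),
    (17, 2, "key", "key"),
    (18, 6, "eyes", "heads"),
    (19, 314, "transformation", "translation"),
    (20, 49, "inspiration", "inspiration"),
]


def tally(results):
    """Count wins per side: BERT wins when its pick equals the correct answer."""
    return Counter(
        "BERT" if pick == correct else "Me" for _, _, pick, correct in results
    )


wins = tally(RESULTS)

# Questions where the correct word was also BERT's overall top prediction.
top_hits = [q for q, pos, pick, correct in RESULTS if pos == 0 and pick == correct]

# Questions BERT won even though the correct word was not its top prediction.
rescued = [q for q, pos, pick, correct in RESULTS if pos > 0 and pick == correct]

print(wins["BERT"], wins["Me"])  # 14 6
print(len(top_hits), rescued)    # 8 [5, 8, 14, 16, 17, 20]
```

Separating the two lists shows how many of BERT's wins depended on the search being limited to the five candidates.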

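Summary

Nothing in particular.
With this approach, BERT is not at a jaw-dropping level. I would like to **come up with something that better brings out how impressive BERT is.** It is supposed to be impressive, after all.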

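Addendum (2022/01/08)

↓ Looking at the article below, I am starting to feel that it will not turn out to be all that impressive after all...

"Using fill-in-the-blank questions to confirm that AI such as BERT is still far from truly understanding text!!"
https://ai-de-seikei.hatenablog.com/entry/2022/01/08/165146

Quote (the "3" part is the fill-in-the-blank!!)

I just bought 9 apples at the store and ate 2 of them on the way home. So there are 3 left.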
