Evaluating the NLP Model BERT (7) - GLUE Benchmark (Part 5)

Posted at 2022-04-07 (last updated at 2022-04-07)

[Previous] Evaluating the NLP Model BERT (6) - GLUE Benchmark (Part 4)

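Introduction

This time we look at the GLUE benchmark tasks that judge whether two input sentences are semantically equivalent.
We test the following two tasks.

QQP: judge whether two questions are semantically equivalent
MRPC: judge whether two sentences are equivalent (paraphrases of each other)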

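Procedure

See the following article.
Evaluating the NLP Model BERT (3) - GLUE Benchmark (Part 1)

Only one step in that procedure changes, "Choose a task from GLUE":
you need to reselect the corresponding task from the dropdown menu.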
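
As a rough illustration of what changes when a different task is picked, the sketch below maps a task name to its two text fields. The `TASKS` table and the `select_task` helper are hypothetical names introduced for this example and are not part of the notebook. The field names are taken from the output printed in the results sections, and the dataset names assume the usual TensorFlow Datasets naming (`glue/qqp`, `glue/mrpc`).

```python
# Hypothetical lookup table: dataset name -> names of the two text features.
# The feature names match the fields printed in the results of this article.
TASKS = {
    "glue/qqp": ("question1", "question2"),
    "glue/mrpc": ("sentence1", "sentence2"),
}


def select_task(tfds_name):
    """Return the settings that differ when another GLUE task is chosen."""
    if tfds_name not in TASKS:
        raise ValueError(f"unsupported task: {tfds_name}")
    return {
        "tfds_name": tfds_name,
        "sentence_features": list(TASKS[tfds_name]),
        "num_classes": 2,  # both tasks are binary classification
    }


print(select_task("glue/qqp"))
print(select_task("glue/mrpc"))
```

Everything else in the procedure (preprocessing, fine-tuning, evaluation) stays the same and simply reads these settings.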

The rest of the process is the same as last time, so it is omitted here.
Below are the test results and a comparison.

QQP: judging whether two questions are semantically equivalent

Note: fine-tuning took 46 minutes...

question1: tf.Tensor([b'Why do parents have to beat up their children?'], shape=(1,), dtype=string)
question2: tf.Tensor([b'Why do children hit their parents during their childhood?'], shape=(1,), dtype=string)
Questions are similar
BERT raw results: tf.Tensor([-2.197264   1.4847997], shape=(2,), dtype=float32)

question1: tf.Tensor([b'How can someone make money online without any money?'], shape=(1,), dtype=string)
question2: tf.Tensor([b'Is there any way I can earn money online without any kind of investment?'], shape=(1,), dtype=string)
Questions are NOT similar
BERT raw results: tf.Tensor([ 1.1174352 -0.5453076], shape=(2,), dtype=float32)

question1: tf.Tensor([b'If a space shuttle have reached the speed of light, which habitable planet that you will travel first?'], shape=(1,), dtype=string)
question2: tf.Tensor([b'Is it possible for humans to ever travel at (or near) the speed of light?'], shape=(1,), dtype=string)
Questions are NOT similar
BERT raw results: tf.Tensor([ 4.424968  -3.3564684], shape=(2,), dtype=float32)

question1: tf.Tensor([b'How does Britain voting to leave European Union affect Indian students pursuing higher education in UK?'], shape=(1,), dtype=string)
question2: tf.Tensor([b'What exactly is BREXIT? How does it impact India? What are the repercussions of world economic order post brexit?'], shape=(1,), dtype=string)
Questions are NOT similar
BERT raw results: tf.Tensor([ 5.2118597 -4.5501575], shape=(2,), dtype=float32)

question1: tf.Tensor([b'What are some creative DIY gift ideas for a girlfriend?'], shape=(1,), dtype=string)
question2: tf.Tensor([b'How do I find the best handmade crafts?'], shape=(1,), dtype=string)
Questions are NOT similar
BERT raw results: tf.Tensor([ 5.3236904 -4.571609 ], shape=(2,), dtype=float32)

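The model judged whether each pair of input sentences is equivalent.
The first pair being judged equivalent is debatable, but its content made me laugh (an American joke?).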

Input sentence 1

Why do parents have to beat up their children?

Input sentence 2

Why do children hit their parents during their childhood?
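
The "BERT raw results" values above are two unnormalized scores (logits) per pair. A minimal sketch of how they can be read, assuming the printed label is simply the index of the larger score (index 1 meaning "similar", which is consistent with the output above) and using a softmax to turn the scores into a rough confidence. The `softmax` and `interpret` helpers are written for this example and are not taken from the notebook.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def interpret(logits, labels=("NOT similar", "similar")):
    """Return the label with the highest score and its softmax probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# First three QQP results copied from the output above.
qqp_logits = [
    [-2.197264, 1.4847997],
    [1.1174352, -0.5453076],
    [4.424968, -3.3564684],
]
for logits in qqp_logits:
    label, prob = interpret(logits)
    print(f"{label}: {prob:.3f}")
```

Read this way, the debatable first pair is a fairly confident "similar", while the second pair (two questions about earning money online) is a noticeably less confident "NOT similar" than the third.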

MRPC: judging whether two sentences are equivalent

Note: fine-tuning for this task took only 2 minutes.

sentence1: tf.Tensor([b'Last year , he made an unsuccessful bid for the Democratic nomination for governor .'], shape=(1,), dtype=string)
sentence2: tf.Tensor([b'He ran last year for the Democratic nomination for Texas governor , but lost the primary to multimillionaire Tony Sanchez .'], shape=(1,), dtype=string)
Are NOT a paraphrase
BERT raw results: tf.Tensor([1.0423813 0.0442829], shape=(2,), dtype=float32)

sentence1: tf.Tensor([b'Ohmer ruled the law warranted further review by the Missouri Supreme Court and would have caused irreparable harm had it taken effect Saturday .'], shape=(1,), dtype=string)
sentence2: tf.Tensor([b'Circuit Judge Steven Ohmer ruled Friday that the law needed a further review by the state Supreme Court and would have caused irreparable harm had it taken effect Saturday .'], shape=(1,), dtype=string)
Are a paraphrase
BERT raw results: tf.Tensor([-2.8398404  2.3117552], shape=(2,), dtype=float32)

sentence1: tf.Tensor([b'Quinn was assigned to the 2nd Squadron , 3rd Armor Cavalry Regiment .'], shape=(1,), dtype=string)
sentence2: tf.Tensor([b'Quinn was assigned to the 3rd Armored Cavalry Regiment , based in Fort Carson , Colo .'], shape=(1,), dtype=string)
Are a paraphrase
BERT raw results: tf.Tensor([0.25087687 0.58792216], shape=(2,), dtype=float32)

sentence1: tf.Tensor([b'The router will be available in the first quarter of 2004 and will cost around $ 200 , the company said .'], shape=(1,), dtype=string)
sentence2: tf.Tensor([b'Netgear prices the WGT634U Super Wireless Media Router , which will be available in the first quarter of 2004 , at under $ 200 .'], shape=(1,), dtype=string)
Are NOT a paraphrase
BERT raw results: tf.Tensor([ 1.0009788  -0.26314878], shape=(2,), dtype=float32)

sentence1: tf.Tensor([b'The indictment follows a criminal complaint filed by federal prosecutors on April 23 .'], shape=(1,), dtype=string)
sentence2: tf.Tensor([b"Monday 's three-count indictment replaces a criminal complaint filed by prosecutors on April 23 ."], shape=(1,), dtype=string)
Are a paraphrase
BERT raw results: tf.Tensor([-2.7094476  2.1998882], shape=(2,), dtype=float32)

The model judged whether the two input sentences are interchangeable.
The second pair was correctly judged to be interchangeable.

Input sentence 1

Ohmer ruled the law warranted further review by the Missouri Supreme Court and would have caused irreparable harm had it taken effect Saturday.

Input sentence 2

Circuit Judge Steven Ohmer ruled Friday that the law needed a further review by the state Supreme Court and would have caused irreparable harm had it taken effect Saturday.
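
One way to see how decisive each MRPC call was is to compare the gap between the two raw scores. The sketch below is an illustration only: the `margin` helper is a name introduced here, and treating a small gap as "the model is unsure" is an assumption rather than something the notebook reports.

```python
# MRPC raw results copied from the output above, in the order shown.
mrpc_logits = [
    (1.0423813, 0.0442829),
    (-2.8398404, 2.3117552),
    (0.25087687, 0.58792216),
    (1.0009788, -0.26314878),
    (-2.7094476, 2.1998882),
]


def margin(pair):
    """Absolute gap between the two scores; a small gap suggests low confidence."""
    return abs(pair[1] - pair[0])


# Example numbers (1-based), least decisive first.
ranked = sorted(
    range(1, len(mrpc_logits) + 1),
    key=lambda i: margin(mrpc_logits[i - 1]),
)
for i in ranked:
    print(i, round(margin(mrpc_logits[i - 1]), 3))
```

Under this reading, the second pair has the widest gap of the five, while the third pair (the Quinn sentences) has the narrowest, so its "paraphrase" verdict is a close call.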

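Conclusion

We tested the tasks that judge whether two sentences are equivalent.
I am curious which business scenarios this could be used in.

Next time, we will test the remaining tasks.
Stay tuned.

[Next] Evaluating the NLP Model BERT (8) - GLUE Benchmark (Part 6)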
