Trying document summarization with a BERT-based model in a few lines of code

Posted at 2022-11-29

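Text summarization with pipeline

As introduced in the previous article, the Transformers pipeline lets you implement NLP tasks in just a few lines of code.
This time I describe what I found when trying it on text summarization.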

Benefits

A BERT-based model can run summarization inference in just a few lines of code.

Usage requirements

Specifying the task name is all it takes to run the task.

Trying it out

1. The text to summarize

# Text to summarize
text_to_summarize='''
It has earned the moniker "bin chicken" for its propensity to scavenge food from anywhere it can - messily raiding garbage and often stealing food right out of people's hands.
But the native bird may have figured out how to overhaul its bad reputation.
It has developed an "ingenious" method of eating one of the only animals Australians hate more - the cane toad, a toxic and pervasive pest.
First introduced to Australia in the 1930s, cane toads have no natural predators in the country and have wrought havoc on native animal populations.
The toad's skin contains venom which it releases when threatened, causing most animals that come into contact with it to die quickly of a heart attack.
Hence Emily Vincent's surprise when members of the community started sending her pictures and videos of ibis "playing" with the amphibians.
Ms Vincent, who runs the invasive species programmes at environment charity Watergum, says the behaviour has been reported up and down Australia's east coast.
"Ibis were flipping the toads about, throwing them in the air, and people just wondered what on earth they were doing," she told the BBC.
"After this they would always either wipe the toads in the wet grass, or they would go down to a water source nearby, and they would rinse the toads out."
She believes it is evidence of a "stress, wash and repeat" method that the birds have developed to rid the toads of their toxins before swallowing them whole.
"It really is quite amusing."
It isn't the first time birds have been spotted eating cane toads, Macquarie University Professor Rick Shine told the BBC.
They seem to be less susceptible to the poison than other animals, like snakes, mammals or crocodiles.
But they can still die from too much of it and it tastes "awful", Prof Shine says.
So as the species spread across Australia, birds like hawks and crows rather quickly figured out how eat around the poison glands on their shoulder.
They would flip the toads on their back and rip out their insides, leaving the glands untouched.
But this is the first time Prof Shine - who has studied toads for 20 years - has heard of birds using a method like this to eat them whole.
"Ibis do get an unfair reputation... [but] this demonstrates that these are clever birds," Ms Vincent says.
"They've actually forced the cane toad to get rid of the toxin itself, they haven't had to mutilate it in any way. The cane toad is doing all the work for them."
'''
# Import the required libraries
# pip install torch torchvision
# pip install sentencepiece
from transformers import pipeline

# Inference code
summarizer = pipeline('summarization')
print(summarizer(text_to_summarize))

The text was summarized to about 310 characters. The model seems to have picked out the important sentences.

Output
[{'summary_text': ' Bird has developed an \'ingenious\' method of eating cane toads whole . The birds flip the toads on their back and rip out their insides, leaving the glands untouched . The method is evidence of a "stress, wash and repeat" method that the birds have developed to rid toads of toxins before swallowing them whole .'}]
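
As the output above shows, the pipeline returns a list with one dict per input, and the summary sits under the `summary_text` key. Below is a minimal sketch of pulling the text out and tidying the stray space the model leaves before punctuation. The helper name `extract_summary` and the clean-up rule are my own illustration, not part of Transformers.

```python
import re


def extract_summary(result):
    """Return the cleaned summary string from a summarization pipeline result.

    `result` is assumed to have the shape shown above:
    a list of dicts, each with a 'summary_text' key.
    """
    text = result[0]["summary_text"]
    # Drop the whitespace that appears before punctuation (e.g. "whole ." -> "whole.").
    text = re.sub(r"\s+([.,!?])", r"\1", text)
    return text.strip()


# First sentence of the real output, kept in the same list-of-dicts shape
sample = [{"summary_text": " Bird has developed an 'ingenious' method of eating cane toads whole ."}]
print(extract_summary(sample))
```

With the real pipeline the call would look like `extract_summary(summarizer(text_to_summarize))`.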

2. Specifying max_length and min_length seems to let you control the length of the summary.

print(summarizer(text_to_summarize, max_length=100, min_length=50, do_sample=False))

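The text was summarized to about 290 characters, which is well above the number passed as max_length. This is expected, because max_length and min_length are counted in tokens, not characters.

Output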
[{'summary_text': ' Bird has developed an \'ingenious\' method of eating cane toads whole . The birds flip the toads on their back and rip out their insides, leaving the glands untouched . The method is evidence of a "stress, wash and repeat" method to rid toads of their toxins before swallowing them whole .'}]
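
To see why the character count can be far larger than max_length, it helps to compare characters with token-like units. The sketch below uses only the summary text from the output above. Splitting on whitespace is an assumption made for illustration: it is only a rough stand-in for the model's subword tokenizer, so the real token count will differ somewhat.

```python
# Summary text copied from the output above (max_length=100, min_length=50)
summary = (
    " Bird has developed an 'ingenious' method of eating cane toads whole ."
    " The birds flip the toads on their back and rip out their insides,"
    " leaving the glands untouched ."
    ' The method is evidence of a "stress, wash and repeat" method to rid'
    " toads of their toxins before swallowing them whole ."
)

n_chars = len(summary)
# Whitespace-separated pieces: a rough proxy for tokens, not the real tokenizer.
n_words = len(summary.split())

print(f"characters: {n_chars}")
print(f"whitespace-separated pieces: {n_words}")
```

The character count lands far above 100 while the number of pieces stays below it, which matches a length limit measured in tokens rather than characters.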

3. Incidentally, if you want to use a local model

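# AutoTokenizer and pipeline both come from transformers
from transformers import AutoTokenizer, pipeline
tokenizer = AutoTokenizer.from_pretrained('path/to/local/model')
summarizer = pipeline("summarization", model='path/to/local/model', tokenizer=tokenizer)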
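
The local directory has to contain model files before it can be loaded. Here is a hedged sketch of one way to create and then load such a directory. The function names are hypothetical helpers of mine, and checking for `config.json` is only a heuristic based on the file that `save_pretrained` writes. The `transformers` imports are placed inside the functions so that the directory check can be used on its own.

```python
from pathlib import Path


def looks_like_model_dir(path):
    """Return True if `path` is a directory containing a config.json file."""
    model_dir = Path(path)
    return model_dir.is_dir() and (model_dir / "config.json").is_file()


def save_summarizer(model_name, path):
    """Download `model_name` once and write its model and tokenizer files to `path`."""
    from transformers import pipeline  # lazy import; requires the transformers package

    summarizer = pipeline("summarization", model=model_name)
    summarizer.save_pretrained(path)


def load_local_summarizer(path):
    """Build a summarization pipeline from the files stored in `path`."""
    if not looks_like_model_dir(path):
        raise FileNotFoundError(f"no config.json found in {path!r}")
    from transformers import AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(path)
    return pipeline("summarization", model=path, tokenizer=tokenizer)
```

For example, `save_summarizer("sshleifer/distilbart-cnn-12-6", "./local_model")` followed by `load_local_summarizer("./local_model")`. The checkpoint name is an assumption: as far as I know it is what the pipeline falls back to when no model is given, so substitute whichever model you actually use.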

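Impressions

My impression is that it is very easy to use! If you just want a quick feel for how well a model performs, there is no need to write a lot of code.

Related articles

The news article that was summarized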
