More than 1 year has passed since last update.

Quick and Easy Named Entity Recognition

Introduction

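This article performs named entity recognition (NER) with the Python library spaCy.
It uses en_core_web_trf, the RoBERTa-based model that is the most accurate of spaCy's models.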

Implementation

Install

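# Install the en_core_web_trf model (notebook syntax; drop the leading "!" in a shell)
!pip install https://huggingface.co/spacy/en_core_web_trf/resolve/main/en_core_web_trf-any-py3-none-any.whl

# Install spacy-transformers; -f adds the PyTorch wheel index as an extra source for torch
!pip install spacy-transformers -f https://download.pytorch.org/whl/torch_stable.html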
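
Before loading the model, it can help to confirm that the installs above actually succeeded. The following is a minimal sketch using only the standard library; the helper name missing_packages and the REQUIRED list are illustrative assumptions, with the module names inferred from the install commands.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Module names assumed from the install commands above (hypothetical list).
REQUIRED = ["spacy", "spacy_transformers", "en_core_web_trf"]

missing = missing_packages(REQUIRED)
if missing:
    print("Not installed:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as not installed, rerunning the corresponding pip command is the first thing to try.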

Load

# Using spacy.load().
import spacy
import spacy_transformers

# Load the installed en_core_web_trf pipeline
nlp = spacy.load("en_core_web_trf")

Use

This part is taken from the spaCy website.

# Process whole documents
text = ("When Sebastian Thrun started working on self-driving cars at "
        "Google in 2007, few people outside of the company took him "
        "seriously. “I can tell you very senior CEOs of major American "
        "car companies would shake my hand and turn away because I wasn’t "
        "worth talking to,” said Thrun, in an interview with Recode earlier "
        "this week.")
doc = nlp(text)

# Analyze syntax
print("Noun phrases:", [chunk.text for chunk in doc.noun_chunks])
print("Verbs:", [token.lemma_ for token in doc if token.pos_ == "VERB"])

# Find named entities, phrases and concepts
for entity in doc.ents:
    print(entity.text, entity.label_)
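
The loop above prints one entity per line. For downstream use it is often more convenient to group the entity texts by label. Here is a small sketch of that idea; group_entities and entities_by_label are hypothetical helpers, not part of spaCy, and the sample pairs are hand-written for illustration rather than actual model output.

```python
from collections import defaultdict


def group_entities(pairs):
    """Group (text, label) pairs into {label: [unique texts, first-seen order]}."""
    grouped = defaultdict(list)
    for text, label in pairs:
        if text not in grouped[label]:
            grouped[label].append(text)
    return dict(grouped)


def entities_by_label(doc):
    """Apply group_entities to a spaCy Doc, using only doc.ents, .text and .label_."""
    return group_entities((ent.text, ent.label_) for ent in doc.ents)


# Hand-written pairs for illustration only; not actual model output.
sample = [
    ("Sebastian Thrun", "PERSON"),
    ("Google", "ORG"),
    ("2007", "DATE"),
    ("Google", "ORG"),
]
print(group_entities(sample))
```

With the doc created above, calling entities_by_label(doc) would return the same kind of dictionary built from the model's own predictions.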

Thoughts

  • The results vary with the length of the input text.
  • Sometimes, even when a long text is given as input, nothing is returned. This is a problem.
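
One way to probe the length sensitivity noted above is to split a long input into shorter pieces and run the pipeline on each piece separately. The sketch below is only an experiment aid: whether it resolves the empty-result case has not been verified here. The helpers chunk_text and entities_per_chunk and the 500-character default are assumptions made for illustration; nlp.pipe is spaCy's standard batch-processing method.

```python
def chunk_text(text, max_chars=500):
    """Split text on whitespace into chunks of at most max_chars characters.

    A single word longer than max_chars is kept whole as its own chunk.
    """
    chunks, current = [], ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars or not current:
            current = candidate
        else:
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


def entities_per_chunk(nlp, text, max_chars=500):
    """Run a loaded spaCy pipeline on each chunk; return (text, label) pairs per chunk."""
    results = []
    for doc in nlp.pipe(chunk_text(text, max_chars)):
        results.append([(ent.text, ent.label_) for ent in doc.ents])
    return results
```

Comparing entities_per_chunk(nlp, text) against the entities from a single nlp(text) call shows whether the chunked run finds entities that the full-length run misses. Note that splitting on whitespace can cut a sentence in half, so splitting on sentence boundaries may give cleaner results.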