
More than 5 years have passed since last update.

Notes on CoreNLP, from download to first use


What is CoreNLP?

CoreNLP is a free toolkit published at
http://nlp.stanford.edu/software/corenlp.shtml
It bundles tokenization, sentence splitting, POS tagging, lemmatization, NER, syntactic parsing, and coreference resolution in a single package.
Its coreference resolution can fairly be called state of the art at the time of writing.

Download

wget http://nlp.stanford.edu/software/stanford-corenlp-2012-05-22.tgz

Extract

tar xvfz stanford-corenlp-2012-05-22.tgz

Just as I was about to use it, it complained that the POS tagger model was missing. Fine. Unpack the models jar as well:

jar -xvf stanford-corenlp-2012-05-22-models.jar
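Unpacking works because the model files end up on disk where CoreNLP can find them. An alternative, which I believe matches the official instructions, is to leave the models jar packed and add it to the classpath when invoking java. A minimal sketch, assuming all four jars sit in the current directory; the function name corenlp_classpath is made up for this memo:

```shell
# Print a classpath that includes the models jar instead of unpacking it.
# Assumption: every jar is in the current directory.
# The release date defaults to the one used in this memo.
corenlp_classpath() {
  v=${1:-2012-05-22}
  printf '%s\n' "stanford-corenlp-$v.jar:stanford-corenlp-$v-models.jar:xom.jar:joda-time.jar"
}

# Show the resulting classpath string.
corenlp_classpath
```

The printed string can be passed as java -cp "$(corenlp_classpath)" in place of a hand-written classpath.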

Usage

java -cp stanford-corenlp-2012-05-22.jar:xom.jar:joda-time.jar -Xmx3g edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref -file input.txt

-annotators tokenize,ssplit,pos,lemma,ner,parse,dcoref
is optional. It controls how far through the pipeline to run; by default every annotator runs.
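The "how far to run" idea can be sketched as a small shell helper that builds the annotator list up to a chosen stage. The function name annotators_up_to is hypothetical, and the stage order is simply copied from the command above, where each annotator follows the ones it depends on:

```shell
# Print the comma-separated annotator list up to and including the given stage.
# The stage order mirrors the -annotators value shown in this memo.
annotators_up_to() {
  target=$1
  list=
  for a in tokenize ssplit pos lemma ner parse dcoref; do
    # Append with a comma separator unless the list is still empty.
    list=${list:+$list,}$a
    if [ "$a" = "$target" ]; then
      printf '%s\n' "$list"
      return 0
    fi
  done
  echo "unknown annotator: $target" >&2
  return 1
}

# Example: stop after lemmatization.
annotators_up_to lemma
```

The result can be substituted into the command line as -annotators "$(annotators_up_to lemma)". Stopping early skips the later stages entirely, so it should also run faster.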

The output is written to input.txt.xml.
It took about 10 seconds to produce. A somewhat longer text still finished within 30 seconds.
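For a quick look at the result without an XML parser, something like the following can pull out word and POS pairs. This is a rough sketch that assumes each token carries a <word> and a <POS> element, each on its own line and in that order; the sample file below is hand-written for illustration and is not actual CoreNLP output. The function name extract_word_pos is hypothetical.

```shell
# Print one "word<TAB>POS" line per token.
# Assumes <word> precedes <POS> in every token and each sits on its own line.
# XML entities such as &amp; are not decoded.
extract_word_pos() {
  sed -n \
    -e 's:.*<word>\(.*\)</word>.*:\1:p' \
    -e 's:.*<POS>\(.*\)</POS>.*:\1:p' "$1" | paste - -
}

# Hand-written sample in the assumed layout.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<token id="1">
  <word>Stanford</word>
  <lemma>Stanford</lemma>
  <POS>NNP</POS>
</token>
<token id="2">
  <word>rocks</word>
  <lemma>rock</lemma>
  <POS>VBZ</POS>
</token>
EOF
extract_word_pos "$sample"
rm -f "$sample"
```

For anything beyond a quick check, a real XML parser is the safer choice, since this line-based approach breaks as soon as the layout differs.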
