
Setting Up a Stanford CoreNLP Server on Windows (Quick Guide)

Posted at 2020-10-08, last updated at 2020-10-08

CoreNLP Server

A detailed description of CoreNLP Server is available at the link below.
The page also includes samples.

CoreNLP Server - CoreNLP
https://stanfordnlp.github.io/CoreNLP/corenlp-server.html

Download

Several download methods are available, but the easiest is to download the file from the following URL.
http://nlp.stanford.edu/software/stanford-corenlp-latest.zip

Extract the downloaded ZIP file.

As of October 2020, it extracts to the following folder:

stanford-corenlp-4.1.0

Move the folder to the following path. Change the location to suit your preferences.

C:\usr\local\stanford-corenlp-4.1.0

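Run

In Command Prompt, change to C:\usr\local\stanford-corenlp-4.1.0 and run the java command. The options give the JVM a 4 GB heap (-mx4g), make the server listen on port 9000 (-port 9000), and set a 15-second timeout per request (-timeout 15000, in milliseconds).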

C:\>cd C:\usr\local\stanford-corenlp-4.1.0
C:\usr\local\stanford-corenlp-4.1.0>java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
[main] INFO CoreNLP - --- StanfordCoreNLPServer#main() called ---
[main] INFO CoreNLP - Server default properties:
                        (Note: unspecified annotator properties are English defaults)
                        inputFormat = text
                        outputFormat = json
                        prettyPrint = false
[main] INFO CoreNLP - Threads: 8
[main] INFO CoreNLP - Starting server...
[main] INFO CoreNLP - StanfordCoreNLPServer listening at /0:0:0:0:0:0:0:0:9000
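
Once the server reports that it is listening, it can also be exercised over plain HTTP, without the CoreNLP client library. The following is a minimal sketch, assuming the request shape described in the CoreNLP Server documentation linked above (the text goes in the POST body, and the annotators go in a URL-encoded JSON `properties` query parameter). The class name `CoreNlpHttpSketch` and its helper methods are hypothetical names chosen for illustration, not part of CoreNLP.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for illustration; not part of CoreNLP.
class CoreNlpHttpSketch {

	// Builds the request URL: the annotator list is sent as a
	// URL-encoded JSON object in the "properties" query parameter.
	static String buildUrl(String host, int port, String annotators) {
		String props = "{\"annotators\":\"" + annotators + "\",\"outputFormat\":\"json\"}";
		try {
			return host + ":" + port + "/?properties=" + URLEncoder.encode(props, "UTF-8");
		} catch (UnsupportedEncodingException e) {
			throw new IllegalStateException(e); // UTF-8 is always available
		}
	}

	// POSTs the text as the request body and returns the response body as a string.
	static String post(String url, String text) throws IOException {
		HttpURLConnection con = (HttpURLConnection) new URL(url).openConnection();
		con.setRequestMethod("POST");
		con.setDoOutput(true);
		try (OutputStream out = con.getOutputStream()) {
			out.write(text.getBytes(StandardCharsets.UTF_8));
		}
		try (InputStream in = con.getInputStream()) {
			ByteArrayOutputStream buf = new ByteArrayOutputStream();
			byte[] chunk = new byte[4096];
			int n;
			while ((n = in.read(chunk)) != -1) {
				buf.write(chunk, 0, n);
			}
			return new String(buf.toByteArray(), StandardCharsets.UTF_8);
		}
	}

	public static void main(String[] args) {
		// Assumes the server started above is running on localhost:9000.
		String url = buildUrl("http://localhost", 9000, "tokenize,ssplit,pos,lemma");
		try {
			System.out.println(post(url, "This is sample text for NLP."));
		} catch (IOException e) {
			System.err.println("Could not reach the server: " + e.getMessage());
		}
	}
}
```

If the request fails, the sketch only prints the error, which makes it a quick way to confirm that the server is actually up before writing a full client.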

Creating a Client

pom.xml

<!-- https://mvnrepository.com/artifact/edu.stanford.nlp/stanford-corenlp -->
<dependency>
	<groupId>edu.stanford.nlp</groupId>
	<artifactId>stanford-corenlp</artifactId>
	<version>4.0.0</version>
	<scope>provided</scope>
</dependency>


package nlp4j.stanford;

import java.util.List;
import java.util.Properties;

import edu.stanford.nlp.ling.CoreAnnotations.LemmaAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.PartOfSpeechAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TextAnnotation;
import edu.stanford.nlp.ling.CoreAnnotations.TokensAnnotation;
import edu.stanford.nlp.ling.CoreLabel;
import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLPClient;

public class StanfordClientSample1 {

	public static void main(String[] args) throws Exception {

		// annotators to run on the server: tokenization, sentence splitting, POS tagging, lemmatization
		Properties props = new Properties();
		props.setProperty("annotators", "tokenize, ssplit, pos, lemma");

		StanfordCoreNLPClient pipeline = new StanfordCoreNLPClient(props, "http://localhost", 9000, 2);

		// read some text in the text variable
		String text = "This is sample text for NLP.";

		// create an empty Annotation just with the given text
		Annotation document = new Annotation(text);

		// run all Annotators on this text
		pipeline.annotate(document);

		{
			List<CoreLabel> labels = document.get(TokensAnnotation.class);
			// for each token
			for (CoreLabel label : labels) {
				String str = label.get(TextAnnotation.class);
				String lex = label.get(LemmaAnnotation.class);
				String pos = label.get(PartOfSpeechAnnotation.class);
				int begin = label.beginPosition();
				int end = label.endPosition();

				System.err.println("str=" + str + ",lex=" + lex + ",pos=" + pos + ",begin=" + begin + ",end=" + end);

			} // for each token
		}

	}

}

Output

str=This,lex=this,pos=DT,begin=0,end=4
str=is,lex=be,pos=VBZ,begin=5,end=7
str=sample,lex=sample,pos=NN,begin=8,end=14
str=text,lex=text,pos=NN,begin=15,end=19
str=for,lex=for,pos=IN,begin=20,end=23
str=NLP,lex=nlp,pos=NN,begin=24,end=27
str=.,lex=.,pos=.,begin=27,end=28
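
The begin and end values are character offsets into the original text, with end exclusive, so each token can be recovered with substring. Here is a minimal sketch that uses only the offsets printed above; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class for illustration.
class OffsetSketch {

	// Returns the substring for each {begin, end} pair; end is exclusive.
	static List<String> slice(String text, int[][] spans) {
		List<String> tokens = new ArrayList<>();
		for (int[] span : spans) {
			tokens.add(text.substring(span[0], span[1]));
		}
		return tokens;
	}

	public static void main(String[] args) {
		String text = "This is sample text for NLP.";
		// begin/end pairs copied from the output above
		int[][] spans = { { 0, 4 }, { 5, 7 }, { 8, 14 }, { 15, 19 }, { 20, 23 }, { 24, 27 }, { 27, 28 } };
		System.out.println(slice(text, spans)); // prints [This, is, sample, text, for, NLP, .]
	}
}
```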

Summary

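Starting Stanford CoreNLP inside a Java process is slow because the models have to be loaded, but with CoreNLP Server the second and subsequent calls become much faster.
Setting up the server is easy once you get used to it, so I recommend it.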
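
To check the difference on your own machine, you can time consecutive calls. The sketch below is a generic timing helper with hypothetical names. The Runnable in main is only a stand-in; for a real measurement you would replace it with a call such as pipeline.annotate(document) from the client sample above. No timings are claimed here.

```java
// Hypothetical helper class for illustration.
class CallTimerSketch {

	// Runs the call n times and returns the elapsed milliseconds of each run.
	static long[] timeCalls(Runnable call, int n) {
		long[] elapsed = new long[n];
		for (int i = 0; i < n; i++) {
			long start = System.nanoTime();
			call.run();
			elapsed[i] = (System.nanoTime() - start) / 1_000_000;
		}
		return elapsed;
	}

	public static void main(String[] args) {
		// Stand-in workload; replace with the real annotate call.
		Runnable call = () -> "This is sample text for NLP.".split(" ");
		long[] elapsed = timeCalls(call, 3);
		for (int i = 0; i < elapsed.length; i++) {
			System.out.println("call " + (i + 1) + ": " + elapsed[i] + " ms");
		}
	}
}
```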

That's all.
