Building Elasticsearch with Docker

Posted at 2021-07-03

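Introduction

I usually do applied AI research aimed at the civil engineering field, but the section next door came to me with a request:
"We have tens of millions of text files, and there are so many that searching them is slow. Can you do something about it?"

I used Hadoop and HDFS in past research, but that work targeted GIS data, and I think of Hadoop as a batch-processing system in the first place. So I looked for something specialized for full-text search and came across Elasticsearch (rather late to the party, I know).

To try it out, I will start by setting up Elasticsearch.
However, I have completely forgotten how to write docker-compose files, so I will split the setup across several posts... (memory fades as you get older...)

So this post only goes as far as building the Elasticsearch Docker image.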

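Sites I referred to

I referred to the following sites for this setup. Many thanks.

初めてのElasticsearch with Docker ("My first Elasticsearch with Docker")
"Getting started with Elasticsearch" from the Elasticsearch documentation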

Build environment

CPU: Intel(R) Core(TM) i7-9700K CPU @ 3.60GHz
Memory: 32GB
OS: Ubuntu 18.04.4 LTS (Bionic Beaver)
Docker: Docker version 20.10.5, build 55c4c88
docker-compose: version 1.16.1, build 6d1ac21

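Setting up Elasticsearch

Building the Elasticsearch Docker image

First things first: write the Dockerfile.
According to the elasticsearch page on DockerHub, the latest version at the time of writing is 7.13.2, so line 1 of the Dockerfile pulls the 7.13.2 Docker image.
Since I work with Japanese text, line 2 installs the kuromoji plugin for Japanese language support.
I also want to handle recent vocabulary, so line 3 installs the NEologd plugin.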

FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
RUN elasticsearch-plugin install analysis-kuromoji
RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0

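(Addendum begins)
The Dockerfile above originally included the line "RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0". I later removed that line because it caused a Java error at docker-compose up, which is covered in the next post. The Dockerfile and build log in this post still show the original three-line version, and the error output is appended at the end of this post.
(Addendum ends)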
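
For reference, the Dockerfile after removing the NEologd line would presumably look like the two-line version below. This is only a sketch: it writes the file with a shell heredoc, and the directory name es-kuromoji is a hypothetical choice, not something from the original setup.

```shell
# Write the reduced Dockerfile (NEologd line removed) into a build directory.
# "es-kuromoji" is a hypothetical directory name used only for this sketch.
build_dir="es-kuromoji"
mkdir -p "$build_dir"

# Quoted 'EOF' keeps the heredoc body literal (no variable expansion).
cat > "$build_dir/Dockerfile" <<'EOF'
FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
RUN elasticsearch-plugin install analysis-kuromoji
EOF
```

With that in place, docker build can be pointed at the directory (for example, docker build es-kuromoji) instead of the current directory.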

With the Dockerfile ready, run docker build.

$ docker build -f ./Dockerfile .
Sending build context to Docker daemon  4.096kB
Step 1/3 : FROM docker.elastic.co/elasticsearch/elasticsearch:7.13.2
7.13.2: Pulling from elasticsearch/elasticsearch
ddf49b9115d7: Pull complete
815a15889ec1: Pull complete
ba5d33fc5cc5: Pull complete
976d4f887b1a: Pull complete
9b5ee4563932: Pull complete
ef11e8f17d0c: Pull complete
3c5ad4db1e24: Pull complete
Digest: sha256:1cecc2c7419a4f917a88c83180335bd491d623f28ac43ca7e0e69b4eca25fbd5
Status: Downloaded newer image for docker.elastic.co/elasticsearch/elasticsearch:7.13.2
 ---> 11a830014f7c
Step 2/3 : RUN elasticsearch-plugin install analysis-kuromoji
 ---> Running in 0cdc1f7f3102
-> Installing analysis-kuromoji
-> Downloading analysis-kuromoji from elastic
[=================================================] 100%
-> Installed analysis-kuromoji
-> Please restart Elasticsearch to activate any plugins installed
Removing intermediate container 0cdc1f7f3102
 ---> 144040a82003
Step 3/3 : RUN elasticsearch-plugin install org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0
 ---> Running in 71997e0aca6e
-> Installing org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0
-> Downloading org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0 from maven central
[=================================================] 100%
Warning: sha512 not found, falling back to sha1. This behavior is deprecated and will be removed in a future release. Please update the plugin to use a sha512 checksum.
-> Installed analysis-kuromoji-ipadic-neologd
-> Please restart Elasticsearch to activate any plugins installed
Removing intermediate container 71997e0aca6e
 ---> f53988aa2593
Successfully built f53988aa2593

Judging from the log, everything looks fine. Still, let's double-check with docker images.

$ docker images
REPOSITORY                                      TAG                             IMAGE ID       CREATED          SIZE
docker.elastic.co/elasticsearch/elasticsearch   7.13.2                          11a830014f7c   3 weeks ago      1.02GB
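
Because the build above was run without -t, the resulting image has no repository name or tag of its own, which makes it harder to spot in the docker images output. The sketch below tags the image at build time and then lists the installed plugins inside it. The tag es-kuromoji:7.13.2 and the function name build_and_check are hypothetical, and the last command assumes the image's entrypoint passes through commands other than its default.

```shell
# Hypothetical tag for the custom image; any name works.
IMAGE_TAG="es-kuromoji:7.13.2"

# Build with a tag, show only that image, then list its installed plugins.
# Assumption: the Elasticsearch image runs a non-default command as given.
build_and_check() {
  docker build -t "$IMAGE_TAG" -f ./Dockerfile . &&
    docker images "$IMAGE_TAG" &&
    docker run --rm "$IMAGE_TAG" elasticsearch-plugin list
}
```

Calling build_and_check from the directory that contains the Dockerfile runs the three steps in order and stops at the first failure.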

Next, I plan to write docker-compose.yml and get Elasticsearch up and running.

(Addendum begins)
Here is the Java error mentioned in the addendum above.

es01    | "stacktrace": ["java.lang.NoSuchMethodError: 'void org.elasticsearch.index.analysis.AbstractTokenizerFactory.<init>(org.elasticsearch.index.IndexSettings, org.elasticsearch.common.settings.Settings)'",
es01    | "at org.codelibs.elasticsearch.kuromoji.ipadic.neologd.index.analysis.KuromojiTokenizerFactory.<init>(KuromojiTokenizerFactory.java:50) ~[?:?]",
es01    | "at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:433) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:275) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:203) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:431) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:663) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:566) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.validateTemplate(MetadataIndexTemplateService.java:1288) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.access$300(MetadataIndexTemplateService.java:83) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService$6.execute(MetadataIndexTemplateService.java:775) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:48) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:691) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:313) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:208) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:62) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:140) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:139) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:177) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204) ~[elasticsearch-7.13.2.jar:7.13.2]",
es01    | "at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) ~[?:?]",
es01    | "at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) ~[?:?]",
es01    | "at java.lang.Thread.run(Thread.java:831) [?:?]"] }
es01    | fatal error in thread [elasticsearch[es01][masterService#updateTask][T#1]], exiting
es01    | java.lang.NoSuchMethodError: 'void org.elasticsearch.index.analysis.AbstractTokenizerFactory.<init>(org.elasticsearch.index.IndexSettings, org.elasticsearch.common.settings.Settings)'
es01    |       at org.codelibs.elasticsearch.kuromoji.ipadic.neologd.index.analysis.KuromojiTokenizerFactory.<init>(KuromojiTokenizerFactory.java:50)
es01    |       at org.elasticsearch.index.analysis.AnalysisRegistry.buildMapping(AnalysisRegistry.java:433)
es01    |       at org.elasticsearch.index.analysis.AnalysisRegistry.buildTokenizerFactories(AnalysisRegistry.java:275)
es01    |       at org.elasticsearch.index.analysis.AnalysisRegistry.build(AnalysisRegistry.java:203)
es01    |       at org.elasticsearch.index.IndexModule.newIndexService(IndexModule.java:431)
es01    |       at org.elasticsearch.indices.IndicesService.createIndexService(IndicesService.java:663)
es01    |       at org.elasticsearch.indices.IndicesService.createIndex(IndicesService.java:566)
es01    |       at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.validateTemplate(MetadataIndexTemplateService.java:1288)
es01    |       at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService.access$300(MetadataIndexTemplateService.java:83)
es01    |       at org.elasticsearch.cluster.metadata.MetadataIndexTemplateService$6.execute(MetadataIndexTemplateService.java:775)
es01    |       at org.elasticsearch.cluster.ClusterStateUpdateTask.execute(ClusterStateUpdateTask.java:48)
es01    |       at org.elasticsearch.cluster.service.MasterService.executeTasks(MasterService.java:691)
es01    |       at org.elasticsearch.cluster.service.MasterService.calculateTaskOutputs(MasterService.java:313)
es01    |       at org.elasticsearch.cluster.service.MasterService.runTasks(MasterService.java:208)
es01    |       at org.elasticsearch.cluster.service.MasterService.access$000(MasterService.java:62)
es01    |       at org.elasticsearch.cluster.service.MasterService$Batcher.run(MasterService.java:140)
es01    |       at org.elasticsearch.cluster.service.TaskBatcher.runIfNotProcessed(TaskBatcher.java:139)
es01    |       at org.elasticsearch.cluster.service.TaskBatcher$BatchedTask.run(TaskBatcher.java:177)
es01    |       at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingRunnable.run(ThreadContext.java:673)
es01    |       at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.runAndClean(PrioritizedEsThreadPoolExecutor.java:241)
es01    |       at org.elasticsearch.common.util.concurrent.PrioritizedEsThreadPoolExecutor$TieBreakingPrioritizedRunnable.run(PrioritizedEsThreadPoolExecutor.java:204)
es01    |       at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
es01    |       at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
es01    |       at java.base/java.lang.Thread.run(Thread.java:831)
es01 exited with code 1

(Addendum ends)
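
The NoSuchMethodError in the trace is raised from the NEologd plugin's KuromojiTokenizerFactory when it calls an AbstractTokenizerFactory constructor that Elasticsearch 7.13.2 does not provide with that signature. That suggests the 7.1.0 plugin build was compiled against a different Elasticsearch API. As a rough guard, the sketch below compares the version suffix of a Maven-style plugin coordinate with the Elasticsearch version before it goes into a Dockerfile. The function names are hypothetical, and requiring an exact version match is an assumption of this sketch, not a documented rule of the plugin.

```shell
# Print the version part of a "group:artifact:version" coordinate.
plugin_version() {
  printf '%s\n' "${1##*:}"
}

# Compare the plugin version with the Elasticsearch version.
# Assumption: only an exact match is treated as compatible.
check_plugin_version() {
  es_version="$1"
  plugin_coord="$2"
  found="$(plugin_version "$plugin_coord")"
  if [ "$found" = "$es_version" ]; then
    echo "ok: plugin version matches Elasticsearch $es_version"
  else
    echo "mismatch: plugin $found vs Elasticsearch $es_version"
    return 1
  fi
}
```

For the combination used in this post, check_plugin_version 7.13.2 org.codelibs:elasticsearch-analysis-kuromoji-ipadic-neologd:7.1.0 reports a mismatch and returns a non-zero status. Whether a plugin release matching 7.13.2 exists is not confirmed here.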
