
The 18th Lucene/Solr Meetup #SolrJP @Yahoo! JAPAN BASE6: Demo Setup Steps for the Presentation

Posted at 2016-06-10

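This article documents how to build the environment used for the demonstration at the 18th Lucene/Solr Meetup.
The configuration files may need adjusting for your environment, but I hope the steps serve as a useful reference.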

ZooKeeper

Solr's Parallel SQL works only in a SolrCloud environment. Install ZooKeeper so that Solr can be started in SolrCloud mode.

# Install ZooKeeper.
$ mkdir -p ${HOME}/zookeeper
$ curl -L -o ${HOME}/zookeeper/zookeeper-3.4.6.tar.gz http://archive.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz
$ tar -C ${HOME}/zookeeper -xf ${HOME}/zookeeper/zookeeper-3.4.6.tar.gz

# Download configuration files from GitHub.
$ curl -L -o ${HOME}/zookeeper/zookeeper-3.4.6/conf/zoo.cfg https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/zookeeper/conf/zoo.cfg
$ curl -L -o ${HOME}/zookeeper/zookeeper-3.4.6/conf/zookeeper-env.sh https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/zookeeper/conf/zookeeper-env.sh

# Start ZooKeeper.
$ ${HOME}/zookeeper/zookeeper-3.4.6/bin/zkServer.sh start
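
The zoo.cfg downloaded above is not reproduced in this article. For readers who want to see what a single-node configuration involves, here is a minimal sketch; the values and the data directory are assumptions for a standalone demo, not the contents of the file on GitHub.

```shell
# Write a minimal standalone zoo.cfg into the given conf directory.
# All values are assumptions for a single-node demo, not the contents
# of the file downloaded from GitHub.
write_zoo_cfg() {
  conf_dir=$1
  data_dir=$2
  mkdir -p "$conf_dir"
  cat > "$conf_dir/zoo.cfg" <<EOF
# Basic time unit in milliseconds used for heartbeats and timeouts.
tickTime=2000
# Directory where ZooKeeper stores its snapshots.
dataDir=$data_dir
# Port that clients such as Solr connect to (localhost:2181 in this procedure).
clientPort=2181
EOF
}

# Example (hypothetical data directory):
#   write_zoo_cfg "${HOME}/zookeeper/zookeeper-3.4.6/conf" "${HOME}/zookeeper/data"
```

A standalone server needs only these three keys; ensemble settings such as initLimit and syncLimit matter once more than one ZooKeeper node is involved.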

Solr

Start SolrCloud, pointing it at the ZooKeeper instance installed above.

# Install Solr.
$ mkdir -p ${HOME}/solr
$ curl -L -o ${HOME}/solr/solr-6.1.0.tar.gz https://archive.apache.org/dist/lucene/solr/6.1.0/solr-6.1.0.tgz
$ tar -C ${HOME}/solr -xf ${HOME}/solr/solr-6.1.0.tar.gz

# Download configuration file for enabling CORS from GitHub.
$ curl -L -o ${HOME}/solr/solr-6.1.0/server/etc/webdefault.xml https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/solr/server/etc/webdefault.xml

# Create a znode in ZooKeeper for SolrCloud.
$ ${HOME}/solr/solr-6.1.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181 -cmd makepath /solr

# Start Solr in SolrCloud mode.
$ ${HOME}/solr/solr-6.1.0/bin/solr start -h localhost -p 8983 -d ${HOME}/solr/solr-6.1.0/server -z localhost:2181/solr -m 1g -s ${HOME}/solr/solr-6.1.0/server/solr -a "-Dsolr.autoCommit.maxTime=30 -Dsolr.autoSoftCommit.maxTime=10"
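
The Collections API calls later in this section only succeed once Solr is accepting requests. If the steps are run from a script rather than typed by hand, a small retry helper can guard against that race. This is a sketch; `wait_for` is a hypothetical helper name, not part of Solr.

```shell
# Retry a command until it succeeds or the attempts run out.
# Usage: wait_for ATTEMPTS INTERVAL_SECONDS COMMAND [ARGS...]
wait_for() {
  attempts=$1
  interval=$2
  shift 2
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # Discard the command's output; only its exit status matters here.
    if "$@" > /dev/null 2>&1; then
      return 0
    fi
    i=$((i + 1))
    sleep "$interval"
  done
  return 1
}

# Example: poll the Collections API for up to 30 seconds.
#   wait_for 30 1 curl -sf "http://localhost:8983/solr/admin/collections?action=LIST"
```

The `-f` option makes curl return a non-zero status on HTTP errors, so the loop keeps retrying until Solr answers successfully.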

# Create configsets for realtime_data_driven_schema_configs.
$ cp -pr ${HOME}/solr/solr-6.1.0/server/solr/configsets/data_driven_schema_configs ${HOME}/solr/solr-6.1.0/server/solr/configsets/realtime_data_driven_schema_configs
$ curl -L -o ${HOME}/solr/solr-6.1.0/server/solr/configsets/realtime_data_driven_schema_configs/conf/solrconfig.xml https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/solr/server/solr/configsets/realtime_data_driven_schema_configs/conf/solrconfig.xml

# Upload configsets for access_log.
$ ${HOME}/solr/solr-6.1.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd upconfig -confdir ${HOME}/solr/solr-6.1.0/server/solr/configsets/realtime_data_driven_schema_configs/conf -confname access_log_configs

# Create collection for access_log.
$ curl -s "http://localhost:8983/solr/admin/collections?action=CREATE&name=access_log&numShards=1&replicationFactor=1&maxShardsPerNode=1&createNodeSet=localhost:8983_solr&collection.configName=access_log_configs" | xmllint --format -
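
The same CREATE request is issued for every collection in this procedure, with only the collection name and configset name changing. The sketch below factors the URL into a function; `collection_create_url` and the `SOLR_HOST_PORT` override are hypothetical names introduced for illustration.

```shell
# Build a Collections API CREATE URL for a single-shard, single-replica
# collection on one node, mirroring the parameters used in this procedure.
# SOLR_HOST_PORT is a hypothetical override; it defaults to localhost:8983.
collection_create_url() {
  name=$1
  config=$2
  host_port=${SOLR_HOST_PORT:-localhost:8983}
  printf 'http://%s/solr/admin/collections?action=CREATE&name=%s&numShards=1&replicationFactor=1&maxShardsPerNode=1&createNodeSet=%s_solr&collection.configName=%s\n' \
    "$host_port" "$name" "$host_port" "$config"
}

# Example:
#   curl -s "$(collection_create_url access_log access_log_configs)" | xmllint --format -
```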

# Add required fields for access_log.
$ curl -L -o /tmp/access_log.json https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/solr/access_log.json
$ curl -X POST -H "Content-type:application/json" "http://localhost:8983/solr/access_log/schema" -d @/tmp/access_log.json
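
The contents of access_log.json are not shown in this article. As a rough illustration of what a Schema API request body looks like, the sketch below prints an `add-field` payload from `NAME:TYPE` arguments. The function name and the example field names and types are hypothetical, not the actual fields of the demo.

```shell
# Print a Schema API payload that adds one stored field per NAME:TYPE argument.
# The names and types are chosen by the caller; those in the example below
# are hypothetical, not the contents of access_log.json.
schema_add_fields() {
  printf '{\n  "add-field": [\n'
  first=1
  for spec in "$@"; do
    # Separate entries with a comma, but not before the first one.
    if [ "$first" -eq 0 ]; then
      printf ',\n'
    fi
    first=0
    printf '    {"name": "%s", "type": "%s", "stored": true}' \
      "${spec%%:*}" "${spec#*:}"
  done
  printf '\n  ]\n}\n'
}

# Example (hypothetical fields), piping the payload to curl on stdin:
#   schema_add_fields client_ip:string status:tint |
#     curl -X POST -H "Content-type:application/json" \
#       "http://localhost:8983/solr/access_log/schema" -d @-
```

Each type must already exist as a fieldType in the uploaded configset, otherwise Solr rejects the request.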

Flume

Install Flume to stream data such as access logs into Solr.
Flume 1.6.0 does not support Solr 6.x, so fetch the source code with Solr 6.x support from GitHub, build a package from it, and install that.

# Build Flume.
$ mkdir -p ${HOME}/git
$ git clone https://github.com/mosuka/flume.git ${HOME}/git/flume
$ mvn clean compile -DskipTests -f ${HOME}/git/flume/pom.xml
$ mvn clean install -DskipTests -f ${HOME}/git/flume/pom.xml

# Install Flume.
$ mkdir -p ${HOME}/flume
$ cp -r ${HOME}/git/flume/flume-ng-dist/target/apache-flume-1.7.0-SNAPSHOT-bin.tar.gz ${HOME}/flume/.
$ tar -C ${HOME}/flume -xf ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin.tar.gz

# Download configuration files from GitHub.
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/flume-env.sh https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/flume/conf/flume-env.sh
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/flume-conf.properties https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/flume/conf/flume-conf.properties
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/morphline.conf https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/flume/conf/morphline.conf

# Download grok dictionaries from GitHub.
$ mkdir -p ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/firewalls https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/firewalls
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/grok-patterns https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/grok-patterns
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/java https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/java
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/linux-syslog https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/linux-syslog
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/mcollective https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/mcollective
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/mcollective-patterns https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/mcollective-patterns
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/nagios https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/nagios
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/postgresql https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/postgresql
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/redis https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/redis
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries/ruby https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries/ruby
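
The ten dictionary downloads above differ only in the file name, so they can also be written as a loop. The sketch below is one way to do that; the function and variable names are hypothetical, while the base URL and file list are taken from the commands above.

```shell
# Base URL of the grok dictionaries in the kite-sdk repository.
GROK_BASE_URL=https://raw.githubusercontent.com/kite-sdk/kite/master/kite-morphlines/kite-morphlines-core/src/test/resources/grok-dictionaries
# The ten dictionary files fetched in this procedure.
GROK_DICTIONARIES="firewalls grok-patterns java linux-syslog mcollective mcollective-patterns nagios postgresql redis ruby"

# Print one "URL<TAB>destination" pair per dictionary for the given directory.
grok_download_plan() {
  dest_dir=$1
  for name in $GROK_DICTIONARIES; do
    printf '%s/%s\t%s/%s\n' "$GROK_BASE_URL" "$name" "$dest_dir" "$name"
  done
}

# Download every dictionary listed by grok_download_plan.
download_grok_dictionaries() {
  dest_dir=$1
  mkdir -p "$dest_dir"
  grok_download_plan "$dest_dir" | while IFS="$(printf '\t')" read -r url dest; do
    curl -L -o "$dest" "$url"
  done
}

# Example:
#   download_grok_dictionaries "${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/grok-dictionaries"
```

Keeping the plan separate from the download makes it possible to review the URL list without touching the network.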

# Download GeoIP database from MaxMind.
$ mkdir -p ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/geoip
$ curl -L -o ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/geoip/GeoLite2-City.mmdb.gz http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.mmdb.gz
$ gzip -d -c ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/geoip/GeoLite2-City.mmdb.gz > ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/geoip/GeoLite2-City.mmdb
$ rm ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/resources/geoip/GeoLite2-City.mmdb.gz

# Create the log file in advance.
$ touch /tmp/access.log

# Start Flume.
$ ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/bin/flume-ng agent --conf ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf --name agent --conf-file ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/conf/flume-conf.properties -Dflume.log.dir=${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/logs &
$ echo $! > ${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/flume.pid
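
With the agent running, one way to exercise the pipeline is to append a test line to /tmp/access.log. The sketch below assumes the downloaded flume-conf.properties tails that file and that morphline.conf parses the Apache combined log format; neither file is shown in this article, so treat both as assumptions. The function name and default values are hypothetical.

```shell
# Print one line in Apache combined log format stamped with the current time.
# The format is an assumption; use whatever the downloaded morphline.conf expects.
sample_access_log_line() {
  ip=${1:-203.0.113.10}      # documentation-range address (hypothetical client)
  path=${2:-/index.html}     # hypothetical request path
  # LC_ALL=C keeps the month abbreviation in English regardless of locale.
  timestamp=$(LC_ALL=C date '+%d/%b/%Y:%H:%M:%S %z')
  printf '%s - - [%s] "GET %s HTTP/1.1" 200 1024 "-" "curl/7.0"\n' \
    "$ip" "$timestamp" "$path"
}

# Example:
#   sample_access_log_line >> /tmp/access.log
```

The default address comes from a range reserved for documentation, so a GeoIP lookup is unlikely to return a location for it; pass a real public address as the first argument to see location data.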

Zeppelin

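Install Zeppelin to analyze the data indexed in Solr.
Zeppelin connects to Solr through the JDBC driver. When this procedure was first written, the released Zeppelin did not support the JDBC driver and a package had to be built from the GitHub master; the commands below install the 0.6.0 binary release.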

# Install Zeppelin.
$ mkdir -p ${HOME}/zeppelin
$ curl -L -o ${HOME}/zeppelin/zeppelin-0.6.0-bin-all.tgz https://archive.apache.org/dist/zeppelin/zeppelin-0.6.0/zeppelin-0.6.0-bin-all.tgz
$ tar -C ${HOME}/zeppelin -xf ${HOME}/zeppelin/zeppelin-0.6.0-bin-all.tgz

# Download configuration file for changing port number to 8082 from GitHub.
$ curl -L -o ${HOME}/zeppelin/zeppelin-0.6.0-bin-all/conf/zeppelin-site.xml https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/zeppelin/conf/zeppelin-site.xml

# Start Zeppelin.
$ ${HOME}/zeppelin/zeppelin-0.6.0-bin-all/bin/zeppelin-daemon.sh start

Zeppelin settings
- [shared] : Interpreter for note
- [] Connect to existing process
- Properties
  - solr.url = jdbc:solr://localhost:2181/solr?collection=access_log
  - solr.driver = org.apache.solr.client.solrj.io.sql.DriverImpl
- Dependencies
  - artifact = org.apache.solr:solr-solrj:6.1.0
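
Independently of the interpreter settings, it can help to confirm that Solr answers SQL at all. The sketch below sends a statement to the collection's /sql request handler over HTTP instead of JDBC. `solr_sql` and the `CURL` and `SOLR_BASE_URL` overrides are hypothetical names introduced here.

```shell
# Send a SQL statement to the /sql request handler of a collection.
# CURL and SOLR_BASE_URL are hypothetical overrides that default to the
# values used in this procedure.
solr_sql() {
  collection=$1
  stmt=$2
  ${CURL:-curl} -s --data-urlencode "stmt=$stmt" \
    "${SOLR_BASE_URL:-http://localhost:8983/solr}/$collection/sql?aggregationMode=facet"
}

# Example: count the documents indexed so far.
#   solr_sql access_log 'SELECT count(*) FROM access_log'
```

If this returns a result, the same statement should work from a Zeppelin paragraph once the interpreter is configured.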

Banana

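Install Banana to visualize the data indexed in Solr in real time.
Banana is normally embedded in Solr, but here it runs on a node separate from Solr (a multiple-node setup), so a separate Jetty is started and Banana is deployed there.
This requires CORS to be enabled on the Solr side (already done in the Solr installation steps above).
Banana can store its settings in Solr, but the released 1.6.0 has a bug that prevents saving settings to a remote Solr in a multiple-node environment, so build a package from the GitHub source code in which that problem is fixed.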
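
Since Banana depends on CORS being enabled in Solr, it may be worth checking for the response header before deploying. This is a minimal sketch, assuming the CORS filter in the downloaded webdefault.xml answers requests that carry an `Origin` header; `has_cors_header` is a hypothetical helper name.

```shell
# Succeed if the HTTP response headers read from stdin include a
# CORS allow-origin header (matched case-insensitively).
has_cors_header() {
  grep -qi '^Access-Control-Allow-Origin:'
}

# Example: send a request with an Origin header, as a browser loading
# Banana from port 8081 would, and inspect only the response headers.
#   curl -s -D - -o /dev/null -H "Origin: http://localhost:8081" \
#     "http://localhost:8983/solr/admin/collections?action=LIST" |
#     has_cors_header && echo "CORS is enabled"
```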

# Install Jetty.
$ mkdir -p ${HOME}/jetty
$ curl -L -o ${HOME}/jetty/jetty-distribution-9.3.8.v20160314.tar.gz http://download.eclipse.org/jetty/9.3.8.v20160314/dist/jetty-distribution-9.3.8.v20160314.tar.gz
$ tar -C ${HOME}/jetty -xf ${HOME}/jetty/jetty-distribution-9.3.8.v20160314.tar.gz

# Download configuration file for changing server port to 8081 from GitHub.
$ curl -L -o ${HOME}/jetty/jetty-distribution-9.3.8.v20160314/start.ini https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/jetty/start.ini

# Build Banana.
$ mkdir -p ${HOME}/banana
$ curl -L -o ${HOME}/banana/banana-release.zip https://github.com/lucidworks/banana/archive/release.zip
$ unzip ${HOME}/banana/banana-release.zip -d ${HOME}/banana
$ ant -f ${HOME}/banana/banana-release/build.xml -Dfinal.name=banana

# Install Banana.
$ cp ${HOME}/banana/banana-release/build/banana.war ${HOME}/jetty/jetty-distribution-9.3.8.v20160314/webapps/.

# Upload configsets for banana-int.
$ ${HOME}/solr/solr-6.1.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd upconfig -confdir ${HOME}/solr/solr-6.1.0/server/solr/configsets/data_driven_schema_configs/conf -confname banana-int_configs

# Create collection for banana-int.
$ curl -s "http://localhost:8983/solr/admin/collections?action=CREATE&name=banana-int&numShards=1&replicationFactor=1&maxShardsPerNode=1&createNodeSet=localhost:8983_solr&collection.configName=banana-int_configs" | xmllint --format -

# Add required fields for banana-int.
$ curl -L -o /tmp/banana-int.json https://raw.githubusercontent.com/mosuka/the-18th-lucene-solr-meetup/master/solr/banana-int.json
$ curl -X POST -H "Content-type:application/json" "http://localhost:8983/solr/banana-int/schema" -d @/tmp/banana-int.json

# Start Banana with Jetty.
$ ${HOME}/jetty/jetty-distribution-9.3.8.v20160314/bin/jetty.sh start

Silk

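Install Silk to visualize the data indexed in Solr in real time.
Silk can store its settings in a remote Solr, but it fails to start because the schema definition changed in Solr 6.0.0.
Fetch the source code with this problem fixed from GitHub and build it.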

# Build Silk.
$ mkdir -p ${HOME}/silk
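# Assumption: the fixed source is on the dev branch, as the silk-dev directory name suggests.
$ curl -L -o ${HOME}/silk/silk-dev.zip https://github.com/mosuka/silk/archive/dev.zip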
$ unzip ${HOME}/silk/silk-dev.zip -d ${HOME}/silk
$ cd ${HOME}/silk/silk-dev
$ npm install
$ bower install
$ grunt build --force
$ cd ${HOME}

# Upload configsets for silkconfig.
$ ${HOME}/solr/solr-6.1.0/server/scripts/cloud-scripts/zkcli.sh -zkhost localhost:2181/solr -cmd upconfig -confdir ${HOME}/silk/silk-dev/silkconfig/conf -confname silkconfig_configs

# Create collection for silkconfig.
$ curl -s "http://localhost:8983/solr/admin/collections?action=CREATE&name=silkconfig&numShards=1&replicationFactor=1&maxShardsPerNode=1&createNodeSet=localhost:8983_solr&collection.configName=silkconfig_configs" | xmllint --format -

# Start Silk.
$ node ${HOME}/silk/silk-dev/src/server/bin/kibana.js > ${HOME}/silk/silk-dev/silk.log &
$ echo $! > ${HOME}/silk/silk-dev/silk.pid
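
Flume and Silk are both started in the background with their PID written to a file, but no stop command is given for them. A minimal sketch of one, using only that PID file; `stop_by_pidfile` is a hypothetical helper name.

```shell
# Stop the process whose PID is stored in the given file, then remove the file.
# Returns non-zero if the PID file does not exist.
stop_by_pidfile() {
  pidfile=$1
  [ -f "$pidfile" ] || return 1
  pid=$(cat "$pidfile")
  # kill -0 only checks that the process exists; it sends no signal.
  if kill -0 "$pid" 2> /dev/null; then
    kill "$pid"
  fi
  rm -f "$pidfile"
}

# Examples:
#   stop_by_pidfile "${HOME}/silk/silk-dev/silk.pid"
#   stop_by_pidfile "${HOME}/flume/apache-flume-1.7.0-SNAPSHOT-bin/flume.pid"
```

The other components ship their own stop commands (`zkServer.sh stop`, `bin/solr stop -p 8983`, `zeppelin-daemon.sh stop`, `jetty.sh stop`), so the helper is only needed for these two.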
