I tried building a Hadoop cluster.
Following the article below, I built five single-node machines and made sure they could resolve each other's names via hosts/DNS (a minimal /etc/hosts example follows the notes below), then built the cluster on top of them.
→ Applies to all five machines
→ Not required on the worker nodes (if you skip it there, some of the steps below need to be adapted)
→ The article builds single-node Hadoop/Hive with ansible
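For reference, a minimal /etc/hosts sketch for the name-resolution prerequisite (the IP addresses are placeholders for this example; replace them with your actual addresses, or use DNS instead):
# /etc/hosts (example addresses only — replace with your own)
192.168.1.11 master1
192.168.1.12 master2
192.168.1.13 master3
192.168.1.21 worker1
192.168.1.22 worker2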
Configuration and roles
master1 : ZooKeeper / JournalNode / NameNode(nn1) / ZKFC
JobHistoryServer / ATSv2 / Hive / PostgreSQL
master2 : ZooKeeper / JournalNode / NameNode(nn2) / ZKFC
ResourceManager(rm1)
master3 : ZooKeeper / JournalNode
ResourceManager(rm2)
worker1 : DataNode / NodeManager
worker2 : DataNode / NodeManager
- Hadoop : 3.3.6 (Bigtop)
- Java : 8
- User : hadoop
- HDFS nameservice : cluster1
- YARN cluster-id : ycluster
- OS : Ubuntu 24.04
1. Preliminary setup (common to all nodes)
Run the following on all of master1, master2, master3, worker1, and worker2.
1-1. Stop all services
sudo systemctl stop zookeeper \
hadoop-hdfs-namenode hadoop-hdfs-datanode \
hadoop-hdfs-journalnode hadoop-hdfs-zkfc \
hadoop-yarn-resourcemanager hadoop-yarn-nodemanager \
hadoop-mapreduce-historyserver hadoop-yarn-timelineserver \
hive-metastore hiveserver2 postgresql \
2>/dev/null || true
sudo -u hadoop jps
→ Confirm that no Java processes are listed (only Jps itself should appear).
1-2. Create persistent HDFS directories
sudo mkdir -p /var/lib/hadoop-hdfs/namenode
sudo mkdir -p /var/lib/hadoop-hdfs/datanode
sudo mkdir -p /var/lib/hadoop-hdfs/journalnode
sudo chown -R hadoop:hadoop /var/lib/hadoop-hdfs
sudo chmod 700 /var/lib/hadoop-hdfs/journalnode
1-3. Create the Hadoop tmp directory
sudo mkdir -p /var/lib/hadoop/tmp
sudo chown -R hadoop:hadoop /var/lib/hadoop/tmp
2. Create and distribute SSH keys (master1 and master2 only)
2-1. Create a key on master1 → distribute it to master2
master1:
sudo -u hadoop ssh-keygen -t rsa -N "" -f /home/hadoop/.ssh/id_rsa
sudo -u hadoop cat /home/hadoop/.ssh/id_rsa.pub
→ Copy the printed public key somewhere.
master2:
sudo -u hadoop mkdir -p /home/hadoop/.ssh
sudo -u hadoop chmod 700 /home/hadoop/.ssh
sudo -u hadoop touch /home/hadoop/.ssh/authorized_keys
sudo -u hadoop chmod 600 /home/hadoop/.ssh/authorized_keys
sudo -u hadoop vi /home/hadoop/.ssh/authorized_keys
→ Paste in the public key you copied.
master1:
sudo -u hadoop ssh master2 hostname
→ It should return "master2" without prompting for a password.
2-2. Create a key on master2 as well → distribute it to master1
master2:
sudo -u hadoop ssh-keygen -t rsa -N "" -f /home/hadoop/.ssh/id_rsa
sudo -u hadoop cat /home/hadoop/.ssh/id_rsa.pub
→ Copy the printed public key somewhere.
master1:
sudo -u hadoop mkdir -p /home/hadoop/.ssh
sudo -u hadoop chmod 700 /home/hadoop/.ssh
sudo -u hadoop touch /home/hadoop/.ssh/authorized_keys
sudo -u hadoop chmod 600 /home/hadoop/.ssh/authorized_keys
sudo -u hadoop vi /home/hadoop/.ssh/authorized_keys
→ Paste in the public key you copied.
master2:
sudo -u hadoop ssh master1 hostname
→ It should return "master1" without prompting for a password.
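As an aside, if password authentication happens to be enabled for the hadoop user, the same key exchange can be done with ssh-copy-id instead of editing authorized_keys by hand (a sketch only; the manual procedure above works regardless):
# on master1 (run the mirror-image command on master2)
sudo -u hadoop ssh-copy-id -i /home/hadoop/.ssh/id_rsa.pub hadoop@master2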
3. ZooKeeper (master1/master2/master3)
Run the following on master1/master2/master3 only.
3-1. Install
sudo apt update
sudo apt install -y zookeeperd
3-2. myid
# master1
echo 1 | sudo tee /etc/zookeeper/conf/myid
# master2
echo 2 | sudo tee /etc/zookeeper/conf/myid
# master3
echo 3 | sudo tee /etc/zookeeper/conf/myid
3-3. Edit zoo.cfg
sudo tee /etc/zookeeper/conf/zoo.cfg << 'EOF'
tickTime=2000
initLimit=10
syncLimit=5
dataDir=/var/lib/zookeeper
clientPort=2181
4lw.commands.whitelist=ruok,stat,conf,isro
server.1=master1:2888:3888
server.2=master2:2888:3888
server.3=master3:2888:3888
EOF
3-4. Start and verify
sudo systemctl restart zookeeper
echo ruok | nc localhost 2181
→ The response should be "imok".
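Because stat is included in 4lw.commands.whitelist above, you can also check which node is currently the leader (expect one leader and two followers across the three masters):
echo stat | nc localhost 2181 | grep Mode
→ "Mode: leader" on one node, "Mode: follower" on the other two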
4. Distribute identical Hadoop configuration files to all nodes
Overwrite the following five files with tee, using identical contents on every node (a checksum check for confirming they match is shown after the list).
- /etc/hadoop/conf/core-site.xml
- /etc/hadoop/conf/hdfs-site.xml
- /etc/hadoop/conf/yarn-site.xml
- /etc/hadoop/conf/mapred-site.xml
- /etc/hadoop/conf/workers
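After distributing them, one way to confirm the five copies really are identical is to compare checksums; run the same command on every node and compare the output:
md5sum /etc/hadoop/conf/core-site.xml /etc/hadoop/conf/hdfs-site.xml \
/etc/hadoop/conf/yarn-site.xml /etc/hadoop/conf/mapred-site.xml \
/etc/hadoop/conf/workers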
4-1. core-site.xml (all nodes)
sudo tee /etc/hadoop/conf/core-site.xml << 'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>master1:2181,master2:2181,master3:2181</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/lib/hadoop/tmp</value>
</property>
</configuration>
EOF
4-2. hdfs-site.xml (all nodes)
sudo tee /etc/hadoop/conf/hdfs-site.xml << 'EOF'
<configuration>
<property>
<name>dfs.nameservices</name>
<value>cluster1</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.nn1</name>
<value>master1:8020</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.nn2</name>
<value>master2:8020</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.nn1</name>
<value>master1:9870</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.nn2</name>
<value>master2:9870</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///var/lib/hadoop-hdfs/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>file:///var/lib/hadoop-hdfs/datanode</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://master1:8485;master2:8485;master3:8485/cluster1</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/var/lib/hadoop-hdfs/journalnode</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
</configuration>
EOF
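Once this file is in place, Hadoop itself can confirm that the nameservice and both NameNodes are recognized (the expected values below follow directly from the configuration above):
sudo -u hadoop hdfs getconf -confKey fs.defaultFS
→ hdfs://cluster1
sudo -u hadoop hdfs getconf -namenodes
→ master1 master2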
4-3. yarn-site.xml (all nodes)
sudo tee /etc/hadoop/conf/yarn-site.xml << 'EOF'
<configuration>
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>ycluster</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>master3</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master1:2181,master2:2181,master3:2181</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>512</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>128</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>256</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.version</name>
<value>2.0</value>
</property>
<property>
<name>yarn.timeline-service.fs-writer.root-dir</name>
<value>/atsv2</value>
</property>
<property>
<name>yarn.timeline-service.reader.webapp.address</name>
<value>master1:8188</value>
</property>
<property>
<name>yarn.webapp.ui2.enable</name>
<value>true</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://master1:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master2:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>master3:8088</value>
</property>
</configuration>
EOF
4-4. mapred-site.xml (all nodes)
sudo tee /etc/hadoop/conf/mapred-site.xml << 'EOF'
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
<property>
<name>yarn.app.mapreduce.am.env</name>
<value>HADOOP_COMMON_HOME=/usr/lib/hadoop,HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce</value>
</property>
<property>
<name>mapreduce.map.env</name>
<value>HADOOP_COMMON_HOME=/usr/lib/hadoop,HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce</value>
</property>
<property>
<name>mapreduce.reduce.env</name>
<value>HADOOP_COMMON_HOME=/usr/lib/hadoop,HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce</value>
</property>
<property>
<name>mapreduce.application.classpath</name>
<value>/etc/hadoop/conf,/etc/hadoop/conf/*,/usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,/usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,/usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*,/usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>master1:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>master1:19888</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/mr-history/done</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/mr-history/tmp</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Dlog4j.configuration=file:/etc/hadoop/conf/log4j.properties</value>
</property>
<property>
<name>mapreduce.map.java.opts</name>
<value>-Dlog4j.configuration=file:/etc/hadoop/conf/log4j.properties</value>
</property>
<property>
<name>mapreduce.reduce.java.opts</name>
<value>-Dlog4j.configuration=file:/etc/hadoop/conf/log4j.properties</value>
</property>
<property>
<name>mapreduce.job.am.webapp.address</name>
<value>0.0.0.0:0</value>
</property>
<property>
<name>mapreduce.job.am.webapp.https.address</name>
<value>0.0.0.0:0</value>
</property>
<property>
<name>mapreduce.shuffle.port</name>
<value>13562</value>
</property>
</configuration>
EOF
4-5. workers (all nodes)
sudo tee /etc/hadoop/conf/workers << 'EOF'
worker1
worker2
EOF
5. Create systemd unit files
5-1. JournalNode service (master1/2/3)
sudo tee /etc/systemd/system/hadoop-hdfs-journalnode.service << 'EOF'
[Unit]
Description=Hadoop HDFS JournalNode
After=network.target zookeeper.service
Requires=zookeeper.service
[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop
ExecStart=/usr/bin/hdfs --config ${HADOOP_CONF_DIR} journalnode
Restart=always
RestartSec=5
LimitNOFILE=100000
[Install]
WantedBy=multi-user.target
EOF
5-2. ZKFC service (master1/master2)
sudo tee /etc/systemd/system/hadoop-hdfs-zkfc.service << 'EOF'
[Unit]
Description=Hadoop HDFS ZK Failover Controller
After=network.target zookeeper.service hadoop-hdfs-namenode.service
Requires=zookeeper.service hadoop-hdfs-namenode.service
[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop
ExecStart=/usr/bin/hdfs --config ${HADOOP_CONF_DIR} zkfc
Restart=always
RestartSec=5
LimitNOFILE=100000
[Install]
WantedBy=multi-user.target
EOF
5-3. Apply (master1, master2, master3)
sudo systemctl daemon-reload
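Optionally, systemd can sanity-check the new unit files before they are started (minor warnings can usually be ignored):
systemd-analyze verify /etc/systemd/system/hadoop-hdfs-journalnode.service
# master1/master2 only (master3 has no ZKFC unit)
systemd-analyze verify /etc/systemd/system/hadoop-hdfs-zkfc.service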
6. Configure service auto-start
6-1. All nodes: disable every service once (ignore services that do not exist)
sudo systemctl disable --now hadoop-hdfs-namenode 2>/dev/null || true
sudo systemctl disable --now hadoop-hdfs-datanode 2>/dev/null || true
sudo systemctl disable --now hadoop-yarn-resourcemanager 2>/dev/null || true
sudo systemctl disable --now hadoop-yarn-nodemanager 2>/dev/null || true
sudo systemctl disable --now hadoop-yarn-timelineserver 2>/dev/null || true
sudo systemctl disable --now hadoop-mapreduce-historyserver 2>/dev/null || true
sudo systemctl disable --now hive-metastore 2>/dev/null || true
sudo systemctl disable --now hiveserver2 2>/dev/null || true
sudo systemctl disable --now hive-server2 2>/dev/null || true
sudo systemctl disable --now postgresql 2>/dev/null || true
6-2. master1: enable services for auto-start
sudo systemctl enable zookeeper
sudo systemctl enable hadoop-hdfs-journalnode
sudo systemctl enable hadoop-hdfs-namenode
sudo systemctl enable hadoop-hdfs-zkfc
sudo systemctl enable hadoop-mapreduce-historyserver
sudo systemctl enable hadoop-yarn-timelineserver
sudo systemctl enable hive-metastore
sudo systemctl enable hiveserver2
sudo systemctl enable postgresql
6-3. master2: enable services for auto-start
sudo systemctl enable zookeeper
sudo systemctl enable hadoop-hdfs-journalnode
sudo systemctl enable hadoop-hdfs-namenode
sudo systemctl enable hadoop-hdfs-zkfc
sudo systemctl enable hadoop-yarn-resourcemanager
6-4. master3: enable services for auto-start
sudo systemctl enable zookeeper
sudo systemctl enable hadoop-hdfs-journalnode
sudo systemctl enable hadoop-yarn-resourcemanager
6-5. worker1/worker2: enable services for auto-start
sudo systemctl enable hadoop-hdfs-datanode
sudo systemctl enable hadoop-yarn-nodemanager
7. Initialize and start
7-1. JournalNode (master1/2/3)
sudo systemctl start hadoop-hdfs-journalnode
Check:
ss -lntp | grep 8485
→ OK if 0.0.0.0:8485 is shown as listening
7-2. Format the NameNode (master1)
sudo -u hadoop hdfs namenode -format -nonInteractive
7-3. Initialize ZKFC (master1)
sudo -u hadoop hdfs zkfc -formatZK
7-4. Start the NameNodes
master1 (start nn1):
sudo systemctl start hadoop-hdfs-namenode
master2 (bootstrap nn2 → start):
sudo -u hadoop hdfs namenode -bootstrapStandby
sudo systemctl start hadoop-hdfs-namenode
7-5. Start ZKFC (master1/2)
sudo systemctl start hadoop-hdfs-zkfc
Check (master1):
sudo -u hadoop hdfs haadmin -getServiceState nn1
sudo -u hadoop hdfs haadmin -getServiceState nn2
→ Each command should return the state below (it is fine as long as one of the two is active; an optional failover test is sketched after the list).
- nn1 active
- nn2 standby
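As an optional failover check (safe to skip), stop the active NameNode and confirm the standby takes over automatically; the sketch below assumes nn1 on master1 is currently active:
# master1: stop the active NameNode
sudo systemctl stop hadoop-hdfs-namenode
# master2: nn2 should become active within a few seconds
sudo -u hadoop hdfs haadmin -getServiceState nn2
# master1: start nn1 again; it rejoins as standby
sudo systemctl start hadoop-hdfs-namenode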
7-6. Start the YARN ResourceManagers (master2/3)
sudo systemctl start hadoop-yarn-resourcemanager
Check (master2):
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
→ Each command should return the state below (it is fine as long as one of the two is active).
- rm1 active
- rm2 standby
7-7. Start the workers (worker1/2)
sudo systemctl start hadoop-hdfs-datanode
sudo systemctl start hadoop-yarn-nodemanager
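From one of the masters, confirm that both NodeManagers have registered with the ResourceManager:
sudo -u hadoop yarn node -list
→ worker1 and worker2 should both be listed with state RUNNING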
7-8. JobHistory / ATSv2 (master1)
sudo systemctl start hadoop-mapreduce-historyserver
sudo systemctl start hadoop-yarn-timelineserver
7-9. Check UI2
http://master2:8088/ui2
8. Create the initial HDFS directories (master1)
sudo -u hadoop hdfs dfs -mkdir -p /mr-history/{done,tmp}
sudo -u hadoop hdfs dfs -mkdir -p /app-logs
sudo -u hadoop hdfs dfs -mkdir -p /atsv2
sudo -u hadoop hdfs dfs -chown -R hadoop:hadoop /mr-history /app-logs /atsv2
sudo -u hadoop hdfs dfs -chmod 1777 /app-logs
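A quick listing confirms that the directories exist with the intended owner and permissions:
sudo -u hadoop hdfs dfs -ls / /mr-history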
9. Reconfigure Hive (master1)
9-1. hive-site.xml (master1)
sudo tee /etc/hive/conf/hive-site.xml << 'EOF'
<configuration>
<!-- Metastore DB (reuse the existing one) -->
<property>
<name>javax.jdo.option.ConnectionURL</name>
<value>jdbc:postgresql://localhost:5432/metastore</value>
</property>
<property>
<name>javax.jdo.option.ConnectionDriverName</name>
<value>org.postgresql.Driver</value>
</property>
<property>
<name>javax.jdo.option.ConnectionUserName</name>
<value>hive</value>
</property>
<property>
<name>javax.jdo.option.ConnectionPassword</name>
<value>HiveStrongPassword</value>
</property>
<!-- Metastore service -->
<property>
<name>hive.metastore.uris</name>
<value>thrift://master1:9083</value>
</property>
<!-- HDFS warehouse -->
<property>
<name>hive.metastore.warehouse.dir</name>
<value>/user/hive/warehouse</value>
</property>
<property>
<name>hive.server2.transport.mode</name>
<value>binary</value>
</property>
<property>
<name>hive.server2.thrift.bind.host</name>
<value>0.0.0.0</value>
</property>
<property>
<name>hive.server2.thrift.port</name>
<value>10000</value>
</property>
<property>
<name>hive.metastore.client.notification.event.poll.interval</name>
<value>0s</value>
</property>
<property>
<name>hive.metastore.event.db.notification.api.auth</name>
<value>false</value>
</property>
<property>
<name>hive.server2.enable.doAs</name>
<value>false</value>
</property>
</configuration>
EOF
9-2. Create the Hive warehouse on HDFS (master1)
sudo -u hadoop hdfs dfs -mkdir -p /user/hive/warehouse
sudo -u hadoop hdfs dfs -chown -R hive:hive /user/hive
sudo -u hadoop hdfs dfs -chmod -R 1777 /user/hive/warehouse
9-3. Start the Hive services (master1)
sudo systemctl start postgresql
sudo systemctl start hive-metastore
sudo systemctl start hiveserver2
Port check:
ss -lntp | egrep ':9083|:10000'
→ OK if 0.0.0.0:9083 and 0.0.0.0:10000 are shown as listening
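To double-check that the metastore can reach its PostgreSQL schema, schematool can report the schema version (a sketch; the /usr/lib/hive path matches the beeline path used later, so adjust it if your layout differs):
sudo -u hadoop /usr/lib/hive/bin/schematool -dbType postgres -info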
10. Verification (MapReduce / Hive)
10-1. Web UI (check the NameNode and ResourceManager on whichever node is active)
The following pages should be reachable.
- NameNode : http://master1:9870 or http://master2:9870 (active side)
- RM UI2 : http://master2:8088/ui2 or http://master3:8088/ui2 (active side)
- JobHistory : http://master1:19888
- ATSv2 API : http://master1:8188
10-2. MapReduce test (run on master1)
sudo -u hadoop hdfs dfs -mkdir -p /input
echo "hello hadoop" | sudo -u hadoop hdfs dfs -put - /input/test.txt
sudo -u hadoop hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-3.3.6.jar \
wordcount \
-D mapreduce.map.memory.mb=256 \
-D mapreduce.reduce.memory.mb=256 \
-D yarn.app.mapreduce.am.resource.mb=256 \
/input /output
sudo -u hadoop hdfs dfs -cat /output/part-r-00000
→ Each word and its count should be printed.
The job should also appear in RM UI2 (Applications) and in JobHistory.
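The same can be confirmed from the command line; the finished applications should include the wordcount job:
sudo -u hadoop yarn application -list -appStates FINISHED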
10-3. Hive test (run on master1)
sudo -u hadoop /usr/lib/hive/bin/beeline \
-u 'jdbc:hive2://master1:10000/default' \
-n hadoop \
--hiveconf mapreduce.map.memory.mb=256 \
--hiveconf mapreduce.reduce.memory.mb=256 \
--hiveconf yarn.app.mapreduce.am.resource.mb=256 \
--hiveconf hive.exec.reducers.max=1
→ The prompt "0: jdbc:hive2://master1:10000/default>" should appear.
CREATE TABLE IF NOT EXISTS t1 (col1 int, col2 string);
INSERT INTO t1 VALUES (1,'a'),(2,'b');
SELECT * FROM t1;
→ The following rows should be returned.
1 a
2 b
The job should also appear in RM UI2 (Applications) and in JobHistory.
11. Operational check commands
11-1. List running services (per node)
sudo -u hadoop jps
11-2. Check HA state
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2
→ Shows the active/standby state
11-3. HDFS safemode
hdfs dfsadmin -safemode get
→ Check whether HDFS is in safe mode.
While in safe mode HDFS is effectively read-only, so writes and jobs that write to HDFS will fail.
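If HDFS stays in safe mode even after all DataNodes have reported in, it can be left manually (use with care; normally it exits on its own once enough blocks are reported):
sudo -u hadoop hdfs dfsadmin -safemode leave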