
I built a single-node Hadoop cluster.

Posted at 2025-12-24

Hadoop (HDFS + YARN + MapReduce) Single-Node Setup Guide

Building a single-node Hadoop on Ubuntu 24.04 with BigTop and Java 8

Prerequisites

OS: Ubuntu 24.04

Hadoop: BigTop packages (Hadoop 3.3.6)

Java: OpenJDK 8

Layout: single node (NameNode / DataNode / YARN / MapReduce all on the same host)

HDFS NameNode: localhost:9000

Hadoop run-as user: hadoop

2 GB of memory is enough to run everything.

1. Add the BigTop APT repository

(The repository path says ubuntu/22.04 because BigTop 3.3.0 ships repos for Ubuntu 22.04; those packages are what gets installed on 24.04 here.)

sudo tee /etc/apt/sources.list.d/bigtop.list << 'EOF'
deb [trusted=yes] http://repos.bigtop.apache.org/releases/3.3.0/ubuntu/22.04/amd64 bigtop contrib
EOF

sudo apt update
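If the repository was registered correctly, apt should now offer the BigTop hadoop package as an install candidate. A quick check (the exact version string may differ):

apt-cache policy hadoop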

2. Install Java 8 and the Hadoop packages

sudo apt install -y \
  openjdk-8-jdk \
  hadoop \
  hadoop-hdfs \
  hadoop-yarn \
  hadoop-mapreduce
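Optionally confirm what was installed before moving on:

# Both should report the expected versions (OpenJDK 8, Hadoop 3.3.6)
java -version
hadoop version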

3. Create the hadoop user

sudo useradd -m -g hadoop -s /bin/bash hadoop
sudo passwd hadoop
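Note that useradd -g hadoop assumes the hadoop group already exists; the BigTop packages normally create it. If useradd complained about a missing group, create it and retry:

# Usually unnecessary; only needed if the packages did not create the group
getent group hadoop || sudo groupadd hadoop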

4. Create Hadoop directories

sudo mkdir -p \
  /var/lib/hadoop/tmp \
  /var/lib/hadoop/yarn/local \
  /var/lib/hadoop/yarn/logs

sudo chown -R hadoop:hadoop /var/lib/hadoop

5. Configure core-site.xml

sudo tee /etc/hadoop/conf/core-site.xml << 'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/var/lib/hadoop/tmp</value>
  </property>

  <property>
    <name>yarn.timeline-service.fs-writer.root-dir</name>
    <value>/atsv2</value>
  </property>
</configuration>
EOF
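To confirm Hadoop actually picks the value up, query the effective configuration:

# Should print hdfs://localhost:9000
HADOOP_CONF_DIR=/etc/hadoop/conf hdfs getconf -confKey fs.defaultFS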

6. Configure yarn-site.xml

sudo tee /etc/hadoop/conf/yarn-site.xml << 'EOF'
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>localhost</value>
  </property>

  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>

  <!-- ★ MapReduce fails without this -->
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/var/lib/hadoop/yarn/local</value>
  </property>

  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/var/lib/hadoop/yarn/logs</value>
  </property>

  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>1024</value>
  </property>

  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>1024</value>
  </property>

  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>128</value>
  </property>

  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>

  <property>
    <name>yarn.application.classpath</name>
    <value>
/etc/hadoop/conf,
/etc/hadoop/conf/*,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*
    </value>
  </property>

  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>0.0.0.0:8088</value>
  </property>

  <property>
    <name>yarn.timeline-service.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.timeline-service.version</name>
    <value>2.0</value>
  </property>

  <!-- Use 0.0.0.0 here if you want the UI reachable from other hosts -->
  <property>
    <name>yarn.timeline-service.hostname</name>
    <value>localhost</value>
  </property>

  <!-- RPC (default 10200) -->
  <property>
    <name>yarn.timeline-service.address</name>
    <value>0.0.0.0:10200</value>
  </property>

  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>

  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/app-logs</value>
  </property>

  <property>
    <name>yarn.log.server.url</name>
    <value>http://0.0.0.0:19888/jobhistory/logs</value>
  </property>
 
  <property>
    <name>yarn.webapp.ui2.enable</name>
    <value>true</value>
  </property>
 
  <property>
    <name>yarn.timeline-service.writer.class</name>
    <value>org.apache.hadoop.yarn.server.timelineservice.storage.FileSystemTimelineWriterImpl</value>
  </property>
 
  <property>
    <name>yarn.timeline-service.reader.webapp.address</name>
    <value>0.0.0.0:8188</value>
  </property>

</configuration>
EOF

Also append the ATS v2 timelineservice jars to yarn-env.sh:

sudo tee -a /etc/hadoop/conf/yarn-env.sh >/dev/null <<'EOF'
 
# ATS v2 timelineservice jars
export HADOOP_CLASSPATH="${HADOOP_CLASSPATH}:/usr/lib/hadoop-yarn/timelineservice/*"
export HADOOP_USER_CLASSPATH_FIRST=true
EOF
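As a sanity check (assuming yarn-env.sh is sourced by the yarn launcher, as in stock Hadoop 3), the timelineservice jars should now show up on the classpath:

# An entry under /usr/lib/hadoop-yarn/timelineservice should appear
HADOOP_CONF_DIR=/etc/hadoop/conf yarn classpath | tr ':' '\n' | grep timelineservice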

7. Configure mapred-site.xml

sudo tee /etc/hadoop/conf/mapred-site.xml << 'EOF'
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>

  <property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>
HADOOP_COMMON_HOME=/usr/lib/hadoop,
HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,
HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
    </value>
  </property>

  <property>
    <name>mapreduce.map.env</name>
    <value>
HADOOP_COMMON_HOME=/usr/lib/hadoop,
HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,
HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
    </value>
  </property>

  <property>
    <name>mapreduce.reduce.env</name>
    <value>
HADOOP_COMMON_HOME=/usr/lib/hadoop,
HADOOP_HDFS_HOME=/usr/lib/hadoop-hdfs,
HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce
    </value>
  </property>

  <property>
    <name>mapreduce.application.classpath</name>
    <value>
/etc/hadoop/conf,
/etc/hadoop/conf/*,
/usr/lib/hadoop/*,
/usr/lib/hadoop/lib/*,
/usr/lib/hadoop-hdfs/*,
/usr/lib/hadoop-hdfs/lib/*,
/usr/lib/hadoop-mapreduce/*,
/usr/lib/hadoop-mapreduce/lib/*,
/usr/lib/hadoop-yarn/*,
/usr/lib/hadoop-yarn/lib/*
    </value>
  </property>

  <!-- JobHistoryServer RPC -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>localhost:10020</value>
  </property>

  <!-- JobHistoryServer Web UI -->
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>

  <!-- Destination for job history files (HDFS) -->
  <property>
    <name>mapreduce.jobhistory.done-dir</name>
    <value>/mr-history/done</value>
  </property>

  <property>
    <name>mapreduce.jobhistory.intermediate-done-dir</name>
    <value>/mr-history/tmp</value>
  </property>

</configuration>
EOF
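A malformed tag in any of these heredocs would only surface when a daemon starts, so it's cheaper to validate the XML now. This assumes xmllint is available (package libxml2-utils):

# Fail fast on malformed XML before starting any daemons
for f in core-site.xml yarn-site.xml mapred-site.xml; do
  xmllint --noout /etc/hadoop/conf/$f && echo "$f OK"
done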

8. Configure the hadoop user's .bashrc

sudo tee -a /home/hadoop/.bashrc << 'EOF'
export HADOOP_CONF_DIR=/etc/hadoop/conf
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64
export PATH=$PATH:/usr/lib/hadoop/bin
EOF

sudo chown hadoop:hadoop /home/hadoop/.bashrc
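You can verify the environment takes effect in a login shell. (A non-interactive check such as su -c won't work here, because Ubuntu's default .bashrc returns early for non-interactive shells, before the appended exports.)

sudo su - hadoop
echo $JAVA_HOME        # should print /usr/lib/jvm/java-8-openjdk-amd64
hadoop version | head -n 1
exit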

9. Switch to the hadoop user

sudo su - hadoop

10. Format HDFS

hdfs namenode -format -nonInteractive
exit
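Back as your admin user, you can confirm the format succeeded. With the defaults used here, dfs.namenode.name.dir resolves to ${hadoop.tmp.dir}/dfs/name:

# current/ should contain VERSION and an initial fsimage
sudo ls /var/lib/hadoop/tmp/dfs/name/current/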

11. Create systemd services

Common environment file

sudo tee /etc/default/hadoop <<'EOF'
HADOOP_HOME=/usr/lib/hadoop
HADOOP_CONF_DIR=/etc/hadoop/conf
YARN_CONF_DIR=/etc/hadoop/conf
MAPRED_CONF_DIR=/etc/hadoop/conf

# Set JAVA_HOME explicitly if needed
JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64

# Avoid tput warnings (optional)
TERM=dumb
EOF

NameNode

sudo tee /etc/systemd/system/hadoop-hdfs-namenode.service <<'EOF'
[Unit]
Description=Hadoop HDFS NameNode
After=network.target
Wants=network.target

[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop

# Example: if your HDFS data directory is /var/lib/hadoop-hdfs or similar, match it here
WorkingDirectory=/var/lib/hadoop-hdfs

ExecStart=/usr/bin/hdfs --config ${HADOOP_CONF_DIR} namenode
Restart=on-failure
RestartSec=5
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target
EOF

DataNode

sudo tee /etc/systemd/system/hadoop-hdfs-datanode.service <<'EOF'
[Unit]
Description=Hadoop HDFS DataNode
After=network.target
Wants=network.target

[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop
WorkingDirectory=/var/lib/hadoop-hdfs

ExecStart=/usr/bin/hdfs --config ${HADOOP_CONF_DIR} datanode
Restart=on-failure
RestartSec=5
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target
EOF

ResourceManager

sudo tee /etc/systemd/system/hadoop-yarn-resourcemanager.service <<'EOF'
[Unit]
Description=Hadoop YARN ResourceManager
After=network.target
Wants=network.target

[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop
WorkingDirectory=/var/lib/hadoop-yarn

ExecStart=/usr/bin/yarn --config ${HADOOP_CONF_DIR} resourcemanager
Restart=on-failure
RestartSec=5
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target
EOF

NodeManager

sudo tee /etc/systemd/system/hadoop-yarn-nodemanager.service <<'EOF'
[Unit]
Description=Hadoop YARN NodeManager
After=network.target
Wants=network.target

[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop
WorkingDirectory=/var/lib/hadoop-yarn

ExecStart=/usr/bin/yarn --config ${HADOOP_CONF_DIR} nodemanager
Restart=on-failure
RestartSec=5
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target
EOF

TimelineServer

sudo tee /etc/systemd/system/hadoop-yarn-timelineserver.service << 'EOF'
[Unit]
Description=Hadoop YARN Timeline Server (ATS v2)
After=network.target
Wants=network.target

[Service]
Type=forking
User=hadoop
Group=hadoop
EnvironmentFile=/etc/default/hadoop
ExecStart=/usr/lib/hadoop-yarn/bin/yarn --daemon start timelinereader
ExecStop=/usr/lib/hadoop-yarn/bin/yarn --daemon stop timelinereader
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

JobHistoryServer

sudo tee /etc/systemd/system/hadoop-mapreduce-historyserver.service <<'EOF'
[Unit]
Description=Hadoop MapReduce JobHistoryServer
After=network.target
Wants=network.target

[Service]
Type=simple
User=hadoop
Group=hadoop
EnvironmentFile=-/etc/default/hadoop

# Start with the conf dir given explicitly
ExecStart=/usr/bin/mapred --config ${HADOOP_CONF_DIR} historyserver

Restart=on-failure
RestartSec=5
LimitNOFILE=100000

[Install]
WantedBy=multi-user.target
EOF

Start the services (HDFS)

sudo systemctl daemon-reload
sudo systemctl enable --now hadoop-hdfs-namenode
sudo systemctl enable --now hadoop-hdfs-datanode

Verify:

sudo su - hadoop
jps

You should see the following:

NameNode

DataNode
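Still in the hadoop session, you can also ask HDFS itself whether the DataNode has registered:

# "Live datanodes (1)" should appear near the top of the report
hdfs dfsadmin -report | head -n 20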

Create the JobHistory, YARN log-aggregation, and TimelineServer directories in HDFS

# Create the JobHistory directories
hdfs dfs -mkdir -p /mr-history/done /mr-history/tmp
hdfs dfs -chown -R hadoop:hadoop /mr-history
hdfs dfs -chmod -R 1777 /mr-history
# Create the YARN log-aggregation directory
hdfs dfs -mkdir -p /app-logs
hdfs dfs -chown -R hadoop:hadoop /app-logs
hdfs dfs -chmod 1777 /app-logs
# Create the TimelineServer directory
hdfs dfs -mkdir -p /atsv2
hdfs dfs -chown yarn:yarn /atsv2
hdfs dfs -chmod 1777 /atsv2
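Still as the hadoop user, a quick listing confirms the directories are in place before you exit:

# /app-logs, /atsv2 and /mr-history should all be listed
hdfs dfs -ls /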

exit

Start the services (YARN)

sudo systemctl enable --now hadoop-yarn-resourcemanager
sudo systemctl enable --now hadoop-yarn-nodemanager
sudo systemctl enable --now hadoop-yarn-timelineserver
sudo systemctl enable --now hadoop-mapreduce-historyserver

Verify:

sudo su - hadoop
jps

You should see the following:

NameNode

DataNode

ResourceManager

NodeManager

TimelineReaderServer

JobHistoryServer
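Beyond jps, YARN's own view is worth a check while still in the hadoop session:

# Exactly one NodeManager should be listed as RUNNING
yarn node -list
# ResourceManager cluster info over the standard REST endpoint
curl -s http://localhost:8088/ws/v1/cluster/info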

12. Smoke test (WordCount)

hdfs dfs -mkdir /input
echo "hello hadoop hadoop yarn" | hdfs dfs -put - /input/test.txt

hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples-3.3.6.jar \
  wordcount \
  -D mapreduce.map.memory.mb=256 \
  -D mapreduce.reduce.memory.mb=256 \
  -D yarn.app.mapreduce.am.resource.mb=512 \
  -D mapreduce.map.java.opts="-Xmx200m" \
  -D mapreduce.reduce.java.opts="-Xmx200m" \
  -D yarn.app.mapreduce.am.command-opts="-Xmx400m" \
  /input /output

Check the result:

hdfs dfs -cat /output/part-r-00000
→ Each word should be printed together with its count.

To redo the test, run the following command first, then run the job again.
hdfs dfs -rm -r -skipTrash /output

13. Smoke test (web UIs)

Confirm that the following web UIs can be reached.

Web UI                           URL
NameNode                         http://<host-ip>:9870
DataNode                         http://<host-ip>:9864
ResourceManager                  http://<host-ip>:8088
NodeManager                      http://<host-ip>:8042
JobHistory                       http://<host-ip>:19888
TimelineService (API response)   http://<host-ip>:8188/ws/v2/timeline
YARN UI2                         http://<host-ip>:8088/ui2
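If the host is headless, curl works just as well as a browser. A minimal sketch, run on the Hadoop host itself:

# Each UI port should answer with an HTTP status code (200, or a redirect)
for p in 9870 9864 8088 8042 19888 8188; do
  echo -n "port $p: "
  curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:$p/"
done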