Running Hadoop in pseudo-distributed mode

Posted at 2017-08-11

This has been written up many times already, but I needed Hadoop to run an existing method, so I installed it.

Installation

Download the binary tarball from http://hadoop.apache.org/releases.html
Place it somewhere suitable (I put it under /usr/local)

wget http://ftp.tsukuba.wide.ad.jp/software/apache/hadoop/common/hadoop-2.7.4/hadoop-2.7.4.tar.gz
tar zxvf hadoop-2.7.4.tar.gz
sudo mv hadoop-2.7.4 /usr/local/hadoop

Hadoop configuration

/etc/profile.d/hadoop.sh

  • Set JAVA_HOME to match your own Java installation
export HADOOP_HOME=/usr/local/hadoop
export PATH="$HADOOP_HOME/bin:$PATH"
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib/native"
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export JAVA_HOME=/usr/lib/jvm/java-8-oracle

/etc/hadoop/core-site.xml

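  • HDFS-related settings
  • The /etc/hadoop/... paths in this article are relative to $HADOOP_HOME (the HADOOP_CONF_DIR set above), not the system /etc
  • Since this is a pseudo-distributed setup, fs.default.name points to localhost (in Hadoop 2.x this key is a deprecated alias of fs.defaultFS)
  • hadoop.tmp.dir specifies where data is stored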
[/etc/hadoop/core-site.xml]
<configuration>
    <property>
        <name>fs.default.name</name>
        <value>hdfs://localhost:9000</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/home/[username]/data/hadoop</value>
    </property>
</configuration>

hdfs-site.xml

[/etc/hadoop/hdfs-site.xml]
<configuration>
 <property>
  <name>dfs.replication</name>
  <value>1</value>
 </property>
</configuration>

mapred-site.xml

$ cp $HADOOP_HOME/etc/hadoop/mapred-site.xml.template $HADOOP_HOME/etc/hadoop/mapred-site.xml
[/etc/hadoop/mapred-site.xml]
<configuration>
 <property>
  <name>mapred.job.tracker</name>
  <value>localhost:9001</value>
 </property>
</configuration>

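Check that you can ssh into localhost without a passphrase

  • Run ssh localhost and confirm that you can log in without being asked for a passphrase
  • If you cannot, add your public key to authorized_keys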
$ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
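
Running the cat command above more than once appends the same key each time. A minimal sketch of an idempotent variant follows; the function name authorize_key is hypothetical, and the chmod 600 is there on the assumption that sshd is running with its usual strict permission checks.

```shell
# Hypothetical helper: append a public key to an authorized_keys file only
# if that exact line is not already present, then tighten permissions.
# Usage: authorize_key <public key file> <authorized_keys file>
authorize_key() {
    pub="$1"
    auth="$2"
    mkdir -p "$(dirname "$auth")"
    touch "$auth"
    # -q: quiet, -x: match whole line, -F: treat the key as a fixed string
    if ! grep -qxF -- "$(cat "$pub")" "$auth"; then
        cat "$pub" >> "$auth"
    fi
    chmod 600 "$auth"
}
```

For the setup in this article it would be called as authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys.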

Format the file system

$ hdfs namenode -format

Start the HDFS daemons

Start them with sbin/start-dfs.sh (run from $HADOOP_HOME)
Use jps to confirm they are running (DataNode and NameNode should be listed)

$ sbin/start-dfs.sh
$ jps
14560 DataNode
14375 NameNode
14791 SecondaryNameNode
15421 Jps
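
The jps check can also be scripted. This is a minimal sketch: the function name check_hdfs_daemons is hypothetical, and it only assumes that jps prints one "<pid> <name>" pair per line as in the listing above.

```shell
# Hypothetical helper: reads `jps` output on stdin and reports whether the
# HDFS daemons this article looks for (NameNode, DataNode) are present.
check_hdfs_daemons() {
    listing="$(cat)"
    missing=0
    for daemon in NameNode DataNode; do
        # Compare the second field exactly, so that SecondaryNameNode
        # is not mistaken for NameNode.
        if ! printf '%s\n' "$listing" |
            awk -v d="$daemon" '$2 == d { found = 1 } END { exit !found }'; then
            echo "missing: $daemon"
            missing=1
        fi
    done
    if [ "$missing" -eq 0 ]; then
        echo "ok"
    fi
    return "$missing"
}
```

It would be used as jps | check_hdfs_daemons, which prints ok when both daemons are listed and one "missing: ..." line per absent daemon otherwise.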

Running MapReduce

Create a working directory on HDFS

$ hadoop fs -mkdir /user
$ hadoop fs -mkdir /user/<username>

Run the example

$ hadoop fs -put etc/hadoop input
$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.4.jar grep input output 'dfs[a-z.]+'

Check the results

$ hadoop fs -cat output/*
8	dfs.audit.logger
4	dfs.class
3	dfs.server.namenode.
2	dfs.period
2	dfs.audit.log.maxfilesize
2	dfs.audit.log.maxbackupindex
1	dfsmetrics.log
1	dfsadmin
1	dfs.servers
1	dfs.replication
1	dfs.file
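
For intuition, the grep example counts every match of the regular expression across the input files and lists the most frequent first. A rough local equivalent using standard tools is sketched below. The function name local_grep_count is hypothetical, the order of entries with equal counts may differ from the Hadoop output, and it treats the pattern as a POSIX extended regex, which for dfs[a-z.]+ should behave the same as the Java regex the job uses.

```shell
# Hypothetical helper: count the matches of an extended regex across local
# files and print "count<TAB>match", most frequent first.
# Usage: local_grep_count <pattern> <file>...
local_grep_count() {
    pattern="$1"
    shift
    # -o: print only the matched text, -h: no file names, -E: extended regex
    grep -ohE -- "$pattern" "$@" |
        sort | uniq -c |
        sort -k1,1nr -k2 |
        awk '{ print $1 "\t" $2 }'
}
```

Running local_grep_count 'dfs[a-z.]+' etc/hadoop/* from $HADOOP_HOME gives something to compare against the job output on a small input.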

YARN configuration

[/etc/hadoop/yarn-site.xml]
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.memory-mb</name>
      <value>4096</value>
    </property>
    <property>
      <name>yarn.nodemanager.resource.cpu-vcores</name>
      <value>2</value>
    </property>
</configuration>

Add memory settings to mapred-site

[/etc/hadoop/mapred-site.xml]
<configuration>
  <!-- added -->
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1024</value>
  </property>
</configuration>

Without the memory settings, the NodeManager did not start properly (I struggled with this for quite a while)

[logs/yarn--nodemanager-milk.log]
YarnRuntimeException: Recieved SHUTDOWN signal from Resourcemanager ,Registration of NodeManager failed, Message from ResourceManager: NodeManager from [hostname] doesn't satisfy minimum allocations, Sending SHUTDOWN signal to the NodeManager.

If memory is not configured appropriately, a log like this appears
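
As far as I understand it, the ResourceManager sends this SHUTDOWN when the memory or vcores a NodeManager advertises are below the scheduler minimums (yarn.scheduler.minimum-allocation-mb and yarn.scheduler.minimum-allocation-vcores, whose defaults in Hadoop 2.7 I believe are 1024 and 1). The sketch below only mirrors that comparison so the values can be sanity-checked before starting YARN. The function name nm_satisfies_min_alloc is hypothetical and the rule itself is an assumption, not something taken from this article.

```shell
# Hypothetical helper mirroring the assumed registration check.
# Arguments: NodeManager memory (MB), NodeManager vcores,
#            scheduler minimum memory (MB, default 1024),
#            scheduler minimum vcores (default 1).
nm_satisfies_min_alloc() {
    nm_mem="$1"
    nm_vcores="$2"
    min_mem="${3:-1024}"
    min_vcores="${4:-1}"
    if [ "$nm_mem" -lt "$min_mem" ] || [ "$nm_vcores" -lt "$min_vcores" ]; then
        echo "rejected"
        return 1
    fi
    echo "accepted"
}
```

With the yarn-site.xml values used here, nm_satisfies_min_alloc 4096 2 reports accepted under the assumed defaults.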
