Building a server with MongoDB 3.0.7 + WiredTiger + ReplicaSet + NewRelic

Posted at 2015-11-18

A server needed replacing, so I did this kind of work for the first time in a while.
I think the result is suitable for production use.

My notes follow.

Environment

OS: Ubuntu 14.04 LTS
Instance: m4.4xlarge, 16 vCPUs, 64 GiB memory, gp2 SSD EBS [size] GiB x 4 (for RAID 10)
With 300 GiB EBS volumes, 600 GiB of usable space is available.
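
The 600 GiB figure follows from RAID 10 keeping two copies of every block, so half of the raw capacity is usable. A minimal sketch of that arithmetic, assuming an even number of equally sized volumes; the helper name `raid10_usable_gib` is made up for illustration and is not part of mdadm:

```shell
# Usable capacity of a RAID 10 array that stores two copies of each block.
# $1 = size of each volume in GiB, $2 = number of volumes.
# Hypothetical helper for illustration only.
raid10_usable_gib() {
  echo $(( $1 * $2 / 2 ))
}

raid10_usable_gib 300 4   # prints 600
```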

References

https://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/
https://docs.mongodb.org/manual/administration/production-checklist/
https://docs.mongodb.org/manual/administration/production-notes/
https://docs.mongodb.org/manual/faq/diagnostics/#faq-keepalive
https://docs.mongodb.org/manual/reference/ulimit/
https://mongodb-documentation.readthedocs.org/en/latest/ecosystem/tutorial/install-mongodb-on-amazon-ec2.html
http://stackoverflow.com/questions/28911634/how-to-avoid-transparent-hugepage-defrag-warning-from-mongodb

Server setup

Time settings

The server lives in a VPC, so it refers to an NTP server inside the VPC.

$ sudo apt-get install ntp
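
Installing the package alone leaves the default public servers in /etc/ntp.conf. One way to point ntpd at an internal server is sketched below. The address 10.0.0.2 and the function name `rewrite_ntp_conf` are assumptions for illustration; substitute the address of your own NTP server.

```shell
# Hypothetical internal NTP server address; replace with your own.
NTP_SERVER="${NTP_SERVER:-10.0.0.2}"

# Read an ntp.conf on stdin, comment out the existing server/pool lines,
# and append a line for the internal server. Writes the result to stdout.
rewrite_ntp_conf() {
  sed -E 's/^(server|pool)[[:space:]]/# &/'
  printf 'server %s iburst\n' "$NTP_SERVER"
}
```

A possible workflow is to run `rewrite_ntp_conf < /etc/ntp.conf > /tmp/ntp.conf.new`, review the result, copy it over /etc/ntp.conf, and restart the service with `sudo service ntp restart`.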

tcp_keepalive settings

$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
$ echo 'net.ipv4.tcp_keepalive_time=300' | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -w net.ipv4.tcp_keepalive_time=300
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
300
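
The before/after check above can be wrapped in a small helper, for example to rerun after a reboot and confirm that the sysctl.conf entry took effect. This is only a sketch; the function name `keepalive_at_most` is hypothetical.

```shell
# Succeeds when the keepalive time stored in file $1 is at most $2 seconds.
# $1 defaults to the live kernel value, $2 to the 300 seconds used above.
keepalive_at_most() {
  file="${1:-/proc/sys/net/ipv4/tcp_keepalive_time}"
  limit="${2:-300}"
  [ "$(cat "$file")" -le "$limit" ]
}
```

For example, `keepalive_at_most && echo OK` prints OK only when the current value is 300 or lower.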

ulimit settings

$ cat <<EOF | sudo tee -a /etc/security/limits.conf
* hard nofile 65536
* soft nofile 65536
* soft nproc 65536
* hard nproc 65536
EOF
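
Entries in limits.conf are applied through PAM at login, so they may not reach a daemon started by init. It is worth confirming what the running mongod process actually received. A minimal sketch that reads the soft open-files limit from a /proc/<pid>/limits style file; the function name `soft_nofile` is hypothetical:

```shell
# Print the soft "Max open files" limit from a /proc/<pid>/limits style file.
# $1 = path to the limits file.
soft_nofile() {
  awk '/^Max open files/ { print $4 }' "$1"
}
```

With mongod running, `soft_nofile "/proc/$(pidof mongod)/limits"` shows the value in effect for that process.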

transparent_hugepage settings

$ sudo vi /etc/init/mongod.conf
...
...
chown $DAEMONUSER /var/run/mongodb.pid

# Add the following lines directly below the line above

if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
   echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
   echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
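
After restarting mongod, the kernel files can be read back to confirm the change. Their content lists all choices and marks the active one with brackets, for example "always madvise [never]". A small sketch that extracts the active value; the function name `thp_active` is hypothetical:

```shell
# Print the active setting from a transparent_hugepage file,
# whose content looks like "always madvise [never]".
# $1 = path to the file.
thp_active() {
  sed -E 's/.*\[([a-z]+)\].*/\1/' "$1"
}
```

Both `thp_active /sys/kernel/mm/transparent_hugepage/enabled` and `thp_active /sys/kernel/mm/transparent_hugepage/defrag` should print never once the pre-start lines have run.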

RAID 10 setup

$ sudo apt-get install mdadm lvm2
$ sudo mdadm --verbose --create /dev/md0 --level=10 --chunk=256 --raid-devices=4 /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde
$ echo 'DEVICE /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde' | sudo tee -a /etc/mdadm/mdadm.conf
$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
$ sudo blockdev --setra 128 /dev/md0
$ sudo blockdev --setra 128 /dev/xvdb
$ sudo blockdev --setra 128 /dev/xvdc
$ sudo blockdev --setra 128 /dev/xvdd
$ sudo blockdev --setra 128 /dev/xvde
$ sudo dd if=/dev/zero of=/dev/md0 bs=512 count=1
$ sudo pvcreate /dev/md0
$ sudo vgcreate vg0 /dev/md0
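
Before building on top of the array, it can help to confirm that the kernel reports it as an active RAID 10 device. A minimal sketch that parses /proc/mdstat formatted input; the function name `md_level` is hypothetical:

```shell
# Print the RAID level of array $1 (e.g. md0) if it is active.
# Reads /proc/mdstat formatted text on stdin.
md_level() {
  awk -v dev="$1" '$1 == dev && $3 == "active" { print $4 }'
}
```

Running `md_level md0 < /proc/mdstat` is expected to print raid10 for the array created above. `sudo mdadm --detail /dev/md0` gives the full picture, including sync progress.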

Create the logical volumes
Allocate 5% each to log and journal
Judging from the servers currently in operation, 5% looks sufficient

$ sudo lvcreate -l 90%vg -n data vg0
$ sudo lvcreate -l 5%vg -n log vg0
$ sudo lvcreate -l 5%vg -n journal vg0
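
To get a feel for the resulting sizes, the split can be worked out by hand. The sketch below assumes the 600 GiB example from the Environment section and uses plain integer arithmetic; LVM allocates in extents, so real sizes differ slightly. The function name `lv_size_gib` is hypothetical.

```shell
# Print the size in GiB of a logical volume that takes $2 percent
# of a volume group of $1 GiB. Integer arithmetic only.
lv_size_gib() {
  echo $(( $1 * $2 / 100 ))
}

# Prints "data 540", "log 30", "journal 30" for a 600 GiB volume group.
for spec in data:90 log:5 journal:5; do
  echo "${spec%%:*} $(lv_size_gib 600 "${spec##*:}")"
done
```

The actual allocation can be checked afterwards with `sudo lvs vg0`.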

Format and mount the volumes

$ sudo mke2fs -t ext4 -F /dev/vg0/data
$ sudo mke2fs -t ext4 -F /dev/vg0/log
$ sudo mke2fs -t ext4 -F /dev/vg0/journal

$ sudo mkdir /data
$ sudo mkdir /log
$ sudo mkdir /journal

$ echo '/dev/vg0/data /data ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ echo '/dev/vg0/log /log ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ echo '/dev/vg0/journal /journal ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab

$ sudo mount /data
$ sudo mount /log
$ sudo mount /journal

$ sudo ln -s /journal /data/journal
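
Once the volumes are mounted, it is easy to confirm that the fstab options were actually applied. A minimal sketch that looks up a mount point in /proc/mounts formatted input; the function name `has_mount_option` is hypothetical:

```shell
# Succeeds if mount point $1 carries mount option $2.
# Reads /proc/mounts formatted lines on stdin.
has_mount_option() {
  awk -v mp="$1" -v opt="$2" '
    $2 == mp {
      n = split($4, o, ",")
      for (i = 1; i <= n; i++) if (o[i] == opt) found = 1
    }
    END { exit !found }
  '
}
```

For example, `has_mount_option /data noatime < /proc/mounts && echo OK` prints OK when /data is mounted with noatime.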

Benchmark

  • sudo fio -filename=/data/test2g -direct=1 -rw=read -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=64): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [100.0% done] [42685KB/0KB/0KB /s] [10.7K/0/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=14839: Mon Nov 16 17:53:33 2015
  read : io=396392KB, bw=39552KB/s, iops=9888, runt= 10022msec
    clat (usec): min=169, max=45337, avg=6461.15, stdev=8144.68
     lat (usec): min=169, max=45337, avg=6461.37, stdev=8144.70
    clat percentiles (usec):
     |  1.00th=[  187],  5.00th=[  195], 10.00th=[  201], 20.00th=[  231],
     | 30.00th=[  298], 40.00th=[  980], 50.00th=[ 2608], 60.00th=[ 4640],
     | 70.00th=[ 7968], 80.00th=[12608], 90.00th=[21632], 95.00th=[24192],
     | 99.00th=[27520], 99.50th=[28032], 99.90th=[32384], 99.95th=[35584],
     | 99.99th=[39680]
    bw (KB  /s): min=  122, max= 5168, per=1.57%, avg=622.89, stdev=670.75
    lat (usec) : 250=22.93%, 500=11.81%, 750=4.96%, 1000=0.32%
    lat (msec) : 2=7.59%, 4=10.08%, 10=17.55%, 20=12.40%, 50=12.36%
  cpu          : usr=0.00%, sys=0.25%, ctx=99704, majf=0, minf=1997
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=99098/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=396392KB, aggrb=39552KB/s, minb=39552KB/s, maxb=39552KB/s, mint=10022msec, maxt=10022msec

Disk stats (read/write):
    dm-0: ios=98301/69, merge=0/0, ticks=630344/1064, in_queue=631676, util=99.97%, aggrios=99098/72, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
    md0: ios=99098/72, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=24081/34, aggrmerge=728/1, aggrticks=155786/262, aggrin_queue=146501, aggrutil=95.07%
  xvdb: ios=23968/36, merge=1653/2, ticks=223996/344, in_queue=213304, util=95.07%
  xvdc: ios=24994/36, merge=0/2, ticks=90672/136, in_queue=83920, util=88.84%
  xvdd: ios=23766/32, merge=1262/0, ticks=251956/432, in_queue=242256, util=89.63%
  xvde: ios=23597/32, merge=0/0, ticks=56520/136, in_queue=46524, util=83.43%
  • sudo fio -filename=/data/test2g -direct=1 -rw=write -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=10): [WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW] [100.0% done] [0KB/6669KB/0KB /s] [0/1667/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=14910: Mon Nov 16 17:54:29 2015
  write: io=66740KB, bw=6647.5KB/s, iops=1661, runt= 10040msec
    clat (usec): min=454, max=10014K, avg=5983.49, stdev=229869.61
     lat (usec): min=454, max=10014K, avg=5983.84, stdev=229869.61
    clat percentiles (usec):
     |  1.00th=[  470],  5.00th=[  486], 10.00th=[  494], 20.00th=[  506],
     | 30.00th=[  516], 40.00th=[  524], 50.00th=[  540], 60.00th=[  548],
     | 70.00th=[  572], 80.00th=[  612], 90.00th=[  716], 95.00th=[  892],
     | 99.00th=[ 1784], 99.50th=[ 2800], 99.90th=[24192], 99.95th=[8978432],
     | 99.99th=[10027008]
    bw (KB  /s): min=    0, max= 7056, per=23.20%, avg=1542.24, stdev=2828.77
    lat (usec) : 500=13.74%, 750=77.66%, 1000=4.90%
    lat (msec) : 2=2.94%, 4=0.37%, 10=0.12%, 20=0.07%, 50=0.16%
    lat (msec) : >=2000=0.05%
  cpu          : usr=0.00%, sys=0.13%, ctx=33637, majf=0, minf=1902
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=16685/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=66740KB, aggrb=6647KB/s, minb=6647KB/s, maxb=6647KB/s, mint=10040msec, maxt=10040msec

Disk stats (read/write):
    dm-0: ios=0/16518, merge=0/0, ticks=0/9696, in_queue=9696, util=95.25%, aggrios=0/16759, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
    md0: ios=0/16759, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/8376, aggrmerge=0/2, aggrticks=0/4400, aggrin_queue=4399, aggrutil=46.64%
  xvdb: ios=0/8419, merge=0/4, ticks=0/4820, in_queue=4820, util=46.64%
  xvdc: ios=0/8419, merge=0/4, ticks=0/4792, in_queue=4788, util=46.40%
  xvdd: ios=0/8334, merge=0/0, ticks=0/3732, in_queue=3732, util=36.75%
  xvde: ios=0/8334, merge=0/0, ticks=0/4256, in_queue=4256, util=41.24%
  • sudo fio -filename=/data/test2g -direct=1 -rw=randread -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=64): [rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr] [100.0% done] [47364KB/0KB/0KB /s] [11.9K/0/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=15053: Mon Nov 16 17:56:00 2015
  read : io=480728KB, bw=47962KB/s, iops=11990, runt= 10023msec
    clat (usec): min=180, max=73261, avg=5328.61, stdev=8309.87
     lat (usec): min=180, max=73261, avg=5328.81, stdev=8309.87
    clat percentiles (usec):
     |  1.00th=[  205],  5.00th=[  270], 10.00th=[  314], 20.00th=[  354],
     | 30.00th=[  370], 40.00th=[  398], 50.00th=[  466], 60.00th=[  820],
     | 70.00th=[ 4704], 80.00th=[11200], 90.00th=[20096], 95.00th=[23680],
     | 99.00th=[31104], 99.50th=[34048], 99.90th=[40704], 99.95th=[47360],
     | 99.99th=[58112]
    bw (KB  /s): min=  378, max= 1229, per=1.57%, avg=750.94, stdev=123.63
    lat (usec) : 250=4.75%, 500=51.10%, 750=3.81%, 1000=1.11%
    lat (msec) : 2=3.14%, 4=4.64%, 10=9.95%, 20=11.31%, 50=10.16%
    lat (msec) : 100=0.04%
  cpu          : usr=0.00%, sys=0.32%, ctx=120351, majf=0, minf=2060
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=120182/w=0/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
   READ: io=480728KB, aggrb=47962KB/s, minb=47962KB/s, maxb=47962KB/s, mint=10023msec, maxt=10023msec

Disk stats (read/write):
    dm-0: ios=118923/68, merge=0/0, ticks=631564/784, in_queue=632708, util=98.93%, aggrios=120182/69, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
    md0: ios=120182/69, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=30044/33, aggrmerge=0/0, aggrticks=159872/247, aggrin_queue=160119, aggrutil=98.34%
  xvdb: ios=30276/33, merge=1/0, ticks=269808/436, in_queue=270244, util=98.34%
  xvdc: ios=29990/33, merge=2/0, ticks=308140/496, in_queue=308636, util=98.34%
  xvdd: ios=30153/34, merge=0/1, ticks=36340/8, in_queue=36348, util=86.04%
  xvde: ios=29760/34, merge=0/1, ticks=25200/48, in_queue=25248, util=71.45%
  • sudo fio -filename=/data/test2g -direct=1 -rw=randwrite -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=4): [wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww] [100.0% done] [0KB/6529KB/0KB /s] [0/1632/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=15123: Mon Nov 16 17:56:59 2015
  write: io=60532KB, bw=6030.3KB/s, iops=1507, runt= 10038msec
    clat (usec): min=457, max=9942.4K, avg=2708.25, stdev=139970.30
     lat (usec): min=457, max=9942.4K, avg=2708.63, stdev=139970.31
    clat percentiles (usec):
     |  1.00th=[  482],  5.00th=[  498], 10.00th=[  506], 20.00th=[  524],
     | 30.00th=[  532], 40.00th=[  540], 50.00th=[  556], 60.00th=[  572],
     | 70.00th=[  604], 80.00th=[  660], 90.00th=[  828], 95.00th=[ 1128],
     | 99.00th=[ 2608], 99.50th=[ 7904], 99.90th=[29312], 99.95th=[33536],
     | 99.99th=[9895936]
    bw (KB  /s): min=    0, max= 6536, per=23.07%, avg=1390.90, stdev=2551.96
    lat (usec) : 500=6.33%, 750=80.74%, 1000=6.48%
    lat (msec) : 2=5.09%, 4=0.68%, 10=0.28%, 20=0.15%, 50=0.22%
    lat (msec) : 100=0.01%, >=2000=0.02%
  cpu          : usr=0.00%, sys=0.12%, ctx=30378, majf=0, minf=1901
  IO depths    : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
     submit    : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     complete  : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
     issued    : total=r=0/w=15133/d=0, short=r=0/w=0/d=0

Run status group 0 (all jobs):
  WRITE: io=60532KB, aggrb=6030KB/s, minb=6030KB/s, maxb=6030KB/s, mint=10038msec, maxt=10038msec

Disk stats (read/write):
    dm-0: ios=0/14970, merge=0/0, ticks=0/9980, in_queue=9888, util=96.85%, aggrios=0/15206, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
    md0: ios=0/15206, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=82/7600, aggrmerge=0/1, aggrticks=21655/4331, aggrin_queue=11324, aggrutil=51.30%
  xvdb: ios=86/7599, merge=0/0, ticks=22784/5116, in_queue=12536, util=51.30%
  xvdc: ios=76/7599, merge=0/0, ticks=18564/4376, in_queue=10172, util=43.85%
  xvdd: ios=84/7602, merge=0/3, ticks=22656/3692, in_queue=11224, util=37.75%
  xvde: ios=85/7602, merge=0/3, ticks=22616/4140, in_queue=11364, util=41.45%

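Summary: IOPS is roughly 10,000 to 12,000 for reads (9,888 sequential, 11,990 random) and roughly 1,500 to 1,700 for writes (1,661 sequential, 1,507 random).
Judging from iostat monitoring so far, this is sufficient, so gp2 SSD looks workable.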
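
When comparing several runs, pulling the IOPS figure out of each fio report by hand gets tedious. A minimal sketch that extracts it from the summary line, assuming the fio 2.1.x text format shown above; the function name `fio_iops` is hypothetical:

```shell
# Print the iops value from the "read :" / "write:" summary line
# of fio 2.1.x text output supplied on stdin.
fio_iops() {
  sed -n 's/.*, iops=\([0-9]*\),.*/\1/p'
}
```

For example, appending `| fio_iops` to one of the fio commands above prints only the number, such as 9888 for the sequential read run.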

MongoDB settings

$ sudo chown mongodb:mongodb /data /log /journal

Sample mongod.conf

We use Snappy compression, so the settings reflect that

storage: # default snappy
    dbPath: "/data"
    engine: "wiredTiger"
    directoryPerDB: true
    # storage.wiredTiger.engineConfig.cacheSizeGB
    # Default: the maximum of half of physical RAM or 1 gigabyte
    wiredTiger:
        engineConfig:
            cacheSizeGB: 55
    journal:
        enabled: true
processManagement:
    pidFilePath: /var/run/mongodb.pid
    fork: true
systemLog:
    destination: file
    path: "/log/mongod.log"
    logAppend: true
replication:
    replSetName: "****"
net:
    bindIp: *.*.*.*
    port: 27017
setParameter:
   enableLocalhostAuthBypass: false

Log rotation

$ cat <<\EOF | sudo tee /etc/logrotate.d/mongod
/log/mongod.log {
    daily
    missingok
    rotate 4
    compress
    delaycompress
    notifempty
    create 640 mongodb mongodb
    postrotate
        killall -SIGUSR1 mongod
        find /log -type f -regex ".*\.\(log.[0-9].*-[0-9].*\)" -exec rm {} \;
    endscript
}
EOF
$ sudo touch /log/mongod.log
$ sudo chown mongodb:mongodb /log/mongod.log
$ sudo logrotate -dv /etc/logrotate.d/mongod
$ sudo logrotate -fv /etc/logrotate.d/mongod
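
On SIGUSR1, mongod renames its current log to a timestamped file such as mongod.log.2015-11-18T00-00-00, and the find expression in postrotate removes those files while leaving the copies managed by logrotate (mongod.log.1, mongod.log.2.gz) alone. To preview what the expression would delete, the same pattern can be run without the -exec part. This is a sketch; the function name `list_mongod_renamed_logs` is hypothetical.

```shell
# List the files that the postrotate find expression would delete.
# $1 = directory to scan (the article uses /log).
list_mongod_renamed_logs() {
  find "$1" -type f -regex ".*\.\(log.[0-9].*-[0-9].*\)"
}
```

Running `list_mongod_renamed_logs /log` after a forced rotation shows the timestamped files, if any remain.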

Starting and stopping MongoDB

$ sudo service mongod start
$ sudo service mongod stop

Install sysstat

$ sudo apt-get install sysstat
# Check I/O
$ iostat 1

Monitoring with New Relic

Installing New Relic

$ echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | sudo tee /etc/apt/sources.list.d/newrelic.list
$ wget -O- https://download.newrelic.com/548C16BF.gpg | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install newrelic-sysmond
$ sudo nrsysmond-config --set license_key=<your_license_key_here>

Configuring New Relic

$ sudo vi /etc/newrelic/nrsysmond.cfg
proxy=10.0.0.10:3128
disable_docker=true
$ sudo service newrelic-sysmond restart

New Relic proxy setup

The DB is inside an AWS VPC, so it has to reach New Relic through a proxy.

For example, install squid:

10.0.0.10> sudo apt-get install squid
10.0.0.10> sudo vi /etc/squid3/squid.conf
http_access allow all

Access control is handled by the AWS security group.

Installing the MongoDB plugin

$ sudo apt-get install python-pip python-dev
$ sudo pip install newrelic-plugin-agent
$ sudo pip install 'newrelic_plugin_agent[mongodb]'
$ cat <<\EOF | sudo tee /etc/newrelic/mongo.yml
%YAML 1.2
---
Application:
  license_key: ********
  mongodb:
    - name: pri
      host: *.*.*.*
      port: 27017
      databases:
        - database1
        - database2
EOF

Starting the plugin

$ newrelic-plugin-agent -c /etc/newrelic/mongo.yml

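Afterword

3.0.5 is reportedly unstable under load, so an upgrade to 3.0.7 is necessary.
I added the new server to the replica set as a new member and then switched over to it.
Replica set configuration is simple, so this part of the work was relatively easy.
This setup should hold for a while.
Managing things like growing disk usage is the remaining challenge.
I suspect I will need to do various things, such as moving data elsewhere.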
