The server needed to be replaced, so I did this kind of work for the first time in a while.
I think the result is solid enough for production use.
Notes below.
Environment
OS: Ubuntu 14.04 LTS
Instance: m4.4xlarge (16 vCPU, 64 GiB memory), gp2 SSD EBS x4 at [size] GiB each (for RAID 10)
With 300 GiB EBS volumes, that yields 600 GiB of usable space.
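As a sanity check on that number: a RAID 10 array's usable capacity is half the raw total, because every block is mirrored once.

```shell
# RAID 10 usable capacity = (devices x size per device) / 2
# (each block is written twice, once per mirror pair)
DEVICES=4
SIZE_GIB=300
echo "$(( DEVICES * SIZE_GIB / 2 )) GiB usable"   # 600 GiB usable
```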
References
https://docs.mongodb.org/manual/tutorial/install-mongodb-on-ubuntu/
https://docs.mongodb.org/manual/administration/production-checklist/
https://docs.mongodb.org/manual/administration/production-notes/
https://docs.mongodb.org/manual/faq/diagnostics/#faq-keepalive
https://docs.mongodb.org/manual/reference/ulimit/
https://mongodb-documentation.readthedocs.org/en/latest/ecosystem/tutorial/install-mongodb-on-amazon-ec2.html
http://stackoverflow.com/questions/28911634/how-to-avoid-transparent-hugepage-defrag-warning-from-mongodb
Server setup
Time configuration
The server sits in a VPC, so it should reference the NTP server inside the VPC.
$ sudo apt-get install ntp
Configuring tcp_keepalive
Per the MongoDB diagnostics FAQ, lower the 7200-second default to around 300 seconds so idle connections are not silently dropped by intermediate network gear.
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
7200
$ echo 'net.ipv4.tcp_keepalive_time=300' | sudo tee -a /etc/sysctl.conf
$ sudo sysctl -w net.ipv4.tcp_keepalive_time=300
$ cat /proc/sys/net/ipv4/tcp_keepalive_time
300
Configuring ulimit
$ cat <<EOF | sudo tee -a /etc/security/limits.conf
* hard nofile 65536
* soft nofile 65536
* soft nproc 65536
* hard nproc 65536
EOF
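The limits.conf entries are applied by pam_limits at login, so they only take effect for new sessions. After logging in again, both values can be checked (a verification sketch; the numbers will differ on a box without the settings above):

```shell
# Run in a fresh login session; both should report 65536
ulimit -n   # max open files (nofile)
ulimit -u   # max user processes (nproc)
```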
Configuring transparent_hugepage
$ sudo vi /etc/init/mongod.conf
...
...
chown $DAEMONUSER /var/run/mongodb.pid
# Add the following lines immediately below the chown line above
if test -f /sys/kernel/mm/transparent_hugepage/enabled; then
  echo never > /sys/kernel/mm/transparent_hugepage/enabled
fi
if test -f /sys/kernel/mm/transparent_hugepage/defrag; then
  echo never > /sys/kernel/mm/transparent_hugepage/defrag
fi
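After mongod's upstart job has run, the active THP setting can be confirmed; the kernel shows the value in effect in square brackets, and both files should read [never]:

```shell
# The bracketed value is the one currently in effect,
# e.g. "always madvise [never]"
cat /sys/kernel/mm/transparent_hugepage/enabled
cat /sys/kernel/mm/transparent_hugepage/defrag
```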
Configuring RAID 10
$ sudo apt-get install mdadm lvm2
$ sudo mdadm --verbose --create /dev/md0 --level=10 --chunk=256 --raid-devices=4 /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde
$ echo 'DEVICE /dev/xvdb /dev/xvdc /dev/xvdd /dev/xvde' | sudo tee -a /etc/mdadm/mdadm.conf
$ sudo mdadm --detail --scan | sudo tee -a /etc/mdadm/mdadm.conf
$ sudo blockdev --setra 128 /dev/md0
$ sudo blockdev --setra 128 /dev/xvdb
$ sudo blockdev --setra 128 /dev/xvdc
$ sudo blockdev --setra 128 /dev/xvdd
$ sudo blockdev --setra 128 /dev/xvde
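One caveat: blockdev --setra does not survive a reboot. One simple way to make it stick on Ubuntu 14.04 (a sketch, one option among several; a udev rule would also work) is to repeat the commands in /etc/rc.local before its final exit 0:

```shell
# /etc/rc.local (excerpt) -- reapply readahead at boot
blockdev --setra 128 /dev/md0
blockdev --setra 128 /dev/xvdb
blockdev --setra 128 /dev/xvdc
blockdev --setra 128 /dev/xvdd
blockdev --setra 128 /dev/xvde
```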
$ sudo dd if=/dev/zero of=/dev/md0 bs=512 count=1
$ sudo pvcreate /dev/md0
$ sudo vgcreate vg0 /dev/md0
Set up the logical volumes
Allocate 5% each to log and journal.
Judging by the servers already in production, 5% looks like enough.
$ sudo lvcreate -l 90%vg -n data vg0
$ sudo lvcreate -l 5%vg -n log vg0
$ sudo lvcreate -l 5%vg -n journal vg0
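To confirm the allocation, lvs and vgs should show the three volumes at their expected sizes and almost no free space left in the volume group (90% + 5% + 5%):

```shell
# Inspect the logical volumes and remaining free space
sudo lvs vg0    # data / log / journal with their sizes
sudo vgs vg0    # VFree should be (close to) zero
```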
Format and mount the volumes
$ sudo mke2fs -t ext4 -F /dev/vg0/data
$ sudo mke2fs -t ext4 -F /dev/vg0/log
$ sudo mke2fs -t ext4 -F /dev/vg0/journal
$ sudo mkdir /data
$ sudo mkdir /log
$ sudo mkdir /journal
$ echo '/dev/vg0/data /data ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ echo '/dev/vg0/log /log ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ echo '/dev/vg0/journal /journal ext4 defaults,auto,noatime,noexec 0 0' | sudo tee -a /etc/fstab
$ sudo mount /data
$ sudo mount /log
$ sudo mount /journal
$ sudo ln -s /journal /data/journal
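Finally, confirm the mounts, their options, and the journal symlink:

```shell
df -h /data /log /journal            # sizes should match the 90/5/5 split
mount | grep -E '/(data|log|journal) '   # noatime,noexec should appear
ls -ld /data/journal                 # symlink pointing at /journal
```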
Benchmarks
$ sudo fio -filename=/data/test2g -direct=1 -rw=read -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=read, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=64): [RRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRRR] [100.0% done] [42685KB/0KB/0KB /s] [10.7K/0/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=14839: Mon Nov 16 17:53:33 2015
read : io=396392KB, bw=39552KB/s, iops=9888, runt= 10022msec
clat (usec): min=169, max=45337, avg=6461.15, stdev=8144.68
lat (usec): min=169, max=45337, avg=6461.37, stdev=8144.70
clat percentiles (usec):
| 1.00th=[ 187], 5.00th=[ 195], 10.00th=[ 201], 20.00th=[ 231],
| 30.00th=[ 298], 40.00th=[ 980], 50.00th=[ 2608], 60.00th=[ 4640],
| 70.00th=[ 7968], 80.00th=[12608], 90.00th=[21632], 95.00th=[24192],
| 99.00th=[27520], 99.50th=[28032], 99.90th=[32384], 99.95th=[35584],
| 99.99th=[39680]
bw (KB /s): min= 122, max= 5168, per=1.57%, avg=622.89, stdev=670.75
lat (usec) : 250=22.93%, 500=11.81%, 750=4.96%, 1000=0.32%
lat (msec) : 2=7.59%, 4=10.08%, 10=17.55%, 20=12.40%, 50=12.36%
cpu : usr=0.00%, sys=0.25%, ctx=99704, majf=0, minf=1997
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=99098/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=396392KB, aggrb=39552KB/s, minb=39552KB/s, maxb=39552KB/s, mint=10022msec, maxt=10022msec
Disk stats (read/write):
dm-0: ios=98301/69, merge=0/0, ticks=630344/1064, in_queue=631676, util=99.97%, aggrios=99098/72, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=99098/72, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=24081/34, aggrmerge=728/1, aggrticks=155786/262, aggrin_queue=146501, aggrutil=95.07%
xvdb: ios=23968/36, merge=1653/2, ticks=223996/344, in_queue=213304, util=95.07%
xvdc: ios=24994/36, merge=0/2, ticks=90672/136, in_queue=83920, util=88.84%
xvdd: ios=23766/32, merge=1262/0, ticks=251956/432, in_queue=242256, util=89.63%
xvde: ios=23597/32, merge=0/0, ticks=56520/136, in_queue=46524, util=83.43%
$ sudo fio -filename=/data/test2g -direct=1 -rw=write -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=write, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=10): [WWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWWW] [100.0% done] [0KB/6669KB/0KB /s] [0/1667/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=14910: Mon Nov 16 17:54:29 2015
write: io=66740KB, bw=6647.5KB/s, iops=1661, runt= 10040msec
clat (usec): min=454, max=10014K, avg=5983.49, stdev=229869.61
lat (usec): min=454, max=10014K, avg=5983.84, stdev=229869.61
clat percentiles (usec):
| 1.00th=[ 470], 5.00th=[ 486], 10.00th=[ 494], 20.00th=[ 506],
| 30.00th=[ 516], 40.00th=[ 524], 50.00th=[ 540], 60.00th=[ 548],
| 70.00th=[ 572], 80.00th=[ 612], 90.00th=[ 716], 95.00th=[ 892],
| 99.00th=[ 1784], 99.50th=[ 2800], 99.90th=[24192], 99.95th=[8978432],
| 99.99th=[10027008]
bw (KB /s): min= 0, max= 7056, per=23.20%, avg=1542.24, stdev=2828.77
lat (usec) : 500=13.74%, 750=77.66%, 1000=4.90%
lat (msec) : 2=2.94%, 4=0.37%, 10=0.12%, 20=0.07%, 50=0.16%
lat (msec) : >=2000=0.05%
cpu : usr=0.00%, sys=0.13%, ctx=33637, majf=0, minf=1902
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=16685/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=66740KB, aggrb=6647KB/s, minb=6647KB/s, maxb=6647KB/s, mint=10040msec, maxt=10040msec
Disk stats (read/write):
dm-0: ios=0/16518, merge=0/0, ticks=0/9696, in_queue=9696, util=95.25%, aggrios=0/16759, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=0/16759, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=0/8376, aggrmerge=0/2, aggrticks=0/4400, aggrin_queue=4399, aggrutil=46.64%
xvdb: ios=0/8419, merge=0/4, ticks=0/4820, in_queue=4820, util=46.64%
xvdc: ios=0/8419, merge=0/4, ticks=0/4792, in_queue=4788, util=46.40%
xvdd: ios=0/8334, merge=0/0, ticks=0/3732, in_queue=3732, util=36.75%
xvde: ios=0/8334, merge=0/0, ticks=0/4256, in_queue=4256, util=41.24%
$ sudo fio -filename=/data/test2g -direct=1 -rw=randread -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randread, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=64): [rrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrrr] [100.0% done] [47364KB/0KB/0KB /s] [11.9K/0/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=15053: Mon Nov 16 17:56:00 2015
read : io=480728KB, bw=47962KB/s, iops=11990, runt= 10023msec
clat (usec): min=180, max=73261, avg=5328.61, stdev=8309.87
lat (usec): min=180, max=73261, avg=5328.81, stdev=8309.87
clat percentiles (usec):
| 1.00th=[ 205], 5.00th=[ 270], 10.00th=[ 314], 20.00th=[ 354],
| 30.00th=[ 370], 40.00th=[ 398], 50.00th=[ 466], 60.00th=[ 820],
| 70.00th=[ 4704], 80.00th=[11200], 90.00th=[20096], 95.00th=[23680],
| 99.00th=[31104], 99.50th=[34048], 99.90th=[40704], 99.95th=[47360],
| 99.99th=[58112]
bw (KB /s): min= 378, max= 1229, per=1.57%, avg=750.94, stdev=123.63
lat (usec) : 250=4.75%, 500=51.10%, 750=3.81%, 1000=1.11%
lat (msec) : 2=3.14%, 4=4.64%, 10=9.95%, 20=11.31%, 50=10.16%
lat (msec) : 100=0.04%
cpu : usr=0.00%, sys=0.32%, ctx=120351, majf=0, minf=2060
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=120182/w=0/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
READ: io=480728KB, aggrb=47962KB/s, minb=47962KB/s, maxb=47962KB/s, mint=10023msec, maxt=10023msec
Disk stats (read/write):
dm-0: ios=118923/68, merge=0/0, ticks=631564/784, in_queue=632708, util=98.93%, aggrios=120182/69, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=120182/69, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=30044/33, aggrmerge=0/0, aggrticks=159872/247, aggrin_queue=160119, aggrutil=98.34%
xvdb: ios=30276/33, merge=1/0, ticks=269808/436, in_queue=270244, util=98.34%
xvdc: ios=29990/33, merge=2/0, ticks=308140/496, in_queue=308636, util=98.34%
xvdd: ios=30153/34, merge=0/1, ticks=36340/8, in_queue=36348, util=86.04%
xvde: ios=29760/34, merge=0/1, ticks=25200/48, in_queue=25248, util=71.45%
$ sudo fio -filename=/data/test2g -direct=1 -rw=randwrite -bs=4k -size=2G -numjobs=64 -runtime=10 -group_reporting -name=file1
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
...
file1: (g=0): rw=randwrite, bs=4K-4K/4K-4K/4K-4K, ioengine=sync, iodepth=1
fio-2.1.3
Starting 64 processes
Jobs: 64 (f=4): [wwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwwww] [100.0% done] [0KB/6529KB/0KB /s] [0/1632/0 iops] [eta 00m:00s]
file1: (groupid=0, jobs=64): err= 0: pid=15123: Mon Nov 16 17:56:59 2015
write: io=60532KB, bw=6030.3KB/s, iops=1507, runt= 10038msec
clat (usec): min=457, max=9942.4K, avg=2708.25, stdev=139970.30
lat (usec): min=457, max=9942.4K, avg=2708.63, stdev=139970.31
clat percentiles (usec):
| 1.00th=[ 482], 5.00th=[ 498], 10.00th=[ 506], 20.00th=[ 524],
| 30.00th=[ 532], 40.00th=[ 540], 50.00th=[ 556], 60.00th=[ 572],
| 70.00th=[ 604], 80.00th=[ 660], 90.00th=[ 828], 95.00th=[ 1128],
| 99.00th=[ 2608], 99.50th=[ 7904], 99.90th=[29312], 99.95th=[33536],
| 99.99th=[9895936]
bw (KB /s): min= 0, max= 6536, per=23.07%, avg=1390.90, stdev=2551.96
lat (usec) : 500=6.33%, 750=80.74%, 1000=6.48%
lat (msec) : 2=5.09%, 4=0.68%, 10=0.28%, 20=0.15%, 50=0.22%
lat (msec) : 100=0.01%, >=2000=0.02%
cpu : usr=0.00%, sys=0.12%, ctx=30378, majf=0, minf=1901
IO depths : 1=100.0%, 2=0.0%, 4=0.0%, 8=0.0%, 16=0.0%, 32=0.0%, >=64=0.0%
submit : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
complete : 0=0.0%, 4=100.0%, 8=0.0%, 16=0.0%, 32=0.0%, 64=0.0%, >=64=0.0%
issued : total=r=0/w=15133/d=0, short=r=0/w=0/d=0
Run status group 0 (all jobs):
WRITE: io=60532KB, aggrb=6030KB/s, minb=6030KB/s, maxb=6030KB/s, mint=10038msec, maxt=10038msec
Disk stats (read/write):
dm-0: ios=0/14970, merge=0/0, ticks=0/9980, in_queue=9888, util=96.85%, aggrios=0/15206, aggrmerge=0/0, aggrticks=0/0, aggrin_queue=0, aggrutil=0.00%
md0: ios=0/15206, merge=0/0, ticks=0/0, in_queue=0, util=0.00%, aggrios=82/7600, aggrmerge=0/1, aggrticks=21655/4331, aggrin_queue=11324, aggrutil=51.30%
xvdb: ios=86/7599, merge=0/0, ticks=22784/5116, in_queue=12536, util=51.30%
xvdc: ios=76/7599, merge=0/0, ticks=18564/4376, in_queue=10172, util=43.85%
xvdd: ios=84/7602, merge=0/3, ticks=22656/3692, in_queue=11224, util=37.75%
xvde: ios=85/7602, merge=0/3, ticks=22616/4140, in_queue=11364, util=41.45%
Summary: roughly 10,000 IOPS for reads and around 1,600 for writes.
So far, monitoring with iostat shows this is sufficient, so gp2 SSD looks like it will hold up.
MongoDB-related configuration
$ sudo chown mongodb:mongodb /data /log /journal
Sample mongod.conf
Snappy (the WiredTiger default) is the compressor in use, and the config below reflects that.
storage: # default snappy
  dbPath: "/data"
  engine: "wiredTiger"
  directoryPerDB: true
  # storage.wiredTiger.engineConfig.cacheSizeGB
  # Default: the maximum of half of physical RAM or 1 gigabyte
  wiredTiger:
    engineConfig:
      cacheSizeGB: 55
  journal:
    enabled: true
processManagement:
  pidFilePath: /var/run/mongodb.pid
  fork: true
systemLog:
  destination: file
  path: "/log/mongod.log"
  logAppend: true
replication:
  replSetName: "****"
net:
  bindIp: *.*.*.*
  port: 27017
setParameter:
  enableLocalhostAuthBypass: false
Log rotation
(On SIGUSR1, mongod rotates its own log, leaving timestamped mongod.log.<timestamp> files behind; the find in postrotate removes those.)
$ cat <<\EOF | sudo tee /etc/logrotate.d/mongod
/log/mongod.log {
    daily
    missingok
    rotate 4
    compress
    delaycompress
    notifempty
    create 640 mongodb mongodb
    postrotate
        killall -SIGUSR1 mongod
        find /log -type f -regex ".*\.\(log.[0-9].*-[0-9].*\)" -exec rm {} \;
    endscript
}
EOF
$ sudo touch /log/mongod.log
$ sudo chown mongodb:mongodb /log/mongod.log
$ sudo logrotate -dv /etc/logrotate.d/mongod
$ sudo logrotate -fv /etc/logrotate.d/mongod
Starting/stopping MongoDB
$ sudo service mongod start/stop
Install sysstat
$ sudo apt-get install sysstat
# check I/O
$ iostat 1
Monitoring with New Relic
Installing New Relic
$ echo 'deb http://apt.newrelic.com/debian/ newrelic non-free' | sudo tee /etc/apt/sources.list.d/newrelic.list
$ wget -O- https://download.newrelic.com/548C16BF.gpg | sudo apt-key add -
$ sudo apt-get update
$ sudo apt-get install newrelic-sysmond
$ sudo nrsysmond-config --set license_key=<your_license_key_here>
Configuring New Relic
$ sudo vi /etc/newrelic/nrsysmond.cfg
proxy=10.0.0.10:3128
disable_docker=true
$ sudo service newrelic-sysmond restart
Configuring a proxy for New Relic
The DB lives in an AWS VPC, so it has to go through a proxy to reach New Relic.
For example, install squid:
10.0.0.10> sudo apt-get install squid
10.0.0.10> sudo vi /etc/squid3/squid.conf
http_access allow all
Access control is handled with AWS security groups.
Installing the MongoDB plugin
$ sudo apt-get install python-pip python-dev
$ sudo pip install newrelic-plugin-agent
$ sudo pip install newrelic_plugin_agent[mongodb]
$ cat <<\EOF | sudo tee /etc/newrelic/mongo.yml
%YAML 1.2
---
Application:
  license_key: ********
  mongodb:
    - name: pri
      host: *.*.*.*
      port: 27017
      databases:
        - database1
        - database2
EOF
Start the plugin
$ newrelic-plugin-agent -c /etc/newrelic/mongo.yml
Afterword
Apparently 3.0.5 becomes unstable under load, so it needs to be upgraded to 3.0.7.
The replacement was done by adding the new machines to the replica set as members and then retiring the old servers.
Replica set configuration is simple, so that part of the work was relatively painless.
This setup should hold for a while.
The remaining challenge is managing things like disk usage, which will keep growing.
I suspect we'll eventually need to offload data and do a few other things.