3
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?

More than 5 years have passed since last update.

MHA on EC2でフェイルオーバースクリプトを実装した

Last updated at Posted at 2016-01-02

俺です。
MHAをEC2で稼働させるにあたり、FailoverスクリプトをGoで書きました。

kusocommitとkusourceですがここにあげときました。

mysqlfailgover

MySQL MasterインスタンスのVIPをルーティングするRouteTableを書き換えるスクリプトです。

とりあえずMySQL Masterのmysqldダウン時にRouteTableの設定が新Masterへ変わることが確認できました。

使い方の詳細は後で書きます。

引数判定にcli.goを使いましたが、
MHA on EC2では未使用の引数も書いとかないと行けないのがめんどい(未定義の引数を無視する方法ってあるのかなー)

build方法(後で書く)

go getとかでもできるかな...

goxでビルドして完成したバイナリをMHAマネージャノードへポイーするだけです。

goxすごく便利ですねー。

$ cd $GOPATH/src
$ git clone https://github.com/gamisan9999/mysqlfailgover.git
$ glide in
$ gox --os=linux

mha実行例

  • 正常時
[root@ip-172-16-3-23 ~]# masterha_manager --conf=/etc/mha.conf
[root@ip-172-16-3-23 ~]# masterha_manager --conf=/etc/mha.conf
Sat Jan  2 13:32:24 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jan  2 13:32:24 2016 - [info] Reading application default configuration from /etc/mha.conf..
Sat Jan  2 13:32:24 2016 - [info] Reading server configuration from /etc/mha.conf..
Sat Jan  2 13:32:24 2016 - [info] MHA::MasterMonitor version 0.56.
Sat Jan  2 13:32:25 2016 - [info] GTID failover mode = 1
Sat Jan  2 13:32:25 2016 - [info] Dead Servers:
Sat Jan  2 13:32:25 2016 - [info] Alive Servers:
Sat Jan  2 13:32:25 2016 - [info]   172.16.3.207(172.16.3.207:3306)
Sat Jan  2 13:32:25 2016 - [info]   172.16.2.4(172.16.2.4:3306)
Sat Jan  2 13:32:25 2016 - [info]   172.16.3.99(172.16.3.99:3306)
Sat Jan  2 13:32:25 2016 - [info] Alive Slaves:
Sat Jan  2 13:32:25 2016 - [info]   172.16.3.207(172.16.3.207:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan  2 13:32:25 2016 - [info]     GTID ON
Sat Jan  2 13:32:25 2016 - [info]     Replicating from 172.16.2.4(172.16.2.4:3306)
Sat Jan  2 13:32:25 2016 - [info]   172.16.3.99(172.16.3.99:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan  2 13:32:25 2016 - [info]     GTID ON
Sat Jan  2 13:32:25 2016 - [info]     Replicating from 172.16.2.4(172.16.2.4:3306)
Sat Jan  2 13:32:25 2016 - [info] Current Alive Master: 172.16.2.4(172.16.2.4:3306)
Sat Jan  2 13:32:25 2016 - [info] Checking slave configurations..
Sat Jan  2 13:32:25 2016 - [info]  read_only=1 is not set on slave 172.16.3.207(172.16.3.207:3306).
Sat Jan  2 13:32:25 2016 - [info]  read_only=1 is not set on slave 172.16.3.99(172.16.3.99:3306).
..省略..
Sat Jan  2 13:32:25 2016 - [info] HealthCheck: SSH to 172.16.2.4 is reachable.
Sat Jan  2 13:32:25 2016 - [info]
172.16.2.4(172.16.2.4:3306) (current master)
 +--172.16.3.207(172.16.3.207:3306)
 +--172.16.3.99(172.16.3.99:3306)

Sat Jan  2 13:32:25 2016 - [info] Checking master_ip_failover_script status:
Sat Jan  2 13:32:25 2016 - [info]   /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --command=status --ssh_user=root --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306
8
orig_master_ip registration route table id:  rtb-764d3b13
Sat Jan  2 13:32:26 2016 - [info]  OK.
Sat Jan  2 13:32:26 2016 - [warning] shutdown_script is not defined.
Sat Jan  2 13:32:26 2016 - [info] Set master ping interval 3 seconds.
Sat Jan  2 13:32:26 2016 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sat Jan  2 13:32:26 2016 - [info] Starting ping health check on 172.16.2.4(172.16.2.4:3306)..
Sat Jan  2 13:32:26 2016 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
  • Masterのmysqld停止時
Sat Jan  2 13:32:50 2016 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Sat Jan  2 13:32:50 2016 - [info] Executing SSH check script: exit 0
Sat Jan  2 13:32:50 2016 - [info] HealthCheck: SSH to 172.16.2.4 is reachable.
Sat Jan  2 13:32:53 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan  2 13:32:53 2016 - [warning] Connection failed 2 time(s)..
Sat Jan  2 13:32:56 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan  2 13:32:56 2016 - [warning] Connection failed 3 time(s)..
Sat Jan  2 13:32:59 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan  2 13:32:59 2016 - [warning] Connection failed 4 time(s)..
Sat Jan  2 13:32:59 2016 - [warning] Master is not reachable from health checker!
Sat Jan  2 13:32:59 2016 - [warning] Master 172.16.2.4(172.16.2.4:3306) is not reachable!
Sat Jan  2 13:32:59 2016 - [warning] SSH is reachable.
Sat Jan  2 13:32:59 2016 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha.conf again, and trying to connect to all servers to check server status..
Sat Jan  2 13:32:59 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jan  2 13:32:59 2016 - [info] Reading application default configuration from /etc/mha.conf..
Sat Jan  2 13:32:59 2016 - [info] Reading server configuration from /etc/mha.conf..
Sat Jan  2 13:33:00 2016 - [info] GTID failover mode = 1
Sat Jan  2 13:33:00 2016 - [info] Dead Servers:
Sat Jan  2 13:33:00 2016 - [info]   172.16.2.4(172.16.2.4:3306)
Sat Jan  2 13:33:00 2016 - [info] Alive Servers:
Sat Jan  2 13:33:00 2016 - [info]   172.16.3.207(172.16.3.207:3306)
Sat Jan  2 13:33:00 2016 - [info]   172.16.3.99(172.16.3.99:3306)
Sat Jan  2 13:33:00 2016 - [info] Alive Slaves:
Sat Jan  2 13:33:00 2016 - [info]   172.16.3.207(172.16.3.207:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan  2 13:33:01 2016 - [info] Executing master IP deactivation script:
Sat Jan  2 13:33:01 2016 - [info]   /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306 --command=stopssh --ssh_user=root
8
※stopsshは何も実行しない
Sat Jan  2 13:33:01 2016 - [info]  done.
Sat Jan  2 13:33:01 2016 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Jan  2 13:33:02 2016 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Jan  2 13:33:02 2016 - [info] * Phase 3.3: New Master Recovery Phase..
Sat Jan  2 13:33:02 2016 - [info]
Sat Jan  2 13:33:02 2016 - [info]  Waiting all logs to be applied..
Sat Jan  2 13:33:02 2016 - [info]   done.
Sat Jan  2 13:33:02 2016 - [info] Getting new master's binlog name and position..
Sat Jan  2 13:33:02 2016 - [info]  mysqld-bin.000014:489
Sat Jan  2 13:33:02 2016 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.3.207', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xx
x';
Sat Jan  2 13:33:02 2016 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysqld-bin.000014, 489, 8c410b0f-b114-11e5-903c-0a9edda6c5c3:12,
dee4e0d2-a3d2-11e5-b9ca-0a9edda6c5c3:1-5,
dee52e9c-a3d2-11e5-b9ca-0ae712148d43:1-3
Sat Jan  2 13:33:02 2016 - [info] Executing master IP activate script:
※フェーズ3のfailover start時にMaster用EC2のRouteTableが書き換わる
Sat Jan  2 13:33:02 2016 - [info]   /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --command=start --ssh_user=root --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306 --new_master_host=172.16.3.20
7 --new_master_ip=172.16.3.207 --new_master_port=3306 --new_master_user='mha' --new_master_password='mhapassword'
13
mysql master vip: 192.168.0.10/32
master_host: 172.16.2.4
master_host_ip:
172.16.2.4 = i-3072a3bf
172.16.3.207 = i-a574a52a
orig_master_ip registration route table id:  rtb-764d3b13
route table id rtb-764d3b13 ,replace destination instance  i-3072a3bf to i-a574a52a
route table, rtb-764d3b13 replaced.
Sat Jan  2 13:33:02 2016 - [info]  OK.
Sat Jan  2 13:33:02 2016 - [info] ** Finished master recovery successfully.
Sat Jan  2 13:33:02 2016 - [info] * Phase 3: Master Recovery Phase completed.
..省略..
  • 切り替わり前後のmysql master host

MHA Managerから接続確認しました

※切替前
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show variables like 'hostname'"
+---------------+---------------+
| Variable_name | Value         |
+---------------+---------------+
| hostname      | ip-172-16-2-4 |
+---------------+---------------+
※ここでフェイルオーバーが発生して172.16.2.4->172.16.3.207へMasterが切り替わる
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show slave status\G"
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show variables like 'hostname'"
+---------------+-----------------+
| Variable_name | Value           |
+---------------+-----------------+
| hostname      | ip-172-16-3-207 |
+---------------+-----------------+

mha.conf

定義例

  • /etc/mha.conf
[server default]
user=mha
password=mhapassword
manager_workdir=/var/lib/mha
remote_workdir=/var/lib/mha
repl_user=repl
repl_password=repl
ssh_user=root
master_ip_failover_script="/root/mysqlfailgover --mysql_master_vip=192.168.0.10/32"


[server1]
hostname=172.16.3.207
[server2]
hostname=172.16.2.4
[server3]
hostname=172.16.3.99
3
3
0

Register as a new user and use Qiita more conveniently

  1. You get articles that match your needs
  2. You can efficiently read back useful information
  3. You can use dark theme
What you can do with signing up
3
3

Delete article

Deleted articles cannot be recovered.

Draft of this article would be also deleted.

Are you sure you want to delete this article?