俺です。
MHAをEC2で稼働させるにあたり、FailoverスクリプトをGoで書きました。
kusocommitとkusourceですがここにあげときました。
MySQL MasterインスタンスのVIPをルーティングするRouteTableを書き換えるスクリプトです。
とりあえずMySQL Masterのmysqldダウン時にRouteTableの設定が新Masterへ変わることが確認できました。
使い方の詳細は後で書きます。
引数判定にcli.goを使いましたが、
MHA on EC2では未使用の引数も書いとかないと行けないのがめんどい(未定義の引数を無視する方法ってあるのかなー)
build方法(後で書く)
go getとかでもできるかな...
goxでビルドして完成したバイナリをMHAマネージャノードへポイーするだけです。
goxすごく便利ですねー。
$ cd $GOPATH/src
$ git clone https://github.com/gamisan9999/mysqlfailgover.git
$ glide in
$ gox --os=linux
mha実行例
- 正常時
[root@ip-172-16-3-23 ~]# masterha_manager --conf=/etc/mha.conf
[root@ip-172-16-3-23 ~]# masterha_manager --conf=/etc/mha.conf
Sat Jan 2 13:32:24 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jan 2 13:32:24 2016 - [info] Reading application default configuration from /etc/mha.conf..
Sat Jan 2 13:32:24 2016 - [info] Reading server configuration from /etc/mha.conf..
Sat Jan 2 13:32:24 2016 - [info] MHA::MasterMonitor version 0.56.
Sat Jan 2 13:32:25 2016 - [info] GTID failover mode = 1
Sat Jan 2 13:32:25 2016 - [info] Dead Servers:
Sat Jan 2 13:32:25 2016 - [info] Alive Servers:
Sat Jan 2 13:32:25 2016 - [info] 172.16.3.207(172.16.3.207:3306)
Sat Jan 2 13:32:25 2016 - [info] 172.16.2.4(172.16.2.4:3306)
Sat Jan 2 13:32:25 2016 - [info] 172.16.3.99(172.16.3.99:3306)
Sat Jan 2 13:32:25 2016 - [info] Alive Slaves:
Sat Jan 2 13:32:25 2016 - [info] 172.16.3.207(172.16.3.207:3306) Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan 2 13:32:25 2016 - [info] GTID ON
Sat Jan 2 13:32:25 2016 - [info] Replicating from 172.16.2.4(172.16.2.4:3306)
Sat Jan 2 13:32:25 2016 - [info] 172.16.3.99(172.16.3.99:3306) Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan 2 13:32:25 2016 - [info] GTID ON
Sat Jan 2 13:32:25 2016 - [info] Replicating from 172.16.2.4(172.16.2.4:3306)
Sat Jan 2 13:32:25 2016 - [info] Current Alive Master: 172.16.2.4(172.16.2.4:3306)
Sat Jan 2 13:32:25 2016 - [info] Checking slave configurations..
Sat Jan 2 13:32:25 2016 - [info] read_only=1 is not set on slave 172.16.3.207(172.16.3.207:3306).
Sat Jan 2 13:32:25 2016 - [info] read_only=1 is not set on slave 172.16.3.99(172.16.3.99:3306).
..省略..
Sat Jan 2 13:32:25 2016 - [info] HealthCheck: SSH to 172.16.2.4 is reachable.
Sat Jan 2 13:32:25 2016 - [info]
172.16.2.4(172.16.2.4:3306) (current master)
+--172.16.3.207(172.16.3.207:3306)
+--172.16.3.99(172.16.3.99:3306)
Sat Jan 2 13:32:25 2016 - [info] Checking master_ip_failover_script status:
Sat Jan 2 13:32:25 2016 - [info] /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --command=status --ssh_user=root --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306
8
orig_master_ip registration route table id: rtb-764d3b13
Sat Jan 2 13:32:26 2016 - [info] OK.
Sat Jan 2 13:32:26 2016 - [warning] shutdown_script is not defined.
Sat Jan 2 13:32:26 2016 - [info] Set master ping interval 3 seconds.
Sat Jan 2 13:32:26 2016 - [warning] secondary_check_script is not defined. It is highly recommended setting it to check master reachability from two or more routes.
Sat Jan 2 13:32:26 2016 - [info] Starting ping health check on 172.16.2.4(172.16.2.4:3306)..
Sat Jan 2 13:32:26 2016 - [info] Ping(SELECT) succeeded, waiting until MySQL doesn't respond..
- Masterのmysqld停止時
Sat Jan 2 13:32:50 2016 - [warning] Got error on MySQL select ping: 2013 (Lost connection to MySQL server during query)
Sat Jan 2 13:32:50 2016 - [info] Executing SSH check script: exit 0
Sat Jan 2 13:32:50 2016 - [info] HealthCheck: SSH to 172.16.2.4 is reachable.
Sat Jan 2 13:32:53 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan 2 13:32:53 2016 - [warning] Connection failed 2 time(s)..
Sat Jan 2 13:32:56 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan 2 13:32:56 2016 - [warning] Connection failed 3 time(s)..
Sat Jan 2 13:32:59 2016 - [warning] Got error on MySQL connect: 2003 (Can't connect to MySQL server on '172.16.2.4' (111))
Sat Jan 2 13:32:59 2016 - [warning] Connection failed 4 time(s)..
Sat Jan 2 13:32:59 2016 - [warning] Master is not reachable from health checker!
Sat Jan 2 13:32:59 2016 - [warning] Master 172.16.2.4(172.16.2.4:3306) is not reachable!
Sat Jan 2 13:32:59 2016 - [warning] SSH is reachable.
Sat Jan 2 13:32:59 2016 - [info] Connecting to a master server failed. Reading configuration file /etc/masterha_default.cnf and /etc/mha.conf again, and trying to connect to all servers to check server status..
Sat Jan 2 13:32:59 2016 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Sat Jan 2 13:32:59 2016 - [info] Reading application default configuration from /etc/mha.conf..
Sat Jan 2 13:32:59 2016 - [info] Reading server configuration from /etc/mha.conf..
Sat Jan 2 13:33:00 2016 - [info] GTID failover mode = 1
Sat Jan 2 13:33:00 2016 - [info] Dead Servers:
Sat Jan 2 13:33:00 2016 - [info] 172.16.2.4(172.16.2.4:3306)
Sat Jan 2 13:33:00 2016 - [info] Alive Servers:
Sat Jan 2 13:33:00 2016 - [info] 172.16.3.207(172.16.3.207:3306)
Sat Jan 2 13:33:00 2016 - [info] 172.16.3.99(172.16.3.99:3306)
Sat Jan 2 13:33:00 2016 - [info] Alive Slaves:
Sat Jan 2 13:33:00 2016 - [info] 172.16.3.207(172.16.3.207:3306) Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Sat Jan 2 13:33:01 2016 - [info] Executing master IP deactivation script:
Sat Jan 2 13:33:01 2016 - [info] /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306 --command=stopssh --ssh_user=root
8
※stopsshは何も実行しない
Sat Jan 2 13:33:01 2016 - [info] done.
Sat Jan 2 13:33:01 2016 - [warning] shutdown_script is not set. Skipping explicit shutting down of the dead master.
Sat Jan 2 13:33:02 2016 - [info] * Phase 2: Dead Master Shutdown Phase completed.
Sat Jan 2 13:33:02 2016 - [info] * Phase 3.3: New Master Recovery Phase..
Sat Jan 2 13:33:02 2016 - [info]
Sat Jan 2 13:33:02 2016 - [info] Waiting all logs to be applied..
Sat Jan 2 13:33:02 2016 - [info] done.
Sat Jan 2 13:33:02 2016 - [info] Getting new master's binlog name and position..
Sat Jan 2 13:33:02 2016 - [info] mysqld-bin.000014:489
Sat Jan 2 13:33:02 2016 - [info] All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='172.16.3.207', MASTER_PORT=3306, MASTER_AUTO_POSITION=1, MASTER_USER='repl', MASTER_PASSWORD='xx
x';
Sat Jan 2 13:33:02 2016 - [info] Master Recovery succeeded. File:Pos:Exec_Gtid_Set: mysqld-bin.000014, 489, 8c410b0f-b114-11e5-903c-0a9edda6c5c3:12,
dee4e0d2-a3d2-11e5-b9ca-0a9edda6c5c3:1-5,
dee52e9c-a3d2-11e5-b9ca-0ae712148d43:1-3
Sat Jan 2 13:33:02 2016 - [info] Executing master IP activate script:
※フェーズ3のfailover start時にMaster用EC2のRouteTableが書き換わる
Sat Jan 2 13:33:02 2016 - [info] /root/mysqlfailgover --mysql_master_vip=192.168.0.10/32 --command=start --ssh_user=root --orig_master_host=172.16.2.4 --orig_master_ip=172.16.2.4 --orig_master_port=3306 --new_master_host=172.16.3.20
7 --new_master_ip=172.16.3.207 --new_master_port=3306 --new_master_user='mha' --new_master_password='mhapassword'
13
mysql master vip: 192.168.0.10/32
master_host: 172.16.2.4
master_host_ip:
172.16.2.4 = i-3072a3bf
172.16.3.207 = i-a574a52a
orig_master_ip registration route table id: rtb-764d3b13
route table id rtb-764d3b13 ,replace destination instance i-3072a3bf to i-a574a52a
route table, rtb-764d3b13 replaced.
Sat Jan 2 13:33:02 2016 - [info] OK.
Sat Jan 2 13:33:02 2016 - [info] ** Finished master recovery successfully.
Sat Jan 2 13:33:02 2016 - [info] * Phase 3: Master Recovery Phase completed.
..省略..
- 切り替わり前後のmysql master host
MHA Managerから接続確認しました
※切替前
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show variables like 'hostname'"
+---------------+---------------+
| Variable_name | Value |
+---------------+---------------+
| hostname | ip-172-16-2-4 |
+---------------+---------------+
※ここでフェイルオーバーが発生して172.16.2.4->172.16.3.207へMasterが切り替わる
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show slave status\G"
[ec2-user@ip-172-16-3-23 ~]$ mysql -u root -h 192.168.0.10 -e "show variables like 'hostname'"
+---------------+-----------------+
| Variable_name | Value |
+---------------+-----------------+
| hostname | ip-172-16-3-207 |
+---------------+-----------------+
mha.conf
定義例
- /etc/mha.conf
[server default]
user=mha
password=mhapassword
manager_workdir=/var/lib/mha
remote_workdir=/var/lib/mha
repl_user=repl
repl_password=repl
ssh_user=root
master_ip_failover_script="/root/mysqlfailgover --mysql_master_vip=192.168.0.10/32"
[server1]
hostname=172.16.3.207
[server2]
hostname=172.16.2.4
[server3]
hostname=172.16.3.99