Deploying MHA (CentOS 7.3 64-bit + MySQL 5.6.27)

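What is MHA?

MHA (Master High Availability) is a MySQL failover solution written in Perl by a Japanese MySQL expert to keep databases highly available. It can complete a master failover within roughly 0-30 seconds. During failover MHA reconciles the slaves so that they stay consistent with each other, and it recovers as much of the failed master's data as it can.

MHA has two roles: the node and the manager. A deployment needs at least three database servers in a one-master, multiple-slave layout: one master, one standby master, and one ordinary slave. This walkthrough uses four machines (one manager plus three database servers). Note that once the master goes down, the standby takes over as master, and the old master does not become master again when it comes back online, because switching back would break data consistency.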

192.168.10.94    manager    management node
192.168.10.91    master     master
192.168.10.92    slave01    slave + standby master
192.168.10.93    slave02    slave

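I. Environment initialization

1. Deploy MySQL on master, slave01, and slave02 (see the internal MySQL 5.6.27 installation standard, 《【MySQL5.6.27安裝規范】-【運維部】-張偉科》).

2. Set the hostnames

The commands below use the /etc/sysconfig/network convention. On CentOS 7 the persistent hostname is stored in /etc/hostname, so `hostnamectl set-hostname <name>` is the equivalent there.

On manager:

# sed -i 's/HOSTNAME=.*/HOSTNAME=manager/g' /etc/sysconfig/network && hostname manager

On master:

# sed -i 's/HOSTNAME=.*/HOSTNAME=master/g' /etc/sysconfig/network && hostname master

On slave01:

# sed -i 's/HOSTNAME=.*/HOSTNAME=slave01/g' /etc/sysconfig/network && hostname slave01

On slave02:

# sed -i 's/HOSTNAME=.*/HOSTNAME=slave02/g' /etc/sysconfig/network && hostname slave02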

3. Hostname resolution

Run the following on manager, then copy the resulting /etc/hosts to the other three hosts:

[root@manager ~]# cat >> /etc/hosts << EOF
192.168.10.94      manager
192.168.10.91      master
192.168.10.92      slave01
192.168.10.93      slave02
EOF
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@master:/etc/
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@slave01:/etc/
[root@manager ~]# scp -o StrictHostKeyChecking=no /etc/hosts root@slave02:/etc/

4. Passwordless SSH login

Every host generates a key pair and copies its public key to the other three hosts.

On manager:

[root@manager ~]# ssh-keygen -t rsa
[root@manager ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
[root@manager ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave01
[root@manager ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave02

On master:

[root@master ~]# ssh-keygen -t rsa
[root@master ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
[root@master ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave01
[root@master ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave02

On slave01:

[root@slave01 ~]# ssh-keygen -t rsa
[root@slave01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
[root@slave01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
[root@slave01 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave02

On slave02:

[root@slave02 ~]# ssh-keygen -t rsa
[root@slave02 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@manager
[root@slave02 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@master
[root@slave02 ~]# ssh-copy-id -i ~/.ssh/id_rsa.pub root@slave01
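
The key distribution follows one pattern: each host pushes its public key to every other host. A minimal sketch of that pattern, assuming the four hostnames used in this walkthrough. `peer_copy_commands` is a hypothetical helper, not part of MHA, and it only prints the commands so they can be reviewed before being run.

```shell
#!/bin/sh
# Hostnames taken from this walkthrough; adjust for other clusters.
HOSTS="manager master slave01 slave02"

# Print the ssh-copy-id commands that host $1 has to run (one per peer).
peer_copy_commands() {
    self="$1"
    for peer in $HOSTS; do
        if [ "$peer" != "$self" ]; then
            echo "ssh-copy-id -i ~/.ssh/id_rsa.pub root@$peer"
        fi
    done
}

# Example: show what has to be run on manager.
peer_copy_commands manager
```

Printing instead of executing keeps the sketch safe to run anywhere; piping the output to `sh` on the host in question would execute it.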

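II. Setting up MySQL

1. Configure master-slave replication between master, slave01, and slave02

For replication in MySQL 5.6 the master must enable two key options, server-id and log-bin. server-id must be unique across the whole topology and cannot be reused by another host; here the last octet of each host's IP address serves as its server-id. The slaves must enable relay-log. Restart mysql after changing the configuration.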
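
The last-octet convention can be expressed as a one-line shell function. This is only a sketch: `server_id_from_ip` is a hypothetical name, and the resulting ids are unique only while all hosts share one /24 network, as they do here.

```shell
#!/bin/sh
# Derive a server-id from an IPv4 address by keeping the last octet.
server_id_from_ip() {
    # ${1##*.} strips everything up to and including the last dot.
    echo "${1##*.}"
}

# The three database hosts from this walkthrough.
for ip in 192.168.10.91 192.168.10.92 192.168.10.93; do
    echo "$ip server-id=$(server_id_from_ip "$ip")"
done
```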

On master:

[root@master ~]# egrep "server[-_]id|log[-_]bin|relay[-_]log[-_]purge|read[-_]only" /app/mysql/my.cnf
server-id = 91
log-bin = mysql_bin
relay-log-purge=0
[root@master ~]# /sbin/iptables -I INPUT -p tcp --dport 3306 -j ACCEPT
[root@master ~]# /etc/rc.d/init.d/iptables save

On slave01:

[root@slave01 ~]# egrep "server[-_]id|log[-_]bin|relay[-_]log[-_]purge|read[-_]only" /app/mysql/my.cnf
server-id = 92
log-bin = mysql_bin
relay-log-purge=0
[root@slave01 ~]# /sbin/iptables -I INPUT -p tcp --dport 3306 -j ACCEPT
[root@slave01 ~]# /etc/rc.d/init.d/iptables save

On slave02:

[root@slave02 ~]# egrep "server[-_]id|log[-_]bin|relay[-_]log[-_]purge|read[-_]only" /app/mysql/my.cnf
server-id = 93
log-bin = mysql_bin
read-only=1
relay-log-purge=0
[root@slave02 ~]# /sbin/iptables -I INPUT -p tcp --dport 3306 -j ACCEPT
[root@slave02 ~]# /etc/rc.d/init.d/iptables save

2. Create the replication account on master and slave01. slave01 is the standby master, so it needs the same grant.

[root@master ~]# mysql -e "grant all privileges on *.* to 'rep'@'%' identified by '20151012';flush privileges"
[root@slave01 ~]# mysql -e "grant all privileges on *.* to 'rep'@'%' identified by '20151012';flush privileges"

3. On master, check the master status

[root@master ~]# mysql -e 'show master status;'
+------------------+----------+--------------+------------------+-------------------+
| File             | Position | Binlog_Do_DB | Binlog_Ignore_DB | Executed_Gtid_Set |
+------------------+----------+--------------+------------------+-------------------+
| mysql_bin.000001 |      120 |              |                  |                   |
+------------------+----------+--------------+------------------+-------------------+

4. Start replication on slave01 and slave02 (slave01 is shown; slave02 is identical). MASTER_LOG_FILE and MASTER_LOG_POS take the File and Position values reported by `show master status` on the master.

[root@slave01 ~]# mysql
mysql> CHANGE MASTER TO
    MASTER_HOST='192.168.10.91',
    MASTER_USER='rep',
    MASTER_PASSWORD='20151012',
    MASTER_PORT=3306,
    MASTER_LOG_FILE='mysql_bin.000001',
    MASTER_LOG_POS=120;
Query OK, 0 rows affected, 2 warnings (0.02 sec)

mysql> start slave;

mysql> show slave status\G

*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.91
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql_bin.000001
          Read_Master_Log_Pos: 120
               Relay_Log_File: slave01-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql_bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 120
              Relay_Log_Space: 458
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 91
                  Master_UUID: 9b1eb4a5-00f3-11e8-a3ba-ce006127c972
             Master_Info_File: /app/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
1 row in set (0.00 sec)

Replication is now configured. Next we set up MHA.

III. Setting up MHA

1. Create the account that MHA uses for management. The account must exist on every database server; master is shown here as the example.

[root@master ~]# mysql
mysql> grant all privileges on *.* to 'mha_rep'@'%' identified by '20151012'; flush privileges;
mysql> select user,host,password from mysql.user where user='mha_rep';
+---------+------+-------------------------------------------+
| user    | host | password                                  |
+---------+------+-------------------------------------------+
| mha_rep | %    | *F9B93BD42F62D26FD094239C15535EE045F7BB22 |
+---------+------+-------------------------------------------+

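2. Install the mha4mysql-node package on the three database hosts (master, slave01, and slave02). master is shown as the example; the other hosts are identical.

Strictly speaking, every host needs mha4mysql-node, the manager included. Because the manager needs both the node and the manager packages, its installation is covered separately in the next step.

[root@master ~]# yum install perl-DBD-MySQL -y
[root@master ~]# cd /usr/local/src/
[root@master src]# wget https://downloads.mariadb.com/files/MHA/mha4mysql-node-0.54-0.el6.noarch.rpm
[root@master src]# rpm -ivh mha4mysql-node-0.54-0.el6.noarch.rpm

The node package provides the following tools:

/usr/bin/apply_diff_relay_logs    // identifies differing relay log events and applies them to the other slaves
/usr/bin/filter_mysqlbinlog    // strips unnecessary ROLLBACK events (MHA no longer uses this tool)
/usr/bin/purge_relay_logs    // purges relay logs (without blocking the SQL thread)
/usr/bin/save_binary_logs    // saves and copies the master's binary logs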

3. Install the mha4mysql-manager and mha4mysql-node packages on manager

[root@manager ~]# yum install perl cpan perl-DBD-MySQL perl-Config-Tiny perl-Log-Dispatch perl-Parallel-ForkManager perl-Net-Telnet -y

Note: not all of these packages are available from the yum repositories, so the missing ones have to be downloaded and installed separately.

[root@manager ~]# cd /usr/local/src/
[root@manager src]# wget http://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Log-Dispatch-2.26-1.el6.rf.noarch.rpm
[root@manager src]# wget ftp://rpmfind.net/linux/dag/redhat/el6/en/x86_64/dag/RPMS/perl-Parallel-ForkManager-0.7.5-2.2.el6.rf.noarch.rpm
[root@manager src]# yum localinstall *.rpm -y

Install the manager and node packages:

[root@manager ~]# cd /usr/local/src/
[root@manager src]# wget https://downloads.mariadb.com/MHA/mha4mysql-manager-0.55-0.el6.noarch.rpm
[root@manager src]# wget https://downloads.mariadb.com/MHA/mha4mysql-node-0.54-0.el6.noarch.rpm
[root@manager src]# yum localinstall mha4mysql-node-0.54-0.el6.noarch.rpm -y
[root@manager src]# yum localinstall mha4mysql-manager-0.55-0.el6.noarch.rpm -y

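4. List the tools installed by mha4mysql-manager

[root@manager ~]# rpm -ql mha4mysql-manager |grep bin
/usr/bin/masterha_check_repl    // checks the MySQL replication setup
/usr/bin/masterha_check_ssh    // checks the SSH configuration MHA relies on
/usr/bin/masterha_check_status    // reports the current MHA running state
/usr/bin/masterha_conf_host    // adds or removes server entries in the configuration
/usr/bin/masterha_manager    // starts MHA
/usr/bin/masterha_master_monitor    // detects whether the master is down
/usr/bin/masterha_master_switch    // controls failover (automatic or manual)
/usr/bin/masterha_secondary_check    // double-checks master reachability from other hosts
/usr/bin/masterha_stop    // stops MHA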

5. Edit the SSH port used by the /usr/bin/masterha_secondary_check script (the hosts in this setup listen for SSH on port 60022, matching ssh_port in mha.cnf)

6. Download the mha4mysql-manager source package on manager

# wget https://downloads.mariadb.com/MHA/mha4mysql-manager-0.56.tar.gz

7. On manager, extract the sample MHA configuration file and scripts from the source package

[root@manager ~]# tar xf mha4mysql-manager-0.56.tar.gz
[root@manager ~]# mkdir -p /app/mha/scripts
[root@manager ~]# cp mha4mysql-manager-0.56/samples/scripts/* /app/mha/scripts/
[root@manager ~]# cp mha4mysql-manager-0.56/samples/conf/app1.cnf /app/mha/mha.cnf
[root@manager ~]# tree /app/mha/
/app/mha/
├── mha.cnf
└── scripts
    ├── master_ip_failover    // manages the VIP during automatic failover; optional. With keepalived you can write your own script instead, for example one that monitors mysql and stops keepalived when mysql fails so that the VIP moves by itself
    ├── master_ip_online_change    // manages the VIP during an online switchover; optional, a simple shell script of your own works too
    ├── power_manager    // powers off the failed host after a failure; optional
    └── send_report    // sends an alert after a failover; optional, a simple shell script of your own works too

8. Edit the MHA configuration file on manager as follows

[root@manager ~]# cat /app/mha/mha.cnf
[server default]
# monitoring user
user=mha_rep
# password of the monitoring user
password=20151012
# SSH login user
ssh_user=root
# replication user
repl_user=rep
# password of the replication user
repl_password=20151012
# interval in seconds between pings to the master (default 3); failover starts automatically after three pings go unanswered
ping_interval=1
# working directory and log file of the manager
manager_workdir=/app/mha
manager_log=/app/mha/manager.log

# monitor mysql
# if the manager's own connection to the master fails, MHA Manager also tries to reach the master from the hosts listed with -s
secondary_check_script= masterha_secondary_check -s 192.168.10.91 -s 192.168.10.92 -s 192.168.10.93
# script that sends an alert after a switchover
report_script= /app/mha/scripts/send_report
# script used for a manual (online) switchover (the sample script is flawed and has to be adapted)
master_ip_online_change_script= /app/mha/scripts/master_ip_online_change
# script used for automatic failover (the sample script is flawed and has to be adapted); left commented out until the VIP is set up later
#master_ip_failover_script=/app/mha/scripts/master_ip_failover
# script that powers off the failed host to prevent split brain (not used here)
#shutdown_script= /app/mha/scripts/power_manager
#check_repl_delay=0

[server1]
hostname=master
ssh_port=60022
candidate_master=1
check_repl_delay=0
master_binlog_dir=/app/mysql/data

[server2]
hostname=slave01
ssh_port=60022
candidate_master=1
check_repl_delay=0
master_binlog_dir=/app/mysql/data

[server3]
hostname=slave02
ssh_port=60022
no_master=1
master_binlog_dir=/app/mysql/data
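
In the [serverN] sections, candidate_master=1 marks the hosts that may be promoted and no_master=1 excludes a host from promotion. A rough sketch for listing the promotable hosts from a config file in this layout; `candidate_masters` is a hypothetical helper, and it assumes each section lists hostname before candidate_master, as the file above does.

```shell
#!/bin/sh
# Print the hostname of every section that sets candidate_master=1.
# $1 = path to an MHA config file laid out like /app/mha/mha.cnf above.
candidate_masters() {
    awk -F= '
        /^\[/            { host = "" }     # a new section starts: forget the previous host
        $1 == "hostname" { host = $2 }     # remember the host of the current section
        $1 == "candidate_master" && $2 == 1 && host != "" { print host }
    ' "$1"
}

# Usage (on manager): candidate_masters /app/mha/mha.cnf
```

With the configuration above this would list master and slave01, which matches the failover behaviour seen in the experiments later on.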

9. Check SSH connectivity

[root@manager ~]# masterha_check_ssh --conf=/app/mha/mha.cnf
Thu Jan 25 11:50:58 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 25 11:50:58 2018 - [info] Reading application default configurations from /app/mha/mha.cnf..
Thu Jan 25 11:50:58 2018 - [info] Reading server configurations from /app/mha/mha.cnf..
Thu Jan 25 11:50:58 2018 - [info] Starting SSH connection tests..
Thu Jan 25 11:51:01 2018 - [debug]
Thu Jan 25 11:50:59 2018 - [debug]  Connecting via SSH from root@slave02(192.168.10.93:60022) to root@master(192.168.10.91:60022)..
Warning: Permanently added '[192.168.10.93]:60022' (ECDSA) to the list of known hosts.
Warning: Permanently added '[192.168.10.91]:60022' (ECDSA) to the list of known hosts.
Thu Jan 25 11:51:00 2018 - [debug]   ok.
Thu Jan 25 11:51:00 2018 - [debug]  Connecting via SSH from root@slave02(192.168.10.93:60022) to root@slave01(192.168.10.92:60022)..
Thu Jan 25 11:51:01 2018 - [debug]   ok.
Thu Jan 25 11:51:01 2018 - [debug]
Thu Jan 25 11:50:59 2018 - [debug]  Connecting via SSH from root@slave01(192.168.10.92:60022) to root@master(192.168.10.91:60022)..
Warning: Permanently added '[192.168.10.91]:60022' (ECDSA) to the list of known hosts.
Thu Jan 25 11:51:00 2018 - [debug]   ok.
Thu Jan 25 11:51:00 2018 - [debug]  Connecting via SSH from root@slave01(192.168.10.92:60022) to root@slave02(192.168.10.93:60022)..
Warning: Permanently added '[192.168.10.93]:60022' (ECDSA) to the list of known hosts.
Thu Jan 25 11:51:00 2018 - [debug]   ok.
Thu Jan 25 11:51:01 2018 - [debug]
Thu Jan 25 11:50:58 2018 - [debug]  Connecting via SSH from root@master(192.168.10.91:60022) to root@slave01(192.168.10.92:60022)..
Warning: Permanently added '[192.168.10.91]:60022' (ECDSA) to the list of known hosts.
Warning: Permanently added '[192.168.10.92]:60022' (ECDSA) to the list of known hosts.
Thu Jan 25 11:51:00 2018 - [debug]   ok.
Thu Jan 25 11:51:00 2018 - [debug]  Connecting via SSH from root@master(192.168.10.91:60022) to root@slave02(192.168.10.93:60022)..
Warning: Permanently added '[192.168.10.93]:60022' (ECDSA) to the list of known hosts.
Thu Jan 25 11:51:00 2018 - [debug]   ok.
Thu Jan 25 11:51:01 2018 - [info] All SSH connection tests passed successfully.

Output like the above means that SSH trust between the hosts works.

10. Check that replication is healthy

Because MySQL was compiled from source in this setup, the replication check can fail with "not found" errors for the MySQL binaries, for example:

(1) Can't exec "mysqlbinlog": No such file or directory at /usr/local/perl5/MHA/BinlogManager.pm line 99.

Fix: run the following on master, slave01, and slave02:

# ln -s /app/mysql/bin/mysqlbinlog /usr/bin/mysqlbinlog

(2) mysqlbinlog: unknown variable 'default-character-set=utf8'

Fix: on master, slave01, and slave02, comment out the default-character-set=utf8 option in the [client] section of my.cnf and restart the mysqld service.

(3) Testing mysql connection and privileges..sh: mysql: command not found

Fix: run the following on master, slave01, and slave02:

# ln -s /app/mysql/bin/mysql /usr/bin/mysql
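
Errors (1) and (3) both come down to a client binary missing from the PATH that MHA sees over SSH. A small sketch for checking this up front on each database host; `missing_tools` is a hypothetical helper name, and it relies only on the standard `command -v` builtin.

```shell
#!/bin/sh
# Print every tool from the argument list that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Usage on a database host: any name printed here needs a symlink as shown above.
# missing_tools mysql mysqlbinlog
```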

[root@manager ~]# masterha_check_repl --conf=/app/mha/mha.cnf
Thu Jan 25 12:11:16 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 25 12:11:16 2018 - [info] Reading application default configurations from /app/mha/mha.cnf..
Thu Jan 25 12:11:16 2018 - [info] Reading server configurations from /app/mha/mha.cnf..
Thu Jan 25 12:11:16 2018 - [info] MHA::MasterMonitor version 0.55.
Thu Jan 25 12:11:17 2018 - [info] Dead Servers:
Thu Jan 25 12:11:17 2018 - [info] Alive Servers:
Thu Jan 25 12:11:17 2018 - [info]   master(192.168.10.91:3306)
Thu Jan 25 12:11:17 2018 - [info]   slave01(192.168.10.92:3306)
Thu Jan 25 12:11:17 2018 - [info]   slave02(192.168.10.93:3306)
Thu Jan 25 12:11:17 2018 - [info] Alive Slaves:
Thu Jan 25 12:11:17 2018 - [info]   slave01(192.168.10.92:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Thu Jan 25 12:11:17 2018 - [info]     Replicating from 192.168.10.91(192.168.10.91:3306)
Thu Jan 25 12:11:17 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Jan 25 12:11:17 2018 - [info]   slave02(192.168.10.93:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Thu Jan 25 12:11:17 2018 - [info]     Replicating from 192.168.10.91(192.168.10.91:3306)
Thu Jan 25 12:11:17 2018 - [info]     Not candidate for the new Master (no_master is set)
Thu Jan 25 12:11:17 2018 - [info] Current Alive Master: master(192.168.10.91:3306)
Thu Jan 25 12:11:17 2018 - [info] Checking slave configurations..
Thu Jan 25 12:11:17 2018 - [info]  read_only=1 is not set on slave slave01(192.168.10.92:3306).
Thu Jan 25 12:11:17 2018 - [info] Checking replication filtering settings..
Thu Jan 25 12:11:17 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Thu Jan 25 12:11:17 2018 - [info]  Replication filtering check ok.
Thu Jan 25 12:11:17 2018 - [info] Starting SSH connection tests..
Thu Jan 25 12:11:20 2018 - [info] All SSH connection tests passed successfully.
Thu Jan 25 12:11:20 2018 - [info] Checking MHA Node version..
Thu Jan 25 12:11:20 2018 - [info]  Version check ok.
Thu Jan 25 12:11:20 2018 - [info] Checking SSH publickey authentication settings on the current master..
Thu Jan 25 12:11:21 2018 - [info] HealthCheck: SSH to master is reachable.
Thu Jan 25 12:11:21 2018 - [info] Master MHA Node version is 0.54.
Thu Jan 25 12:11:21 2018 - [info] Checking recovery script configurations on the current master..
Thu Jan 25 12:11:21 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/app/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql_bin.000001
Thu Jan 25 12:11:21 2018 - [info]   Connecting to root@master(master)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /app/mysql/data, up to mysql_bin.000001
Thu Jan 25 12:11:22 2018 - [info] Master setting check done.
Thu Jan 25 12:11:22 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Jan 25 12:11:22 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha_rep' --slave_host=slave01 --slave_ip=192.168.10.92 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.27-log --manager_version=0.55 --relay_log_info=/app/mysql/data/relay-log.info  --relay_dir=/app/mysql/data/  --slave_pass=xxx
Thu Jan 25 12:11:22 2018 - [info]   Connecting to root@192.168.10.92(slave01:60022)..
  Checking slave recovery environment settings..
    Opening /app/mysql/data/relay-log.info ... ok.
    Relay log found at /app/mysql/data, up to slave01-relay-bin.000002
    Temporary relay log file is /app/mysql/data/slave01-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Jan 25 12:11:22 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha_rep' --slave_host=slave02 --slave_ip=192.168.10.93 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.27-log --manager_version=0.55 --relay_log_info=/app/mysql/data/relay-log.info  --relay_dir=/app/mysql/data/  --slave_pass=xxx
Thu Jan 25 12:11:22 2018 - [info]   Connecting to root@192.168.10.93(slave02:60022)..
  Checking slave recovery environment settings..
    Opening /app/mysql/data/relay-log.info ... ok.
    Relay log found at /app/mysql/data, up to slave02-relay-bin.000002
    Temporary relay log file is /app/mysql/data/slave02-relay-bin.000002
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Jan 25 12:11:22 2018 - [info] Slaves settings check done.
Thu Jan 25 12:11:22 2018 - [info]
master (current master)
 +--slave01
 +--slave02
Thu Jan 25 12:11:22 2018 - [info] Checking replication health on slave01..
Thu Jan 25 12:11:22 2018 - [info]  ok.
Thu Jan 25 12:11:22 2018 - [info] Checking replication health on slave02..
Thu Jan 25 12:11:22 2018 - [info]  ok.
Thu Jan 25 12:11:22 2018 - [warning] master_ip_failover_script is not defined.
Thu Jan 25 12:11:22 2018 - [warning] shutdown_script is not defined.
Thu Jan 25 12:11:22 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

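The output also contains a few warnings, for example:

Tue Sep 15 23:45:35 2015 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.

This warning appears on the first line of the output. masterha_default.cnf is MHA's global default configuration file. Since no global configuration is used here, MHA skips it, and the rest of the setup is unaffected. If you do want a global file, the source package ships a template for it; with small adjustments it makes this warning go away.

Tue Sep 15 23:45:38 2015 - [warning] master_ip_failover_script is not defined.
Tue Sep 15 23:45:38 2015 - [warning] shutdown_script is not defined.

These warnings appear in the last lines of the output and report undefined options. They correspond to the two options that are commented out in /app/mha/mha.cnf. master_ip_failover_script is only needed later, when the VIP is set up.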

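IV. MHA failover experiments

1. Before every MHA experiment, run the following checks first

[root@manager ~]# masterha_check_ssh --conf=/app/mha/mha.cnf
[root@manager ~]# masterha_check_repl --conf=/app/mha/mha.cnf

Make sure both commands finish without errors, then start the MHA service.

2. Start the MHA service on manager and keep watching its log output. The directory that receives the redirected output must exist before the manager is started.

[root@manager ~]# mkdir -p /app/mha/log
[root@manager ~]# nohup masterha_manager --conf=/app/mha/mha.cnf > /app/mha/log/mha_manager.log 2>&1 &
[root@manager ~]# ps -ef |grep masterha |grep -v 'grep'
root      799234  788691  1 14:09 pts/1    00:00:00 perl /bin/masterha_manager --conf=/app/mha/mha.cnf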

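3. Experiment, phase one

The plan:

Preparation: check that replication is healthy on all slaves.
Step 1: stop the mysqld service on master, then check whether the standby, slave01, has been promoted to master.
Step 2: log in to slave02 and check that replication is healthy and now points to the new master, that is, to slave01's IP address.
Step 3: start the mysqld service on master again and add the host back into the replication topology.

Preparation

On slave01 and slave02, check that replication is healthy (slave01 is shown; slave02 is identical):

[root@slave01 ~]# mysql -e 'show slave status\G' |egrep 'Slave_IO_Running:|Slave_SQL_Running:'
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes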
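
The same check can be turned into an exit status for use in scripts. A minimal sketch: `replication_ok` is a hypothetical helper that reads the text of `show slave status\G` on standard input and succeeds only when both replication threads report Yes.

```shell
#!/bin/sh
# Succeed only if both Slave_IO_Running and Slave_SQL_Running are "Yes".
# Input: the output of `mysql -e 'show slave status\G'` on stdin.
replication_ok() {
    status=$(cat)
    # The trailing colon in each pattern keeps Slave_SQL_Running_State from matching.
    printf '%s\n' "$status" | grep -Eq '^[[:space:]]*Slave_IO_Running:[[:space:]]*Yes[[:space:]]*$' &&
    printf '%s\n' "$status" | grep -Eq '^[[:space:]]*Slave_SQL_Running:[[:space:]]*Yes[[:space:]]*$'
}

# Usage on a slave:
# mysql -e 'show slave status\G' | replication_ok && echo "replication healthy"
```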

Step 1

(1) Stop the mysqld service on master

[root@master ~]# /etc/init.d/mysql stop
Shutting down MySQL.... SUCCESS!

(2) Watch the MHA log on manager (only an excerpt is shown here)

[root@manager ~]# tail -f /app/mha/manager.log
----- Failover Report -----
mha: MySQL Master failover master to slave01 succeeded
Master master is down!
Check MHA Manager logs at manager:/app/mha/manager.log for details.
Started automated(non-interactive) failover.
The latest slave slave01(192.168.10.92:3306) has all relay logs for recovery.
Selected slave01 as a new master.
slave01: OK: Applying all logs succeeded.
slave02: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
slave02: OK: Applying all logs succeeded. Slave started, replicating from slave01.
slave01: Resetting slave info succeeded.
Master failover to slave01(192.168.10.92:3306) completed successfully.
Thu Jan 25 13:52:52 2018 - [info] Sending mail..
Unknown option: conf

Step 2

Log in to slave02 and check that replication is healthy and has moved to the new master's IP address

[root@slave02 ~]# mysql -e 'show slave status\G' |egrep 'Master_Host|Slave_IO_Running:|Slave_SQL_Running:'
                  Master_Host: 192.168.10.92
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Step 3

(1) Start the mysqld service on master

[root@master ~]# /etc/init.d/mysql start
Starting MySQL. SUCCESS!

(2) On manager, find the replication statement in the MHA log. It can be used as-is once the password is filled in.

[root@manager ~]# grep 'MASTER_HOST' /app/mha/manager.log |tail -n 1
Thu Jan 25 13:52:50 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='slave01 or 192.168.10.92', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000001', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='xxx';

Note:

MASTER_HOST='slave01 or 192.168.10.92' lists two alternatives. Keep only one of them, preferably the IP address.
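
The two manual edits (pick the IP address, fill in the password) can be scripted. A sketch under stated assumptions: `build_change_master` is a hypothetical helper, it expects the log line in exactly the format shown above, and the password must not contain characters that are special to sed such as `/` or `&`.

```shell
#!/bin/sh
# Turn MHA's "Statement should be: ..." log line (stdin) into a runnable statement.
# $1 = replication password to substitute for the masked 'xxx'.
build_change_master() {
    sed -n "s/.*Statement should be: //p" |
    sed -e "s/MASTER_HOST='[^']* or \([^']*\)'/MASTER_HOST='\1'/" \
        -e "s/MASTER_PASSWORD='xxx'/MASTER_PASSWORD='$1'/"
}

# Usage on manager:
# grep 'MASTER_HOST' /app/mha/manager.log | tail -n 1 | build_change_master 20151012
```

The first sed keeps only the statement, the second keeps the IP address from the "host or ip" pair and inserts the password.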


(3) Start replication on master, using the password 20151012

[root@master ~]# mysql -e "CHANGE MASTER TO MASTER_HOST='192.168.10.92', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000001', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='20151012';start slave;"
[root@master ~]# mysql -e "show slave status\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.92
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql_bin.000001
          Read_Master_Log_Pos: 120
               Relay_Log_File: master-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql_bin.000001
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 120
              Relay_Log_Space: 457
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 92
                  Master_UUID: 996b4343-00f3-11e8-a3ba-b6c824ce1080
             Master_Info_File: /app/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0

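4. Experiment, phase two

The plan:

Preparation: check that replication is healthy on all slaves.
Step 1: stop the mysqld service on slave01, then check whether master has been promoted to be the new master.
Step 2: log in to slave02 and check that replication is healthy and now points to the new master, that is, to master's IP address.
Step 3: start the mysqld service on slave01 again and add the host back into the replication topology.

Note that by default the MHA manager process exits after every failover, so the MHA service has to be started again. Remove the mha.failover.complete marker file that the previous failover left in the working directory, then start the manager:

[root@manager ~]# rm -rf /app/mha/mha.failover.complete
[root@manager ~]# nohup masterha_manager --conf=/app/mha/mha.cnf > /app/mha/log/mha_manager.log 2>&1 &
[1] 3606
[root@manager ~]# ps -ef |grep masterha |grep -v 'grep'
root      799234  788691  1 14:09 pts/1    00:00:00 perl /bin/masterha_manager --conf=/app/mha/mha.cnf
[root@manager ~]# masterha_check_status --conf=/app/mha/mha.cnf
mha (pid:799234) is running(0:PING_OK), master:slave01

The last line shows that the current master is the host slave01.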
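
When the current master is needed in a script, it can be pulled out of that status line. A small sketch, assuming the line format shown above; `current_master` is a hypothetical helper that prints nothing when the manager does not report a running state with PING_OK.

```shell
#!/bin/sh
# Extract the master hostname from masterha_check_status output on stdin.
current_master() {
    sed -n 's/.*is running(0:PING_OK), master:\([^[:space:]]*\).*/\1/p'
}

# Usage on manager:
# masterha_check_status --conf=/app/mha/mha.cnf | current_master
```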

Preparation

On master and slave02, check that replication is healthy (master is shown; slave02 is identical):

[root@master ~]# mysql -e 'show slave status\G' |egrep 'Master_Host|Slave_IO_Running:|Slave_SQL_Running:'
                  Master_Host: 192.168.10.92
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Step 1

(1) Stop the mysqld service on slave01

[root@slave01 ~]# /etc/init.d/mysql stop
Shutting down MySQL.... SUCCESS!

(2) Watch the MHA log on manager (only an excerpt is shown here)

[root@manager ~]# tail -f /app/mha/manager.log
----- Failover Report -----
mha: MySQL Master failover slave01 to master succeeded
Master slave01 is down!
Check MHA Manager logs at manager:/app/mha/manager.log for details.
Started automated(non-interactive) failover.
The latest slave master(192.168.10.91:3306) has all relay logs for recovery.
Selected master as a new master.
master: OK: Applying all logs succeeded.
slave02: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
slave02: OK: Applying all logs succeeded. Slave started, replicating from master.
master: Resetting slave info succeeded.
Master failover to master(192.168.10.91:3306) completed successfully.
Thu Jan 25 14:25:48 2018 - [info] Sending mail..
Unknown option: conf

Step 2

Log in to slave02 and check that replication is healthy and has moved to the new master's IP address

[root@slave02 ~]# mysql -e 'show slave status\G' |egrep 'Master_Host|Slave_IO_Running:|Slave_SQL_Running:'
                  Master_Host: 192.168.10.91
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes

Step 3

(1) Start the mysqld service on slave01

[root@slave01 ~]# /etc/init.d/mysql start
Starting MySQL. SUCCESS!

(2) On manager, find the replication statement in the MHA log. It can be used as-is once the password is filled in.

[root@manager ~]# grep 'MASTER_HOST' /app/mha/manager.log | tail -n 1
Thu Jan 25 14:25:46 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='master or 192.168.10.91', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000002', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='xxx';

Note:

MASTER_HOST='master or 192.168.10.91' lists two alternatives. Keep only one of them, preferably the IP address.

(3) Start replication on slave01, using the password 20151012

[root@slave01 ~]# mysql -e "CHANGE MASTER TO MASTER_HOST='192.168.10.91', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000002', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='20151012'; start slave;"
[root@slave01 ~]# mysql -e "show slave status\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.91
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql_bin.000002
          Read_Master_Log_Pos: 120
               Relay_Log_File: slave01-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql_bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 120
              Relay_Log_Space: 458
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 91
                  Master_UUID: 9b1eb4a5-00f3-11e8-a3ba-ce006127c972
             Master_Info_File: /app/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0

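At this point, when the master goes down, the standby slave is promoted to master automatically and the remaining slave is repointed to the new master's IP address. One problem remains: the master role does switch, but clients should not have to deal with two different addresses. As in keepalived/heartbeat setups, a virtual IP (VIP) is introduced here so that the switchover is transparent to clients.

There are many ways to move the VIP during failover. Some setups use keepalived; others use a script, which is the approach shown in this article.

During the experiments you will have noticed the [warning] lines in the command output. They are addressed next.

First, to use the VIP feature, enable the master_ip_failover_script option in /app/mha/mha.cnf (remove the leading # if the line is commented out) and confirm that it is active:

[root@manager ~]# grep '^master_ip_failover_script' /app/mha/mha.cnf
master_ip_failover_script=/app/mha/scripts/master_ip_failover

Next, back up the sample script and replace it with the version below, adjusting the values marked as adjustable:

[root@manager ~]# mv /app/mha/scripts/{master_ip_failover,master_ip_failover_bak}
[root@manager ~]# cat /app/mha/scripts/master_ip_failover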

#!/usr/bin/env perl

use strict;
use warnings FATAL => 'all';
use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);

my $vip       = '192.168.10.90';     # Virtual IP (adjust as needed)
my $gateway   = '192.168.10.254';    # Gateway IP (adjust as needed)
my $interface = 'eth0';              # Network interface (adjust as needed)
my $key       = "1";                 # Alias index; the VIP is bound to eth0:1

# Bind the VIP to the interface alias, then arping the gateway from the VIP
# so that the gateway refreshes its ARP entry.
my $ssh_start_vip = "/sbin/ifconfig $interface:$key $vip;/sbin/arping -I $interface -c 3 -s $vip $gateway >/dev/null 2>&1";
# Take the interface alias, and the VIP with it, down.
my $ssh_stop_vip = "/sbin/ifconfig $interface:$key down";

# Only the options below are declared. The manager in this setup also passes
# --orig_master_ssh_port, so Getopt::Long reports "Unknown option" and continues.
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";

    if ( $command eq "stop" || $command eq "stopssh" ) {

        # $orig_master_host, $orig_master_ip, $orig_master_port are passed.
        # If you manage master ip address at global catalog database,
        # invalidate orig_master_ip here.
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {

        # all arguments are passed.
        # If you manage master ip address at global catalog database,
        # activate new_master_ip here.
        # You can also grant write access (create user, set read_only=0, etc) here.
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";

        # Note: the status check also binds the VIP on the current master.
        `ssh $ssh_user\@$orig_master_host \" $ssh_start_vip \"`;
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
"Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

[root@manager ~]# chmod +x /app/mha/scripts/master_ip_failover
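Before MHA runs the script, it can help to see exactly which commands it will send over SSH. The sketch below rebuilds the two command strings from the same four settings (interface, alias index, VIP, gateway). The function names `vip_start_cmd` and `vip_stop_cmd` are invented for this example; they only print the strings and do not touch the network.

```shell
# Hypothetical helpers: print the commands that the Perl script builds in
# $ssh_start_vip and $ssh_stop_vip, without executing them.
vip_start_cmd() {
    local interface="$1" key="$2" vip="$3" gateway="$4"
    printf '/sbin/ifconfig %s:%s %s;/sbin/arping -I %s -c 3 -s %s %s >/dev/null 2>&1\n' \
        "$interface" "$key" "$vip" "$interface" "$vip" "$gateway"
}

vip_stop_cmd() {
    local interface="$1" key="$2"
    printf '/sbin/ifconfig %s:%s down\n' "$interface" "$key"
}
```

Calling `vip_start_cmd eth0 1 192.168.10.90 192.168.10.254` prints the start command for the values used in this article, which makes a typo in the interface name or gateway easy to spot before a real failover.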

Run the SSH check (the output below is from masterha_check_ssh):

Thu Jan 25 15:56:56 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 25 15:56:56 2018 - [info] Reading application default configurations from /app/mha/mha.cnf..
Thu Jan 25 15:56:56 2018 - [info] Reading server configurations from /app/mha/mha.cnf..
Thu Jan 25 15:56:56 2018 - [info] Starting SSH connection tests..
Thu Jan 25 15:56:58 2018 - [debug]
Thu Jan 25 15:56:56 2018 - [debug]  Connecting via SSH from root@master(192.168.10.91:60022) to root@slave01(192.168.10.92:60022)..
Thu Jan 25 15:56:57 2018 - [debug]   ok.
Thu Jan 25 15:56:57 2018 - [debug]  Connecting via SSH from root@master(192.168.10.91:60022) to root@slave02(192.168.10.93:60022)..
Thu Jan 25 15:56:58 2018 - [debug]   ok.
Thu Jan 25 15:56:59 2018 - [debug]
Thu Jan 25 15:56:57 2018 - [debug]  Connecting via SSH from root@slave02(192.168.10.93:60022) to root@master(192.168.10.91:60022)..
Thu Jan 25 15:56:58 2018 - [debug]   ok.
Thu Jan 25 15:56:58 2018 - [debug]  Connecting via SSH from root@slave02(192.168.10.93:60022) to root@slave01(192.168.10.92:60022)..
Thu Jan 25 15:56:58 2018 - [debug]   ok.
Thu Jan 25 15:56:59 2018 - [debug]
Thu Jan 25 15:56:57 2018 - [debug]  Connecting via SSH from root@slave01(192.168.10.92:60022) to root@master(192.168.10.91:60022)..
Thu Jan 25 15:56:58 2018 - [debug]   ok.
Thu Jan 25 15:56:58 2018 - [debug]  Connecting via SSH from root@slave01(192.168.10.92:60022) to root@slave02(192.168.10.93:60022)..
Thu Jan 25 15:56:58 2018 - [debug]   ok.
Thu Jan 25 15:56:59 2018 - [info] All SSH connection tests passed successfully.

Run the replication check:

[root@manager ~]# masterha_check_repl --conf=/app/mha/mha.cnf
Thu Jan 25 15:55:56 2018 - [warning] Global configuration file /etc/masterha_default.cnf not found. Skipping.
Thu Jan 25 15:55:56 2018 - [info] Reading application default configurations from /app/mha/mha.cnf..
Thu Jan 25 15:55:56 2018 - [info] Reading server configurations from /app/mha/mha.cnf..
Thu Jan 25 15:55:56 2018 - [info] MHA::MasterMonitor version 0.55.
Thu Jan 25 15:55:57 2018 - [info] Dead Servers:
Thu Jan 25 15:55:57 2018 - [info] Alive Servers:
Thu Jan 25 15:55:57 2018 - [info]   master(192.168.10.91:3306)
Thu Jan 25 15:55:57 2018 - [info]   slave01(192.168.10.92:3306)
Thu Jan 25 15:55:57 2018 - [info]   slave02(192.168.10.93:3306)
Thu Jan 25 15:55:57 2018 - [info] Alive Slaves:
Thu Jan 25 15:55:57 2018 - [info]   slave01(192.168.10.92:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Thu Jan 25 15:55:57 2018 - [info]     Replicating from 192.168.10.91(192.168.10.91:3306)
Thu Jan 25 15:55:57 2018 - [info]     Primary candidate for the new Master (candidate_master is set)
Thu Jan 25 15:55:57 2018 - [info]   slave02(192.168.10.93:3306)  Version=5.6.27-log (oldest major version between slaves) log-bin:enabled
Thu Jan 25 15:55:57 2018 - [info]     Replicating from 192.168.10.91(192.168.10.91:3306)
Thu Jan 25 15:55:57 2018 - [info]     Not candidate for the new Master (no_master is set)
Thu Jan 25 15:55:57 2018 - [info] Current Alive Master: master(192.168.10.91:3306)
Thu Jan 25 15:55:57 2018 - [info] Checking slave configurations..
Thu Jan 25 15:55:57 2018 - [info]  read_only=1 is not set on slave slave01(192.168.10.92:3306).
Thu Jan 25 15:55:57 2018 - [info] Checking replication filtering settings..
Thu Jan 25 15:55:57 2018 - [info]  binlog_do_db= , binlog_ignore_db=
Thu Jan 25 15:55:57 2018 - [info]  Replication filtering check ok.
Thu Jan 25 15:55:57 2018 - [info] Starting SSH connection tests..
Thu Jan 25 15:55:59 2018 - [info] All SSH connection tests passed successfully.
Thu Jan 25 15:55:59 2018 - [info] Checking MHA Node version..
Thu Jan 25 15:56:00 2018 - [info]  Version check ok.
Thu Jan 25 15:56:00 2018 - [info] Checking SSH publickey authentication settings on the current master..
Thu Jan 25 15:56:00 2018 - [info] HealthCheck: SSH to master is reachable.
Thu Jan 25 15:56:01 2018 - [info] Master MHA Node version is 0.54.
Thu Jan 25 15:56:01 2018 - [info] Checking recovery script configurations on the current master..
Thu Jan 25 15:56:01 2018 - [info]   Executing command: save_binary_logs --command=test --start_pos=4 --binlog_dir=/app/mysql/data --output_file=/var/tmp/save_binary_logs_test --manager_version=0.55 --start_file=mysql_bin.000004
Thu Jan 25 15:56:01 2018 - [info]   Connecting to root@master(master)..
  Creating /var/tmp if not exists..    ok.
  Checking output directory is accessible or not..
   ok.
  Binlog found at /app/mysql/data, up to mysql_bin.000004
Thu Jan 25 15:56:02 2018 - [info] Master setting check done.
Thu Jan 25 15:56:02 2018 - [info] Checking SSH publickey authentication and checking recovery script configurations on all alive slave servers..
Thu Jan 25 15:56:02 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha_rep' --slave_host=slave01 --slave_ip=192.168.10.92 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.27-log --manager_version=0.55 --relay_log_info=/app/mysql/data/relay-log.info  --relay_dir=/app/mysql/data/  --slave_pass=xxx
Thu Jan 25 15:56:02 2018 - [info]   Connecting to root@192.168.10.92(slave01:60022)..
  Checking slave recovery environment settings..
    Opening /app/mysql/data/relay-log.info ... ok.
    Relay log found at /app/mysql/data, up to slave01-relay-bin.000006
    Temporary relay log file is /app/mysql/data/slave01-relay-bin.000006
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Jan 25 15:56:02 2018 - [info]   Executing command : apply_diff_relay_logs --command=test --slave_user='mha_rep' --slave_host=slave02 --slave_ip=192.168.10.93 --slave_port=3306 --workdir=/var/tmp --target_version=5.6.27-log --manager_version=0.55 --relay_log_info=/app/mysql/data/relay-log.info  --relay_dir=/app/mysql/data/  --slave_pass=xxx
Thu Jan 25 15:56:02 2018 - [info]   Connecting to root@192.168.10.93(slave02:60022)..
  Checking slave recovery environment settings..
    Opening /app/mysql/data/relay-log.info ... ok.
    Relay log found at /app/mysql/data, up to slave02-relay-bin.000006
    Temporary relay log file is /app/mysql/data/slave02-relay-bin.000006
    Testing mysql connection and privileges..Warning: Using a password on the command line interface can be insecure.
done.
    Testing mysqlbinlog output.. done.
    Cleaning up test file(s).. done.
Thu Jan 25 15:56:02 2018 - [info] Slaves settings check done.
Thu Jan 25 15:56:02 2018 - [info]
master (current master)
+--slave01
+--slave02
Thu Jan 25 15:56:02 2018 - [info] Checking replication health on slave01..
Thu Jan 25 15:56:02 2018 - [info]  ok.
Thu Jan 25 15:56:02 2018 - [info] Checking replication health on slave02..
Thu Jan 25 15:56:02 2018 - [info]  ok.
Thu Jan 25 15:56:02 2018 - [info] Checking master_ip_failover_script status:
Thu Jan 25 15:56:02 2018 - [info]   /app/mha/scripts/master_ip_failover --command=status --ssh_user=root --orig_master_host=master --orig_master_ip=192.168.10.91 --orig_master_port=3306  --orig_master_ssh_port=60022
Unknown option: orig_master_ssh_port

IN SCRIPT TEST====/sbin/ifconfig eth0:1 down==/sbin/ifconfig eth0:1 192.168.10.90;/sbin/arping -I eth0 -c 3 -s 192.168.10.90 192.168.10.254 >/dev/null 2>&1===

Checking the Status of the script.. OK
Thu Jan 25 15:56:06 2018 - [info]  OK.
Thu Jan 25 15:56:06 2018 - [warning] shutdown_script is not defined.
Thu Jan 25 15:56:06 2018 - [info] Got exit code 0 (Not master dead).

MySQL Replication Health is OK.

If the output no longer contains "[warning] master_ip_failover_script is not defined.", the VIP script has been enabled.
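Rather than reading the whole output by eye, the two things that matter here can be checked mechanically. The function below is a sketch with an invented name, `repl_check_ok`; it reads the output of masterha_check_repl on standard input and relies only on the two message strings quoted in this article.

```shell
# Hypothetical helper: reads masterha_check_repl output on stdin.
# Succeeds only if the final health line is present AND the
# "master_ip_failover_script is not defined" warning is absent.
repl_check_ok() {
    local out
    out=$(cat)
    printf '%s\n' "$out" | grep -q 'MySQL Replication Health is OK\.' || return 1
    if printf '%s\n' "$out" | grep -q 'master_ip_failover_script is not defined'; then
        return 1
    fi
    return 0
}
```

A typical call would be `masterha_check_repl --conf=/app/mha/mha.cnf 2>&1 | repl_check_ok && echo ready`.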

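Next, start the MHA service. The rest of the test goes roughly as follows:

Preparation: start the MHA service.

First, stop the mysqld process on master so that slave01 is promoted to master and takes over the VIP.

Second, check on the other slave, slave02, that replication is healthy and now points at the new master's address.

Finally, start the mysqld process on master again and rejoin it to the replication topology.

Preparation: start the MHA service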

[root@manager ~]# rm -rf /app/mha/mha.failover.complete

[root@manager ~]# nohup masterha_manager --conf=/app/mha/mha.cnf > /app/mha/log/mha_manager.log 2>&1 &
[1] 4066

[root@manager ~]# ps -ef |grep masterha |grep -v 'grep'
root      805559  805262  0 15:52 pts/2    00:00:00 perl /bin/masterha_check_repl --conf=/app/mha/mha.cnf
root      806133  805710  0 15:58 pts/3    00:00:00 perl /bin/masterha_manager --conf=/app/mha/mha.cnf

[root@manager ~]# masterha_check_status --conf=/app/mha/mha.cnf
ha (pid:806133) is running(0:PING_OK), master:master
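The status line above is easy to parse if a script needs to know which host MHA currently treats as the master. The sketch below assumes the line has exactly the shape shown here; the function name `current_master` is made up for this example.

```shell
# Hypothetical helper: reads masterha_check_status output on stdin and prints
# the master it reports. Prints nothing and returns non-zero when the line
# does not report "is running(0:PING_OK)".
current_master() {
    sed -n 's/^.*is running(0:PING_OK), master:\(.*\)$/\1/p' | grep .
}
```

For example, `masterha_check_status --conf=/app/mha/mha.cnf | current_master` would print `master` for the output shown above.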

First: stop the master

(1) On master, stop the mysqld service:

[root@master ~]# /etc/init.d/mysql stop
Shutting down MySQL.... SUCCESS!

(2) Check the MHA log on manager. Only part of the log is shown here; the lines starting with # are annotations, not log output:

[root@manager ~]# tail -f /app/mha/manager.log
Enabling the VIP - 192.168.10.90 on the new master - slave01
# The VIP 192.168.10.90 has been brought up on the new master, slave01
----- Failover Report -----

mha: MySQL Master failover master to slave01 succeeded
# The master role has moved from master to slave01
Master master is down!
# The old master is down
Check MHA Manager logs at manager:/app/mha/manager.log for details.
Started automated(non-interactive) failover.
Invalidated master IP address on master.
The latest slave slave01(192.168.10.92:3306) has all relay logs for recovery.
Selected slave01 as a new master.
slave01: OK: Applying all logs succeeded.
slave01: OK: Activated master IP address.
slave02: This host has the latest relay log events.
Generating relay diff files from the latest slave succeeded.
slave02: OK: Applying all logs succeeded. Slave started, replicating from slave01.
slave01: Resetting slave info succeeded.
Master failover to slave01(192.168.10.92:3306) completed successfully.
Thu Jan 25 16:00:26 2018 - [info] Sending mail..
Unknown option: conf
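The last line of the failover report names the new master, which is handy when the result of a failover has to be passed on to another script. A minimal sketch, assuming the report line looks exactly like the one above; `failover_target` is an invented name.

```shell
# Hypothetical helper: reads the MHA manager log on stdin and prints
# "<host> <ip:port>" for each "Master failover to ... completed successfully."
# line. Prints nothing if no failover has completed.
failover_target() {
    sed -n 's/^.*Master failover to \([^(]*\)(\([^)]*\)) completed successfully\..*$/\1 \2/p'
}
```

For example, `failover_target < /app/mha/manager.log` would print `slave01 192.168.10.92:3306` for the log excerpt above.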


(3) Log in to slave01 and check that it has acquired the VIP:

[root@slave01 ~]# ip addr list
1: lo: mtu 65536 qdisc noqueue state UNKNOWN qlen 1
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: mtu 1500 qdisc mq state UP qlen 1000
    link/ether b6:c8:24:ce:10:80 brd ff:ff:ff:ff:ff:ff
    inet 192.168.10.92/24 brd 192.168.10.255 scope global eth0
       valid_lft forever preferred_lft forever
    inet 192.168.10.90/24 brd 192.168.10.255 scope global secondary eth0:1
       valid_lft forever preferred_lft forever
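To check for the VIP without reading the whole listing, the `inet` lines can be matched directly. This is a sketch; `has_vip` is a made-up name, and it expects `ip addr` output on standard input.

```shell
# Hypothetical helper: reads `ip addr` output on stdin and succeeds if the
# given IPv4 address is bound to some interface.
has_vip() {
    local vip="$1" pattern
    # Escape the dots so they match literally.
    pattern=$(printf '%s' "$vip" | sed 's/\./\\./g')
    # Requiring "/" right after the address keeps 192.168.10.9 from
    # matching 192.168.10.90.
    grep -Eq "inet ${pattern}/"
}
```

For example, on slave01: `ip addr list | has_vip 192.168.10.90 && echo "VIP is here"`.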

Second: check replication on slave02

Log in to slave02 and check that replication is healthy and that it now points at the new master's IP:

[root@slave02 ~]# mysql -e 'show slave status\G' |egrep 'Master_Host|Slave_IO_Running:|Slave_SQL_Running:'
                  Master_Host: 192.168.10.92
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
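The three fields above can also be turned into a pass/fail check. The sketch below reads `show slave status\G` output on standard input; the function name `slave_follows` is invented for this example.

```shell
# Hypothetical helper: reads `show slave status\G` output on stdin.
# Succeeds only if Master_Host equals the expected host ($1) and both the
# I/O thread and the SQL thread report "Yes".
slave_follows() {
    local expected="$1"
    awk -v want="$expected" '
        $1 == "Master_Host:"       { host = $2 }
        $1 == "Slave_IO_Running:"  { io   = $2 }
        $1 == "Slave_SQL_Running:" { sql  = $2 }
        END { exit !(host == want && io == "Yes" && sql == "Yes") }
    '
}
```

For example, on slave02: `mysql -e 'show slave status\G' | slave_follows 192.168.10.92 && echo "following the new master"`.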

Finally: rejoin the old master as a slave

(1) Start the mysqld service on master:

[root@master ~]# /etc/init.d/mysql start
Starting MySQL. SUCCESS!

(2) On manager, find the replication statement in the MHA log. To use it, fill in the real password and reduce MASTER_HOST to either the hostname or the IP:

[root@manager ~]# grep 'MASTER_HOST' /app/mha/manager.log | tail -n 1
Thu Jan 25 16:00:21 2018 - [info]  All other slaves should start replication from here. Statement should be: CHANGE MASTER TO MASTER_HOST='slave01 or 192.168.10.92', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000002', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='xxx';
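The two manual edits to that statement (keep only the IP in MASTER_HOST, replace the masked password) can be done with sed. This is a sketch under a few assumptions: the log line has the shape shown above, the password is masked as `xxx`, and the real password contains no `/`, `&` or backslash, which would confuse sed. The function name `change_master_stmt` is made up.

```shell
# Hypothetical helper: reads the MHA manager log on stdin and prints a
# ready-to-run CHANGE MASTER TO statement built from the last
# "Statement should be:" line, using the password given as $1.
change_master_stmt() {
    local password="$1"
    grep 'Statement should be: CHANGE MASTER TO' \
        | tail -n 1 \
        | sed -e 's/^.*Statement should be: //' \
              -e "s/MASTER_HOST='[^']* or \([^']*\)'/MASTER_HOST='\1'/" \
              -e "s/MASTER_PASSWORD='xxx'/MASTER_PASSWORD='${password}'/"
}
```

For example, `change_master_stmt 'the-real-password' < /app/mha/manager.log` prints the statement with `MASTER_HOST='192.168.10.92'` and the password filled in.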

(3) On master, start replication from the new master. The password here is 20151012:

[root@master ~]# mysql -e "CHANGE MASTER TO MASTER_HOST='192.168.10.92', MASTER_PORT=3306, MASTER_LOG_FILE='mysql_bin.000002', MASTER_LOG_POS=120, MASTER_USER='rep', MASTER_PASSWORD='20151012'; start slave;"

[root@master ~]# mysql -e "show slave status\G"
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: 192.168.10.92
                  Master_User: rep
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysql_bin.000002
          Read_Master_Log_Pos: 120
               Relay_Log_File: c0a80a5b-relay-bin.000002
                Relay_Log_Pos: 283
        Relay_Master_Log_File: mysql_bin.000002
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 120
              Relay_Log_Space: 459
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 92
                  Master_UUID: 996b4343-00f3-11e8-a3ba-b6c824ce1080
             Master_Info_File: /app/mysql/data/master.info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0
