At work I have a MySQL 4.1 Master -> Slave set-up over our VPN, used as a backup and for local data processing. Today we realised that it hadn’t been replicating for days. @rse.
Running “SHOW SLAVE STATUS \G” on the slave showed:
mysql> show slave status \G
*************************** 1. row ***************************
             Slave_IO_State:
                Master_Host: 10.0.2.136
                Master_User: replicate
                Master_Port: 3306
              Connect_Retry: 60
            Master_Log_File: server-bin.000014
        Read_Master_Log_Pos: 288643583
             Relay_Log_File: flump-relay-bin.000008
              Relay_Log_Pos: 50
      Relay_Master_Log_File: sherlock-bin.000014
           Slave_IO_Running: No
          Slave_SQL_Running: No
            Replicate_Do_DB: MY_DB_NAME
        Replicate_Ignore_DB:
         Replicate_Do_Table:
     Replicate_Ignore_Table:
    Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
                 Last_Errno: 0
                 Last_Error: Could not parse relay log event entry. The possible reasons are: the master's binary log is corrupted (you can check this by running 'mysqlbinlog' on the binary log), the slave's relay log is corrupted (you can check this by running 'mysqlbinlog' on the relay log), a network problem, or a bug in the master's or slave's MySQL code. If you want to check the master's binary log or slave's relay log, you will be able to know their names by issuing 'SHOW SLAVE STATUS' on this slave.
               Skip_Counter: 0
        Exec_Master_Log_Pos: 137056502
            Relay_Log_Space: 151587131
            Until_Condition: None
             Until_Log_File:
              Until_Log_Pos: 0
         Master_SSL_Allowed: No
         Master_SSL_CA_File:
         Master_SSL_CA_Path:
            Master_SSL_Cert:
          Master_SSL_Cipher:
             Master_SSL_Key:
      Seconds_Behind_Master: NULL
1 row in set (0.00 sec)
First I tried on the master:
mysqlbinlog --start-position=137056502 server-bin.000014
/*!40019 SET @@session.max_insert_delayed_threads=0*/;
ERROR: Error in Log_event::read_log_event(): 'Found invalid event in binary log', data_len: 151587081, event_type: 49
Could not read entry at offset 137056502: Error in log format or read error
That sounded bad.
A mysqldump on the live machine and a re-import on the slave to get them back into sync was out of the question: we’re talking about 7GB of data, and from experience that’s a day or so of faffing.
The first thing to do was figure out the date range of the missing data.
MASTER-mysql> select max(serverTimestamp) from table;
+----------------------+
| max(serverTimestamp) |
+----------------------+
|           1120160193 |
+----------------------+
1 row in set (0.01 sec)
SLAVE-mysql> select max(serverTimestamp) from table;
+----------------------+
| max(serverTimestamp) |
+----------------------+
|           1119031050 |
+----------------------+
1 row in set (0.00 sec)
Doh! About 13 days of data missing.
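Quick shell arithmetic confirms the size of the gap between those two Unix timestamps:

```shell
# Difference between the master's and slave's max(serverTimestamp) values
master_ts=1120160193
slave_ts=1119031050
gap=$(( master_ts - slave_ts ))
echo "gap: ${gap}s, roughly $(( gap / 86400 )) days"
```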
So, on the master I did:
select * into outfile "/tmp/table1" FIELDS TERMINATED BY '|'
    from table1 where serverTimestamp > 1119031050;
select * into outfile "/tmp/table2" FIELDS TERMINATED BY '|'
    from table2 where serverTimestamp > 1119031050;
I tar.bz2’d the files, scp’d them down to the slave and extracted them.
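That transfer step looks something like this (the hostname and paths are illustrative, and the stand-in files here just demonstrate the round trip):

```shell
# On the master: bundle the two outfiles written to /tmp
cd /tmp
printf '1|1119031051\n' > table1   # stand-in rows for this sketch
printf '2|1119031052\n' > table2
tar cjf tables.tar.bz2 table1 table2

# Copy down to the slave, e.g.:
#   scp /tmp/tables.tar.bz2 bealers@slave:~/

# On the slave: extract, ready for LOAD DATA INFILE
mkdir -p /tmp/restore
tar xjf tables.tar.bz2 -C /tmp/restore
ls /tmp/restore
```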
Then, in mysql on the slave, I did:
LOAD DATA INFILE "~bealers/table1" INTO TABLE table1 FIELDS TERMINATED BY '|';
LOAD DATA INFILE "~bealers/table2" INTO TABLE table2 FIELDS TERMINATED BY '|';
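It’s worth sanity-checking each extracted file before (or after) loading it: the line count should match a SELECT COUNT(*) on the master with the same WHERE clause. The file below is a stand-in for one of the real outfiles:

```shell
# Row count of a pipe-delimited outfile; should equal the master's
# SELECT COUNT(*) FROM table1 WHERE serverTimestamp > 1119031050
printf '1|a\n2|b\n3|c\n' > /tmp/table1_check   # stand-in data for this sketch
wc -l < /tmp/table1_check
```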
MASTER-mysql> reset master;
SLAVE-mysql> reset slave;
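One thing the resets don’t do on their own is restart the slave threads. On a 4.1 set-up like this, the slave usually needs pointing at the master’s freshly reset binary log and starting again; the log file name and position below are illustrative, the real values come from SHOW MASTER STATUS on the master:

```sql
SLAVE-mysql> CHANGE MASTER TO MASTER_LOG_FILE='server-bin.000001', MASTER_LOG_POS=4;
SLAVE-mysql> START SLAVE;
```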
Maybe there’s a better way, but that seemed to work.
Resources:
http://lists.mysql.com/mysql/180727
http://dev.mysql.com/doc/mysql/en/replication-problems.html
http://dev.mysql.com/doc/mysql/en/mysqlbinlog.html