How to setup a mysql replication check on your slave

How to setup replication check on your mysql slave with email alerting

I decided to write this, because there are probably lots of people who have mysql replication setup in master –slave, and the only way they are able to identify if replication is broken, is by logging in to the slave and checking. I found this to be a pain and in efficient.

What this entails

  • This is a perl script which will run every hour via cron
  • It will send an email alert notifying you that replication is broken on the slave
  • Script is smart enough to know if mysql is simply stopped on the master.
  • Script also check to see if mysql is running or not
  1. Open a file under nano  -w /usr/sbin/replicationcheck.pl (either copy and paste below or download the link below and edit as needed, this goes on your slave mysql server)
    http://www.nicktailor.com/files/replicationcheck
  2. You need to ensure the file has +x permissions chmod +x /usr/sbin/replicationcheck.pl
  3. Create the following file ‘touch /root/repl_check/show_slave_status.txt’ (this file is used to pipe information to)
  4. Create the log file ‘touch /var/log/mysqlstopped.txt’ (this will be used to log the results
  5. Finally you will need to setup a cron to run this script. I ran mine every hour ‘crontab -e’ if your adding this to root
    0 * * * /usr/sbin/replicationcheck.pl (This runs every hour)
  6. Lastly, you can setup a bash script on master db which can ssh to your slave and output the results on your master using this script, so you dont need to log into the slave to use the script, if your lazy.

Explains the script 

#!/usr/bin/perl
use Sys::Hostname;
use POSIX;

$timestamp = strftime “%b%e %Y %H:%M:%S”, localtime;
$host = hostname;
$email_lock = “/root/email.lck”;
$mysql_socket = “/var/lib/mysql/mysql.sock”;
$show_slave_status = “/root/repl_check/show_slave_status.txt”;
$pword = “”; (You will need to add this to the mysql lines below if you have password)

 

# This checks to see if mysql socket exists. if it exists, means that mysql is running. if mysql’s not running, don’t need to run slave status check

sub check_mysql_socket

{

# Can’t connect to local MySQL server through socket ‘/var/lib/mysql/mysql.sock
if (-e $mysql_socket)
{
print “MySQL running, will proceed\n”;
return 1;
}

else

{

print “MySQL not running, will do nothing\n”;
return 0;

}

}

 

# so this is the server doesn’t repeatedly keep sending email alerts if the replication is broken. How it does it is by a email lock file. If an email is sent it creates the lock file and stops sending, if it doesn’t it will send. You can change this how you see fit, this is the way I learned so stuck with it. It’s a sub, so we can use it as a function variable later down the script.

sub check_email_lock

{

if (-e $email_lock)

{

print “email file exists\n”;
return 1;

}

else

{

print “no email file exists\n”;
return 0;

}

}

 

#So this section just basically continues from above by using the check email lock function and then sends the email if the lock file doesn’t exist and creates the lock file, you also define the email address you want to send here. It also logs the results to a file called “mysqlstopped.txt” if there is a problem.

sub stop_mysql

{

print “**Show Slave Status**\n”;
if (check_email_lock)

{

print “email lock exists, keep email lock, no email will be sent “;

}

else

{

system (“mail -s ‘mysql stopped because replication is broken $host’ nick\@nicktailor.com < /var/log/mysqlstopped.txt”);

system (“touch $email_lock”);
print “email sent, email lock created\n”;

}

 

}

print $timestamp . “\n”;

 

# if MySQL is running then it moves on to the next phase where it mines the information from mysql we need:

  • last io error
  • last sql errno
  • slave io running
  • slave sql running

 

if (check_mysql_socket)

{

system (“/usr/bin/mysql -Bse ‘show slave status\\G’ > $show_slave_status”);
$last_io_errno = `less $show_slave_status | grep Last_IO_Errno | /usr/bin/awk ‘{print \$2}’`;
$last_sql_errno = `less $show_slave_status | grep Last_SQL_Errno | /usr/bin/awk ‘{print \$2}’`;
$slave_io_running = `less $show_slave_status | grep Slave_IO_Running | /usr/bin/awk ‘{print \$2}’`;
$slave_sql_running = `less $show_slave_status | grep Slave_SQL_Running | /usr/bin/awk ‘{print \$2}’`;

 

# trim newline character
chomp($last_io_errno);
chomp($last_sql_errno);
chomp($slave_io_running);
chomp($slave_sql_running);

print “last io error is ” . $last_io_errno . “\n”;
print “last sql errno is ” . $last_sql_errno . “\n”;
print “slave io running is ” . $slave_io_running . “\n”;
print “slave sql running is ” . $slave_sql_running . “\n”;

#So this piece is here because if you stop mysql on the master, the result on the slave from “show slave status” is a very specific one. You will need to test yours to see if the results match the code here, and edit it according.
Basically its saying if last_io_errno is less than 0 and does not equal 2013 there is a problem, If last sql_ernno is less than 0 there is also problem. You get the idea, you can add as many circumstances you need. I found this to be the best combo which covered pretty much most scenarios .

if (($last_io_errno > 0) && ($last_io_errno != 2013))

{

&stop_mysql;

}

elsif ($last_sql_errno > 0)

{

&stop_mysql;

}

# if slave not running = Slave_IO_Running and Slave_SQL_Running are set to No

elsif (($slave_io_running eq “No”) && ($slave_sql_running eq “No”))

{

&stop_mysql;

}

 

else

{

if (check_email_lock)

{

system (“rm $email_lock”);

}

print “replication fine or master’s just down, mysql can keep going, removed lock file\n”;

}

}

 

else

{

print “#2 MySQL not running, will do nothing\n”;

}

 

print “\n#########################\n”;

If the script works you should see the following below if replication is working and no email will be sent to you

#########################

Oct 3 2012 00:13:12
MySQL running, will proceed
last io error is 0
last sql errno is 0
slave io running is Yes
slave sql running is Yes
no email file exists
replication fine or master’s just down, mysql can keep going, removed lock file

#########################

If there is a problem it will look something like:

##########################

Oct 2 2012 02:02:54
MySQL running, will proceed
last io error is 0
last sql errno is 0
slave io running is No
slave sql running is No
**Show Slave Status**
no email file exists
Null message body; hope that’s ok
email sent, email lock created

Hope this helped you 🙂

Cheers

Nick Tailor

 

Leave a Reply

Your email address will not be published. Required fields are marked *

0