DatabaseBackup
From SysadminWiki
| Table of contents |
Introduction
The backup of databases can sometimes be problematic when you don't even know what's inside your databases. And even if you knew, would you know how to make reliable backups ? This page is here propose a backup strategy for MySQL databases use in EGEE.
Firstly : what kind of databases are we facing ? This depends in fact on the kind of node :
node | database kind ----------------+----------------- lfc | MySQL : InnoDB DPM | MySQL : InnoDB MON Box | MySQL : MyISAM lcg RB | MySQL : MyISAM
Proposed Strategy
First, what do we want ?
- We can't backup databases using mysql replication on the same node (this has become possible with the latest mysql), because we could want to backup several nodes of the same kind, and we would end up with data mixed on the backup server.
- We want to have minimum downtimes
- We want regular and safe backups (data integrity is important).
Because of all this, a "relative" easy way to backup is to :
- enable binary logging (http://dev.mysql.com/doc/refman/5.0/en/binary-log.html)
- make regular full dumps of the database
- regularly rotate binary logs
- backup binary logs associated with full dumps
Backups can be made the way you want, but we'll quickly describe how this can be done with rsync, using rsnapshot.
The database backup itself
This is finaly the easy part.
setup mysql
You have to enable binary logging :
- Create a directory where mysql will save its binary logs
- chmod the directory
BACKUP_DIR="/var/lib/mysql/backup" BACKUP_FILE="mysql_binary_log" mkdir -p $BACKUP_DIR chown mysql:mysql $BACKUP_DIR chmod 770 $BACKUP_DIR
- edit the my.cnf file - make sure it contains the following :
#replace the variables with your own values [mysqld] log-bin=$BACKUP_DIR/$BACKUP_FILE
- add a backup user
mysql -u root --password=$MYSQL_ROOT_PWD -e "GRANT SELECT, LOCK TABLES, FILE, RELOAD, REPLICATION CLIENT ON *.* TO 'backup'@'localhost' IDENTIFIED BY '$BACKUP_PASS';"
- restart mysql
setup crontabs
The crontabs will allow you to :
- dump the whole database (*)
- rotate the binary logs
(*) : you can also setup crontabs on a backup server that will dump the database remotely : this will save space on your local server, which is usefull for huge databases...
knowing what kind of tables you have
TO see what kind of tables you have, you need to loop over your databases to see if they contain one or more InnoDB table : this can be done like this in a script :
echo "* searching for InnoDB tables..." INNODB="" DATABASES=`/usr/bin/mysqlshow -u root --password=$MYSQL_ROOT_PWD | egrep "\| ." | sed -e 's/| //' -e 's/ *|//' | tail -n +2` for i in $DATABASES; do if /usr/bin/mysqlshow -u root --password=$MYSQL_ROOT_PWD --status $i '*' | grep -i innodb > /dev/null ; then INNODB="$INNODB $i" fi done if [ "x$INNODB" != "x" ]; then echo " innodb tables detected : $INNODB" fi
the contabs
To make a whole dump of a mysql server that contains InnoDB databases, and to reset binary logs :
/usr/bin/mysqldump --single-transaction -u backup --password='$BACKUP_PASS' --flush-logs --master-data=2 --all-databases -a --delete-master-logs > $BACKUP_DIR/dump.sql
To make a whole dump of a mysql server that contains only MyISAM databases, and to reset binary logs :
/usr/bin/mysqldump --opt --lock-all-tables -u backup --password='$BACKUP_PASS' --flush-logs --master-data=2 --all-databases -a --delete-master-logs > $BACKUP_DIR/dump.sql
To rotate binary logs :
/usr/bin/mysqladmin -u backup --password='$BACKUP_PASS' flush-logs
What you'll want to do is :
- schedule a whole dump, let's say once a week
- schedule regular (daily ?) rotations of the bianry logs - this will allow rsync to only backup what's new without having a downtime at all
note : to reset binary logs directly in mysql, use the command "reset master"
The Remote Backup
This backup can be done as you want, but the main thing to do is to
- backup any whole database backup
- backup any binary log that was created after the whole backup
We'll show how to setup this using rsnapshot(rsync) (http://www.rsnapshot.org/),ssh keys with password, and keychain (http://www.gentoo.org/proj/en/keychain/).
setup ssh keys
First, create ssh keys :
ssh-keygen -t rsa -b 2048 -f /home/root/.ssh/rsync-key<pre>
Then, create an executable script that you will propagate on all machines where you want to backup. This script will be in charge of restricting the use of the key to only allow rsync commands from the backup host. It will be placed in say "/usr/local/bin/" and will be called sshkey_check_command
Its contents must be :
<pre>
#!/bin/sh
case "$SSH_ORIGINAL_COMMAND" in
*\&*)
echo "Rejected"
;;
*\(*)
echo "Rejected"
;;
*\{*)
echo "Rejected"
;;
*\;*)
echo "Rejected"
;;
*\<*)
echo "Rejected"
;;
*\`*)
echo "Rejected"
;;
rsync\ --server\ --sender*)
$SSH_ORIGINAL_COMMAND
;;
*)
echo "Rejected"
;;
esac
This will only allow rsync commands that require the machine to send files, and this will not allow incoming files.
Next, change the scripts rights :
chown root:root /usr/local/bin/sshkey_check_command chmod 750 /usr/local/bin/sshkey_check_command
Finally propagate the public key you just created but restrict its use to the backup host : Edit /root/.ssh/authorized_keys and add this :
from="backup.server.IP.here",command="/usr/local/bin/sshkey_check_command" __public_key_here__
This will instruct SSH to sen dany command receive using this key through the previous script... thus effectively only allowing rsync commands.
keychain
On the backup host, you will want to use your key, but it contains a pasword... a (The) solution to this is to use keychain : it basically starts an ssh-agent and allows to use it in your crontabs or when you login. Please refer to the keychain website (http://www.gentoo.org/proj/en/keychain/) for more information.
To setup keychain :
wget http://gentoo.org/~agriffis/gpg-pubkey-20104eb0.asc wget http://dev.gentoo.org/~agriffis/keychain/keychain-2.6.8-1.noarch.rpm rpm --import gpg-pubkey-20104eb0.asc #(checks) rpm -K keychain-2.6.8-1.noarch.rpm && rpm -Uvh keychain-2.6.8-1.noarch.rpm
Add the following at the *BEGINNING* of the /root/.bashrc (or equivalent)
/usr/bin/keychain ~/.ssh/rsync-key
source ~/.keychain/${HOSTNAME}-sh
For convenience, you can copy/paste the following :
echo '/usr/bin/keychain ~/.ssh/rsync-key' > /root/.bash_profile.tmp
echo 'source ~/.keychain/${HOSTNAME}-sh' >> /root/.bash_profile.tmp
cat /root/.bash_profile >> /root/.bash_profile.tmp
mv /root/.bash_profile.tmp /root/.bash_profile
That's it ! When you first login next time, keychain should ask for you key password, and it should not ask anymore until you reboot the machine - you will just have to remember to login after downtimes...
rsnapshot
Well, this is one amongst other solutions. If you wish to use rsnapshot (http://www.rsnapshot.org/), go on the rsnapshot website (http://www.rsnapshot.org/), download and install the software.
Once installed, you'll have to edit /etc/rsnapshot.conf and make it fit your needs. This done, you'll have to test your rsnapshot configuration, and then to setup crontabs. These crontabs will be able to use the ssh-agent keychain maintains, and will thus allow you to schedule bakups of remote hosts using an ssh key that indeed is *not* passwordless.
We're using at CEA the following rsnapshot options : rsync, we disabled link_dest and enabled sync_first to not loose backups in case a host would be down too long.
Your crontabs should look like this :
# cat /etc/cron.d/rsnapshot.cron SHELL=/bin/bash MAILTO=mail@domain.org PATH=/sbin:/bin:/usr/sbin:/usr/bin # We're rotating directories, no matter what happens. If a node backup fails, the last good/known version # will be propagated from the "rsnapshot root/.sync" directory - we're not erasing latest backup. 31 1 * * * root source /root/.keychain/<your full hostname here>-sh ; /usr/bin/rsnapshot sync ; /usr/bin/rsnapshot daily # Add an entry to remind people to restore the SSH agent on node reboot @reboot echo -n -e "`hostname -s` just rebooted.\nIn order for backups to work, please go on this node and login as root:\nwhen connecting, you will be asked to enter the SSH key password, and ssh agents will be restored."
Notes on how to restore a Database
!! Important !! You have to understand that the binary logging records every insert or update action... and that when you restore a database, you're re-inserting/updating every single record in the database. If you don't disable binary logging (by changing your my.cnf file and restarting the mysql server), you'll end up with a restored database, but also with binary logs as big as the database itself, which is far from ideal.
Other than that, mysql retauration is quite simple : just run this kind of command :
mysql -u root -p < mydump.sql ... mysqlbinlog mylog.bin | mysql -u root -p ...
!! Important !! Do not *think* you backed up your database : once your strategy is implemented, please check your backups can be restored on an offline database
