How to back up and restore your nosupportlinuxhosting site

11 April 2013 by Lincoln Ramsay

The good folks over at nosupportlinuxhosting run a nice operation. I appreciate that I’m only paying for what I need, rather than for support people to answer stupid questions. What they do make clear is that they don’t do site backups for you. The thing is, though, you can never really trust your hosting company to keep a reliable backup of your site, as I found out at my previous host.

So in the interest of helping you keep a good backup that you can restore in the event of site problems, I’m sharing the scripts I use to keep my site backed up.

Note that nosupportlinuxhosting allows me to use SSH to access the site. I have set up key-based (passwordless) SSH logins so that my backup scripts can run unattended, and I use rsync to get files on and off the host. You can probably use these scripts with any host that provides SSH access.
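
If you haven’t done the key setup before, it’s only a couple of commands. This is just a sketch, assuming your host accepts public-key authentication; myuser and mydomain.com are placeholders for your own details.

# generate a key pair if you don't already have one; leave the
# passphrase empty so the backup can run unattended
ssh-keygen -t rsa

# copy the public key to the host
ssh-copy-id myuser@mydomain.com

# this should now log you in without asking for a password
ssh myuser@mydomain.com 'echo ok'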

Update: I can’t use rsync anymore because nosupportlinuxhosting has removed it. Please see this post for updated scripts.

I drive the backup process from a machine running Ubuntu. You may need to install some tools that aren’t there by default.
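
On Ubuntu, something like the following should cover the tools these scripts rely on; the package names are what I’d expect, but double-check against your release.

sudo apt-get install rsync git msmtp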

backup database

Let’s start with the database. I use this script to back up the databases. It’s actually sent to the host and run there (by the next script).

db_backup.sh

#!/bin/sh
set -e # bail on errors
mkdir -p $HOME/db # ensure this directory exists
cd $HOME/db # this is where we store the DB backups

# a re-usable function - for sites with multiple databases
process()
{
    db="$1"
    user="$2"
    pass="$3"
    # strip the "Dump completed" timestamp line so an unchanged database
    # produces an identical dump from one run to the next
    mysqldump --add-drop-table --skip-extended-insert -h localhost -u "$user" -p"$pass" "$db" | grep -v '^-- Dump completed' >"$db.sql"
}

# you need a line like this for each database
process dbname user pass

exit 0
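
If you’d like to sanity-check the dump before wiring this into the main backup script, you can push it to the host and run it by hand; myuser, mydomain.com and dbname are the same placeholders used throughout.

# make sure db_backup.sh is executable locally so rsync preserves that bit
rsync -av db_backup.sh myuser@mydomain.com:~/db_backup.sh
ssh myuser@mydomain.com '~/db_backup.sh'
ssh myuser@mydomain.com 'head -n 5 ~/db/dbname.sql' # eyeball the output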

backup site

Now for the main backup script.

backup.sh

#!/bin/sh
set -e # bail on errors

user=myuser
host=mydomain.com

echo "Starting run at `date`"

# If we're running from cron we have a lock already
if [ "$HAVE_LOCK" != 1 ]; then
    # Try to avoid having 2 copies of this run at the same
    # time (from the same directory).
    if [ -f "lock" ]; then
        echo "lock exists! This script is already running!"
        exit 1
    fi
    :> lock
    trap "rm lock" 0
fi

# send and run the db_backup script
rsync -av db_backup.sh $user@$host:~/db_backup.sh
ssh $user@$host '~/db_backup.sh'

# copy files from the site
rsync -av --delete --exclude=caches $user@$host:~/.cpanel/ cpanel/
rsync -av --delete $user@$host:~/db/ db/
rsync -av --delete --exclude=cache $user@$host:~/public_html/ mirror

# add copied files to git for incremental local backup
git add cpanel
git add mirror
git add db
git commit -a -m 'nightly backup' || true
git gc

echo Done
exit 0

This script sends the db_backup script from before to your host and then runs it (to produce a fresh backup of the DB). Then it copies files from the site to your local machine using rsync, which efficiently transfers only what has changed. Note the --exclude options, which skip the volatile cache directories (caches under .cpanel and cache under public_html). You can add your own exclusions as required.
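
For example, extra exclusions are just more --exclude options on the relevant rsync line in backup.sh; the patterns here are hypothetical, so adjust them for whatever your site actually generates.

rsync -av --delete \
    --exclude=cache --exclude=tmp --exclude='*.log' \
    $user@$host:~/public_html/ mirror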

Finally, I store everything in a local git repo so that I can track changes to my site over time. You can probably live without that but if your site is hacked, being able to go back to a previous backup might come in handy.
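
The backup script assumes the directory it runs from is already a git repository. A one-time setup might look like this; the sitebackup path matches what the cron script below expects, but the location is up to you.

mkdir -p $HOME/sitebackup
cd $HOME/sitebackup
git init
# drop backup.sh, db_backup.sh and (later) run_from_cron in here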

I should probably explain that HAVE_LOCK/cron bit… it’s a really crappy check designed to stop two copies of the backup script running at the same time. I have been burned testing out some changes to the backup script while a scheduled backup was running 😉
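
If you’d rather not roll your own locking, flock(1) from util-linux is a more robust alternative; a minimal sketch, reusing the same lock file and the HAVE_LOCK variable so backup.sh skips its internal check:

# -n makes a second invocation give up immediately instead of queueing
# behind the first; HAVE_LOCK=1 tells backup.sh not to re-check the file
HAVE_LOCK=1 flock -n lock ./backup.sh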

running from cron

Being able to back up manually is good, but to be really useful this needs to happen automatically.

run_from_cron

#!/bin/sh

# These aren't set right when run from cron
export HOME=/home/user
export PATH=$HOME/bin:/usr/bin:/bin:/usr/sbin:/sbin

EMAIL_ADDRESS="myemail@mydomain.com"
host=mydomain.com

cd $HOME/sitebackup
# Try to avoid having 2 copies of this run at the same
# time (from the same directory).
if [ -f "lock" ]; then
    MSG="lock exists! This script is already running!"
    echo $MSG
    # Send me an email about the failure
    echo "Subject: $host: Run for `date` failed!" >msg
    echo >>msg
    echo "$MSG" >>msg
    msmtp "$EMAIL_ADDRESS" <msg
    rm msg
    exit 1
fi
:> lock
export HAVE_LOCK=1
trap "rm lock" 0

./backup.sh >backup.log 2>&1
status="$?"
if [ "$status" -ne 0 ]; then
    # Send me an email about the failure
    echo "Subject: $host: Run for `date` failed!" >msg
    echo >>msg
    tail -1000 "backup.log" >>msg
    msmtp "$EMAIL_ADDRESS" <msg
    rm msg
fi

exit 0

This fixes up the environment a little and runs the backup script. It sends you an email on failure, using msmtp. If you have some other way of sending emails from your Linux box feel free to substitute that. You could remove the email part but then how will you know if your backups stop working? For reference, here’s my .msmtprc file.

account default
host smtp.gmail.com
port 587
tls on
tls_certcheck off
auth on
user myemail@gmail.com
password mypassword
from myemail@gmail.com
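
It’s worth sending yourself a test message before trusting this from cron. Something like the following should arrive in your inbox; the address is, as usual, a placeholder.

printf 'Subject: msmtp test\n\nhello from the backup box\n' | msmtp myemail@mydomain.com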

To actually get the script to run, you need to tell cron about it. Run crontab -e and use a rule like this.

# m h  dom mon dow   command
  0 0  *   *   *     $HOME/sitebackup/run_from_cron

I run my backup at midnight for no particular reason. The scheduling for cron is very flexible but figuring out the options is left as an exercise for the reader.
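
If you don’t want to wait until midnight to find out whether the script copes with cron’s stripped-down environment, you can roughly simulate it with env; the paths are the same placeholders used in run_from_cron.

cd /home/user/sitebackup
chmod +x run_from_cron backup.sh db_backup.sh # one-off, if you haven't already
env -i HOME=/home/user ./run_from_cron
cat backup.log # see how the run went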

restore database

A backup is only good if you can restore from it. Here’s the script to restore the databases. Like db_backup.sh, it runs on the host (the restore script below pushes it there and runs it).

db_restore.sh

#!/bin/sh
set -e # bail on errors
mkdir -p $HOME/db # ensure this directory exists
cd $HOME/db # this is where we store the DB backups

# a re-usable function - for sites with multiple databases
process()
{
    db="$1"
    user="$2"
    pass="$3"
    mysql -h localhost -u "$user" -p"$pass" "$db" <"$db.sql"
}

# you need a line like this for each database
process dbname user pass

exit 0
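
If only the databases are damaged, you don’t need the full site restore below; pushing the dumps and this script is enough. These are the same rsync and ssh commands the restore script uses, with the usual placeholders.

rsync -av --delete db/ myuser@mydomain.com:~/db/
rsync -av db_restore.sh myuser@mydomain.com:~/db_restore.sh
ssh myuser@mydomain.com '~/db_restore.sh'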

restore site

Here’s the script to restore the whole site.

restore.sh

#!/bin/sh
set -e

user=myuser
host=mydomain.com

echo "Starting run at `date`"

# If we're running from cron we have a lock already
if [ "$HAVE_LOCK" != 1 ]; then
    # Try to avoid having 2 copies of this run at the same
    # time (from the same directory).
    if [ -f "lock" ]; then
        echo "lock exists! This script is already running!"
        exit 1
    fi
    :> lock
    trap "rm lock" 0
fi

ssh $user@$host 'mkdir -p db'
rsync -av cpanel/ $user@$host:~/.cpanel/
rsync -av --delete db/ $user@$host:~/db/
rsync -av --delete mirror/ $user@$host:~/public_html/

rsync -av db_restore.sh $user@$host:~/db_restore.sh
ssh $user@$host '~/db_restore.sh'

echo Done
exit 0
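
Before running a real restore, it can be reassuring to see what would change first. rsync’s -n (--dry-run) option lists the transfers without touching anything, for example:

rsync -avn --delete mirror/ myuser@mydomain.com:~/public_html/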

conclusion

You should have a backup of your website. You should know that you can restore from it if you need to. This is how I back up my site. I know it works because I’ve had to rely on my backups a few times now. Hopefully it can help you too.