Revisiting my web host backup solution

04 March 2014 by Lincoln Ramsay

So for some reason, nosupportlinuxhosting removed /usr/bin/rsync from my web host on 27 Feb. I thought about emailing them but… they do advertise how they ignore support questions.

I have a workaround anyway. This is how I backed up my site at a previous hosting company that didn’t provide rsync. It’s not quite as good but it’s a lot better than just downloading everything all the time.

My original scripts (that use rsync) can be found here. If you have rsync on your web host you should use those instructions.

The alternative I’m using now is lftp, specifically its mirror command. This is kind of like rsync in that it only transfers files that have changed, but it can’t do efficient partial transfers, so a small change to a big file means downloading the whole file again. I have a few tweaks to help with that.

lftp notes

There are a few things to note about my use of lftp. I set a few options: a longer timeout, no retries, and exit on failure. I’m also using sftp as the transport (like FTP but over SSH). lftp supports several other transports too if sftp doesn’t work for you.
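
If you’d rather not repeat those settings in every script, lftp will also read them from a startup file. Something like this in ~/.lftprc should do it (the values are the same ones my scripts set below):

# ~/.lftprc (read by lftp when it starts)
set net:timeout 30        # give up on a stalled connection after 30 seconds
set net:max-retries 1     # don't retry failed operations
set cmd:fail-exit true    # abort the rest of the lftp script on the first error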

lftp doesn’t compress files. If you know there are large files that change regularly (like the DB backup) and you want to cut down on transfer times, you can compress them during the backup. My script does this for the DB files but not for anything else. You may also be able to play with SSH options to achieve a similar result (ie. compressing the SSH tunnel traffic).
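
For example, lftp lets you change the program it uses for the sftp transport, so something like this (untested on my host, so treat it as a sketch) should compress the SSH tunnel itself:

# tell lftp's sftp transport to run ssh with compression (-C);
# "ssh -a -x" is the stock connect program, -C is the addition
set sftp:connect-program "ssh -a -x -C"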

backup database

Let’s start with the database. I use this script to back up the databases. It’s actually sent to the host and run there (by the next script).

db_backup.sh

#!/bin/sh
set -e # bail on errors
mkdir -p $HOME/db # ensure this directory exists
cd $HOME/db # this is where we store the DB backups

# a re-usable function - for sites with multiple databases
process()
{
    db="$1"
    user="$2"
    pass="$3"
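    # strip mysqldump's trailing "Dump completed" timestamp line so that an
    # unchanged database produces a byte-identical dump (and so the same md5)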
    mysqldump --add-drop-table --skip-extended-insert -h localhost -u "$user" -p"$pass" "$db" | grep -v '^-- Dump completed' >"$db.new"
    NEWMD5=`md5sum <"$db.new"`
    OLDMD5=`cat "$db.md5" || true`
    if [ "$NEWMD5" = "$OLDMD5" ]; then
        # DB has not changed
        rm "$db.new"
    else
        # Compress the file for faster download
        gzip "$db.new"
        mv "$db.new.gz" "$db.sql.gz"
        echo "$NEWMD5" >"$db.md5"
    fi
}

# you need a line like this for each database
process dbname user pass

exit 0
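
If you want to check the script works before wiring it into the main backup, you can copy it up and run it by hand (using the same placeholder user and host as the scripts below):

# one-off test: upload the script, run it, and see what it produced
scp db_backup.sh myuser@mydomain.com:
ssh myuser@mydomain.com 'sh ~/db_backup.sh && ls -l ~/db'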

backup site

Now for the main backup script.

backup.sh

#!/bin/sh
set -e # bail on errors

user=myuser
host=mydomain.com

echo "Starting run at `date`"

# If we're running from cron we have a lock already
if [ "$HAVE_LOCK" != 1 ]; then
    # Try to avoid having 2 copies of this run at the same
    # time (from the same directory).
    if [ -f "lock" ]; then
        echo "lock exists! This script is already running!"
        exit 1
    fi
    :> lock
    trap "rm lock" 0
fi

# Compress all the .sql files so that our local filenames match the remote ones
(
cd db
for file in *.sql; do
    if [ "$file" = "*.sql" ]; then
        continue
    fi
    gzip $file
done
)

# send and run the db_backup script
lftp -c '
    set net:timeout 30
    set net:max-retries 1
    set cmd:fail-exit true
    open sftp://'$host'
    put db_backup.sh'
ssh $user@$host '~/db_backup.sh'

# copy files from the site
lftp -c '
    set net:timeout 30
    set net:max-retries 1
    set cmd:fail-exit true
    open sftp://'$host'
    lcd cpanel
    cd .cpanel
    mirror -ev -x caches
    lcd ../db
    cd ../db
    mirror -ev
    lcd ../mirror
    cd ../public_html
    mirror -ev -x cache'

# Uncompress all the .sql.gz files because you can't diff .gz files
(
cd db
for file in *.sql.gz; do
    if [ "$file" = "*.sql.gz" ]; then
        continue
    fi
    gzip -d $file
done
)

# add copied files to git for incremental local backup
git add cpanel
git add mirror
git add db
git commit -a -m 'nightly backup' || true
git gc

echo Done
exit 0

This script sends the db_backup script from before to your host and runs it there (to produce a backup of the DB). Then it copies files from the site to your local machine. Note that I use -x to skip volatile cache directories (caches under .cpanel and cache under public_html). You can add your own exclusions as required.
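
The -x option takes a regular expression and can be repeated, so a mirror line that also skipped, say, a tmp directory and error_log files (hypothetical names) would look like this:

mirror -ev -x cache -x ^tmp/ -x error_log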

Finally, I store everything in a local git repo so that I can track changes to my site over time. You can probably live without that but if your site is hacked, being able to go back to a previous backup might come in handy.
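
Once the repo has a few nightly commits in it, ordinary git commands let you see what changed and pull back older copies of files (the path here is just an example):

git log --oneline                          # list the nightly backup commits
git diff HEAD~1 -- mirror                  # what changed in the most recent backup
git checkout HEAD~7 -- mirror/index.php    # restore a file as it was seven backups ago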

I’m not going to copy the cron stuff here; go see my older post for that.
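
If you just want something to get started with, a crontab entry along these lines should work (this is only a sketch, not the wrapper from the older post, and $HOME/backup is just a stand-in for wherever the scripts live: flock holds the lock, and HAVE_LOCK=1 tells backup.sh not to take its own):

# run the backup at 3:30 every night, from the directory holding backup.sh
# cron.lock is flock's lock file and is separate from the script's own "lock"
30 3 * * * cd $HOME/backup && flock -n cron.lock -c 'HAVE_LOCK=1 ./backup.sh' >>backup.log 2>&1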

restore database

A backup is only good if you can recover from it. Here’s the script to restore the database. Like db_backup.sh, it’s sent to the host and run there (by the restore script below).

db_restore.sh

#!/bin/sh
set -e # bail on errors
mkdir -p $HOME/db # ensure this directory exists
cd $HOME/db # this is where we store the DB backups

# a re-usable function - for sites with multiple databases
process()
{
    db="$1"
    user="$2"
    pass="$3"
    gzip -dc "$db.sql.gz" | mysql -h localhost -u "$user" -p"$pass" "$db"
}

# you need a line like this for each database
process dbname user pass

exit 0

restore site

Here’s the script to restore the whole site.

restore.sh

#!/bin/sh
set -e

user=myuser
host=mydomain.com

echo "Starting run at `date`"

# If we're running from cron we have a lock already
if [ "$HAVE_LOCK" != 1 ]; then
    # Try to avoid having 2 copies of this run at the same
    # time (from the same directory).
    if [ -f "lock" ]; then
        echo "lock exists! This script is already running!"
        exit 1
    fi
    :> lock
    trap "rm lock" 0
fi

# Compress the local .sql files so the remote db directory gets the
# .sql.gz files that db_restore.sh expects (backup.sh decompresses them
# after each run, so at rest they are stored locally as .sql)
(
cd db
for file in *.sql; do
    if [ "$file" = "*.sql" ]; then
        continue
    fi
    gzip $file
done
)

ssh $user@$host 'mkdir -p db'
lftp -c '
    set net:timeout 30
    set net:max-retries 1
    set cmd:fail-exit true
    open sftp://'$host'
    put db_restore.sh
    lcd cpanel
    cd .cpanel
    mirror -Rv
    lcd ../db
    cd ../db
    mirror -Rev
    lcd ../mirror
    cd ../public_html
    mirror -Rev'
ssh $user@$host '~/db_restore.sh'

echo Done
exit 0
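
Because -e deletes files on the receiving side, it pays to preview a restore before letting it loose. Newer versions of lftp have a --dry-run flag for mirror, so something like this should show what would be uploaded and deleted without actually touching the site:

lftp -c '
    set cmd:fail-exit true
    open sftp://mydomain.com
    lcd mirror
    cd public_html
    mirror -Rev --dry-run'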

conclusion

You should have a backup of your website. You should know that you can restore from your backup if you need to. This is how I back up my site. I know it works because I’ve had to rely on my backups a few times now. Hopefully it can help you too.