Introduction, and my philosophy on backups:
My philosophy for general file storage (or at least what works for me) is to have a centralized server, where I store all files that have any worth in keeping, and to have regular automated backups. I never store any files on my computers, unless I am actively working on them (and even then, I usually have a cron job rsync the files to the server every few minutes). Any one of my computers can crash and burn, and it will be no big deal (in fact one of my laptops completely died two days ago... No worry!). Even if my server dies I won't have much downtime, as the whole setup is rather simple! I will describe my method for keeping backups in detail below, and hopefully I can rub off some ideas on you. I assume the reader knows enough that they can figure out what the scripts do on their own, but I am willing to assist anyone if they need help getting one of the scripts to work for their needs (just don't expect me to write a new script for you).
My file server is simply a Linux box running NFS and Samba, as well as password-protected HTTPS so I can easily grab any files I need from remote locations so long as there is internet connectivity. My backups are stored on an external USB drive, while my primary files are on a partition on my server's hard drive (a small fraction of that drive holds the operating system; I decided to have only one drive in the server to conserve electricity). Every night I have a cron job run a script (backup_files_snapshots.sh) which uses rsync to make a snapshot backup of my files onto the external backup drive (and I'll store a month's worth of nightly snapshots). In addition, before running the backup I run a script which fixes all file permissions, and a script (delete_metadata.sh) which removes any junk files that may have been left behind on the drive (such as backup files from vi, and metadata my Mac likes to spew all over the place). The backup script also keeps a log, which is important to check every so often to make sure everything is actually working.
Now simply having a drive and its backup is not enough in my opinion. If one drive fails, then you are left with only one good copy of everything. If there is a fire, then both drives are gone. So, every few months I'll swap the backup drive with another, and I keep the unused drive at a remote location. Now my server and the entire building it is in can be destroyed with everything in it, but at least my files will be safe! In addition to the backup I keep on my home server, I also back up all my files to my computer at work in a separate script (backup_files_remotely.sh). This backup also runs nightly, and executes after the local backup has completed. The script requires the use of an RSA key pair, with the private key on my home server, and the public key in $HOME/.ssh/authorized_keys on my work server.
Lastly, I run a script every two weeks (backup_diff_snapshots.sh) which compares byte for byte all the files in the latest completed snapshot backup with the primary drive, to ensure the backup copies are identical.
Using the scripts I describe below, I have a fully automated backup system. The only thing I need to do is check the logs once in a while to make sure nothing broke.
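As a rough sketch of how the schedule described above might look in a crontab: the times, the $HOME/bin location, and the way the nightly jobs are chained are all assumptions for illustration, and the permission-fixing script (which is not named here) is left out. The snippet only prints the entries; they would be installed by hand with "crontab -e".

```shell
#!/bin/bash
# Hypothetical schedule: the times and the $HOME/bin location are assumptions.
CRON_ENTRIES=$(cat <<'EOF'
# m  h  dom   mon dow  command
# nightly: clean junk files, take the snapshot, then mirror to the remote machine
30   2  *     *   *    $HOME/bin/delete_metadata.sh $HOME/files; $HOME/bin/backup_files_snapshots.sh; $HOME/bin/backup_files_remotely.sh
# twice a month: byte-for-byte comparison of primary files and latest snapshot
0    12 1,15  *   *    $HOME/bin/backup_diff_snapshots.sh
EOF
)
echo "$CRON_ENTRIES"
```

Chaining the nightly commands with ";" on one line is one simple way to make the remote mirror start only after the local snapshot has finished.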
Main snapshot backup script:
Below is the main backup script on my home server, which uses rsync to make a snapshot backup of the primary directory (denoted in the FILES_PATH variable) in the backup directory (BACKUP_FILES_PATH variable). Every time this script runs, it will create a new directory (named with the current date and time in the format "YYYY-MM-DD_hh.mm.ss") in BACKUP_FILES_PATH, and store a backup of FILES_PATH in that new directory. This new directory acts as a snapshot in time of the contents of FILES_PATH. When we run the script again, it creates an additional snapshot of FILES_PATH in BACKUP_FILES_PATH in a directory with the current date and time. Any files which did not change since the last backup are linked to one another with a hard link, so while each snapshot backup directory contains a complete backup, the disk space it takes to add a new snapshot is only the amount of space it takes to store all new and changed files since the last backup. And any snapshot backup directory may be deleted without affecting any of the other snapshot backup directories. Other than deleting old snapshot backup directories, the contents in the snapshot backup directories should never be edited (the backup drive should really be read-only to anything other than the backup script), as this may ruin the snapshot backups for any files edited. If a file in one of the backups is edited, this change will affect all snapshots which hard link to that file (while deleting a file has no effect on other snapshots).
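The hard link behaviour described above can be seen on a toy example. This is only an illustration using ln directly on throwaway temporary directories (the names snap1, snap2 and notes.txt are made up); in the real backups it is rsync that creates the links.

```shell
#!/bin/bash
# Toy demonstration of hard-link behaviour; paths are throwaway temp dirs.
DEMO=$(mktemp -d)
mkdir "$DEMO/snap1" "$DEMO/snap2"
echo "original" > "$DEMO/snap1/notes.txt"
# an unchanged file in a new snapshot is just another name for the same data
ln "$DEMO/snap1/notes.txt" "$DEMO/snap2/notes.txt"

# both names report the same inode number
ls -i "$DEMO/snap1/notes.txt" "$DEMO/snap2/notes.txt"

# editing in place through one name changes what the other name sees
echo "edited" >> "$DEMO/snap2/notes.txt"
cat "$DEMO/snap1/notes.txt"

# deleting one name leaves the other intact
rm "$DEMO/snap2/notes.txt"
cat "$DEMO/snap1/notes.txt"
```

This is why old snapshot directories can be deleted freely, while files inside them should never be edited in place.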
Note that this backup is not a "true" snapshot, in the sense that it takes a finite amount of time for rsync to scan through the drives and copy new/updated files, which allows for the potential that a file may be modified while rsync is running. However, for servers which are lightly used (at least while the script is running), this is generally not a problem. If a true snapshot backup is required, this may be accomplished if the file system of the directory to be backed up is using a logical volume manager (LVM). However, this would require root privileges to create a snapshot volume, while the backup_files_snapshots.sh script as it stands only requires the user running the script have read access to the files directory and read/write access to the backup directory.
The first thing the script does is check for the presence of the file specified in LOCK_FILE. This file prevents the script from running in multiple instances. If the file exists, the script will abort, otherwise it will create that file, run the backup, then delete the file once the backup is completed. If the script refuses to run because this file is present, but no instances of the script exist, it is likely because the script was aborted while running and did not have a chance to clean up after itself (and in this case, the file specified in LOCK_FILE may be deleted manually, to allow the script to run).
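The stale lock file problem can be reduced with a shell trap. The following is a variation on the idea, not the code used in my script: it creates the lock with noclobber, so that the test and the creation are a single step, and removes it whenever the script exits. The lock location is a throwaway path for demonstration.

```shell
#!/bin/bash
# Sketch only: a variation on the lock file check, not the script's own code.
# Demo location; the backup script itself uses $HOME/.backup_files_running.
LOCK_FILE="$(mktemp -d)/.backup_files_running"

acquire_lock() {
  # with noclobber, ">" fails if the file already exists, so the
  # existence test and the creation happen in a single step
  ( set -o noclobber; echo "$$" > "$LOCK_FILE" ) 2>/dev/null
}

if acquire_lock; then
  # remove the lock whenever the script exits, including after an error
  trap 'rm -f "$LOCK_FILE"' EXIT
  echo "lock acquired, backup would run here"
else
  echo "Backup is already running"
  exit 2
fi
```

A power loss or a "kill -9" would still leave the file behind, so deleting it manually remains the fallback.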
This script will then check for the presence of a file named .identity on the root directory of the primary and backup paths (it will not run without these files). This ensures you do not create an empty backup if the primary drive is not mounted (which would cause the script to re-copy all data the next time the primary drive is mounted), and that you do not copy the contents of the primary drive to the root drive of your system (potentially filling it to capacity) when the backup drive is not mounted. These are both problems I've encountered before I finally added this drive checking feature to the script. I also use the .identity file to give a name to the drives/directories these are in, and read these out in the log file to clarify what drives are being used.
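Setting up the .identity files is a one-time step. A minimal sketch, with temporary directories standing in for the real mount points and made-up drive names as the file contents:

```shell
#!/bin/bash
# Sketch: label two directories with .identity files, then test for them.
# Temp dirs stand in for the real mount points ($HOME/files, $HOME/files_backup).
FILES_PATH=$(mktemp -d)
BACKUP_FILES_PATH=$(mktemp -d)

# one-time setup, done while each drive is mounted; the text is free-form
echo "primary files drive" > "$FILES_PATH/.identity"
echo "backup drive A" > "$BACKUP_FILES_PATH/.identity"

# returns 0 only if both directories carry an .identity file
drives_present() {
  [ -e "$FILES_PATH/.identity" ] && [ -e "$BACKUP_FILES_PATH/.identity" ]
}

if drives_present; then
  echo "Backing up from $(cat "$FILES_PATH/.identity") to $(cat "$BACKUP_FILES_PATH/.identity")"
else
  echo "A drive is not mounted, refusing to run"
fi
```

Because the file lives on the drive itself, an unmounted drive leaves behind an empty mount point with no .identity in it, and the check fails.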
The next thing the script does is create the new snapshot backup directory in BACKUP_FILES_PATH in the format "YYYY-MM-DD_hh.mm.ss", but prefixed with "tmp.". The backup will store all the files in this directory, and once the backup completes it will remove the "tmp." prefix from the directory. If the backup ends abruptly, the next time the backup runs it will see the directory prefixed by "tmp." and resume the backup within that directory, moving it to a directory with the current date and time once completed. Any files which have not changed in this new directory since the last completed backup are hard linked to the files in the last completed backup.
Lastly, the script removes any old snapshot backup directories, if more backups exist than are specified in SNAPSHOTS_KEEP. If SNAPSHOTS_KEEP is set to 0, then the deletion of old backups is disabled.
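The pruning rule can be tried out safely on a directory of empty stand-in snapshots. This sketch follows the same idea as the script, with one assumption of my own added: it only counts directories named in the snapshot format, which is stricter than just skipping lost+found.

```shell
#!/bin/bash
# Sketch of the pruning rule on a throwaway directory with empty "snapshots".
BACKUP_FILES_PATH=$(mktemp -d)
SNAPSHOTS_KEEP=3
for day in 01 02 03 04 05; do
  mkdir "$BACKUP_FILES_PATH/2011-03-${day}_02.30.00"
done
mkdir "$BACKUP_FILES_PATH/lost+found"

# only names of the form YYYY-MM-DD_hh.mm.ss count as snapshots
SNAPSHOTS=$(ls "$BACKUP_FILES_PATH" | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}_[0-9]{2}\.[0-9]{2}\.[0-9]{2}$' | sort)
NUMBER_SHOTS=$(echo "$SNAPSHOTS" | grep -c .)

if [ "$SNAPSHOTS_KEEP" -gt 0 ] && [ "$NUMBER_SHOTS" -gt "$SNAPSHOTS_KEEP" ]; then
  NUMBER_DEL=$((NUMBER_SHOTS - SNAPSHOTS_KEEP))
  # the names sort chronologically, so the first entries are the oldest
  for OLD_DIR in $(echo "$SNAPSHOTS" | head -n "$NUMBER_DEL"); do
    echo "Removing $BACKUP_FILES_PATH/$OLD_DIR"
    rm -rf "$BACKUP_FILES_PATH/$OLD_DIR"
  done
fi
ls "$BACKUP_FILES_PATH"
```

With five snapshots and SNAPSHOTS_KEEP set to 3, the two oldest directories are removed and lost+found is left alone.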
backup_files_snapshots.sh
#!/bin/bash
# Nick Masluk
# last updated 2011-03-29

# Note: FILES_PATH and BACKUP_FILES_PATH must both contain a file with filename .identity for this script to run
FILES_PATH=$HOME"/files/"
BACKUP_FILES_PATH=$HOME"/files_backup/"
SNAPSHOTS_KEEP="31" # set to 0 to disable deleting old snapshots
EXCLUDE="lost+found .identity"
LOG_FILE=$HOME"/logs/backup_files_log_"`date +%F`".txt"
KEEP_LOG="1" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
LOCK_FILE=$HOME"/.backup_files_running"

# Check if a backup is already running.  If not, create file $LOCK_FILE to
# indicate to other instances of this script that a backup is running.
if [ ! -e $LOCK_FILE ]; then
  touch $LOCK_FILE
else
  echo "Backup is already running"
  # exit with error code 2 if a backup is already running
  exit 2
fi

# check that .identity exists on the root of the files and backup directories
check_identity() {
  if [ ! -e $FILES_PATH/.identity ] || [ ! -e $BACKUP_FILES_PATH/.identity ]; then
    date +%F\ %T\ %A | $LOG_CMD
    echo "" | $LOG_CMD
    if [ ! -e $FILES_PATH/.identity ]; then
      echo $FILES_PATH "is missing .identity file" | $LOG_CMD
    fi
    if [ ! -e $BACKUP_FILES_PATH/.identity ]; then
      echo $BACKUP_FILES_PATH "is missing .identity file" | $LOG_CMD
    fi
    # remove $LOCK_FILE to indicate the script is done running
    rm -f $LOCK_FILE
    echo "" | $LOG_CMD
    echo "--------------------------------------------------------------------------------" | $LOG_CMD
    exit 3
  fi
}

set_dirs() {
  date +%F\ %T\ %A | $LOG_CMD
  echo "" | $LOG_CMD
  # directory where last complete backup is located
  LAST_BACKUP_DIR=`ls $BACKUP_FILES_PATH | sort | grep ^"....-..-.._..\...\..."$ | tail -1`
  # [[ ]] is used here because [ -n $LAST_BACKUP_DIR ] is always true when
  # the unquoted variable expands to nothing
  if [[ $LAST_BACKUP_DIR != "" ]]; then
    echo "Last completed backup is located in" $LAST_BACKUP_DIR | $LOG_CMD
  else
    echo "No previous backup found, all files will be newly copied" | $LOG_CMD
  fi
  # directory where this finalized backup will reside
  CURRENT_BACKUP_DIR=`date +%F_%H.%M.%S`
  echo "This backup will reside in" $CURRENT_BACKUP_DIR "if completed" | $LOG_CMD
  # check if incomplete backups exist
  if [[ `ls $BACKUP_FILES_PATH | sort | grep ^"tmp\.....-..-.._..\...\..."$ | wc -l` -eq 0 ]]; then
    # temporary location for backup while backup is running
    TEMP_BACKUP_DIR=tmp.`date +%F_%H.%M.%S`
    mkdir $BACKUP_FILES_PATH/$TEMP_BACKUP_DIR
  else
    if [[ `ls $BACKUP_FILES_PATH | sort | grep ^"tmp\.....-..-.._..\...\..."$ | wc -l` -gt 1 ]]; then
      echo "More than one partial backup exists, cancelling backup" | $LOG_CMD
      # if more than one partial backup exists, terminate backup
      # remove $LOCK_FILE to indicate the script is done running
      rm -f $LOCK_FILE
      echo "" | $LOG_CMD
      echo "--------------------------------------------------------------------------------" | $LOG_CMD
      exit 4
    else
      # set temporary backup location to that of the partially completed backup
      # (same anchored pattern as above, so unrelated names containing "tmp" are ignored)
      TEMP_BACKUP_DIR=`ls $BACKUP_FILES_PATH | sort | grep ^"tmp\.....-..-.._..\...\..."$`
      echo "A partial backup exists in" $TEMP_BACKUP_DIR "and will resume now" | $LOG_CMD
    fi
  fi
}

run_backup() {
  # generate a list of items to ignore
  EXCLUDED=""
  for i in $EXCLUDE; do
    EXCLUDED="$EXCLUDED --exclude=$i";
  done
  ID_FILES=`cat $FILES_PATH/.identity`
  ID_BACKUP_FILES=`cat $BACKUP_FILES_PATH/.identity`
  echo "" | $LOG_CMD
  echo "Starting rsync backup, from" $ID_FILES "to" $ID_BACKUP_FILES | $LOG_CMD
  rsync $EXCLUDED --delete-after -av --link-dest=$BACKUP_FILES_PATH/$LAST_BACKUP_DIR/ $FILES_PATH $BACKUP_FILES_PATH/$TEMP_BACKUP_DIR/ 2>&1 | $LOG_CMD
  # store error code from rsync's exit
  ERROR=${PIPESTATUS[0]}
  # if rsync succeeds, move the temporary backup location to its final location
  if [ $ERROR == 0 ]; then
    mv $BACKUP_FILES_PATH/$TEMP_BACKUP_DIR $BACKUP_FILES_PATH/$CURRENT_BACKUP_DIR
  fi
}

find_removed() {
  # skip the comparison if there is no previous backup, or if this backup did not complete
  if [[ $LAST_BACKUP_DIR != "" ]] && [ -d $BACKUP_FILES_PATH/$CURRENT_BACKUP_DIR ]; then
    echo "" | $LOG_CMD
    echo "Files removed since last backup:" | $LOG_CMD
    # run a dry rsync run between current and previous backups to determine which files were removed
    rsync $EXCLUDED --delete-before -avn $BACKUP_FILES_PATH/$CURRENT_BACKUP_DIR/ $BACKUP_FILES_PATH/$LAST_BACKUP_DIR/ | grep ^"deleting " | cut --complement -b 1-9 | $LOG_CMD
  fi
}

del_old_snapshots() {
  if [ $SNAPSHOTS_KEEP -gt 0 ]; then # disable deleting snapshots if SNAPSHOTS_KEEP=0
    NUMBER_SHOTS=`ls $BACKUP_FILES_PATH | sort | grep -v lost+found | wc -l`
    NUMBER_DEL=0
    if [ $NUMBER_SHOTS -gt $SNAPSHOTS_KEEP ]; then
      NUMBER_DEL=$(($NUMBER_SHOTS - $SNAPSHOTS_KEEP))
      echo ""| $LOG_CMD
      echo "Removing" $NUMBER_DEL "old backups" | $LOG_CMD
    fi
    for OLD_DIR in $(ls $BACKUP_FILES_PATH | sort | grep -v lost+found | head -$NUMBER_DEL) ; do
      echo "Removing" $BACKUP_FILES_PATH/$OLD_DIR | $LOG_CMD
      rm -rf $BACKUP_FILES_PATH/$OLD_DIR
    done
  fi
}

if [ $KEEP_LOG -eq 1 ] || [ $KEEP_LOG -eq 2 ]; then
  # run backup logged
  if [ $KEEP_LOG -eq 2 ] && [ -e $LOG_FILE ]; then
    # if log mode is set to "2", delete old log file before starting (if it exists)
    rm -f $LOG_FILE
  fi
  # set log command to split stdout into a log file and stdout
  LOG_CMD="tee -a $LOG_FILE"
else
  # set log command to only print to stdout
  LOG_CMD="cat"
fi

# check that .identity files exist in files and backup directories
check_identity
# set directory locations of backup
set_dirs
# run rsync backup
run_backup
# find files which have been removed since last backup
find_removed
# remove old snapshots
del_old_snapshots

# remove $LOCK_FILE to indicate the script is done running
rm -f $LOCK_FILE

echo "" | $LOG_CMD
date +%F\ %T\ %A | $LOG_CMD
echo "--------------------------------------------------------------------------------" | $LOG_CMD

# exit with the error code left by rsync
exit $ERROR
Mirroring files remotely:
Below is a script which will mirror files to (or from, if SOURCE_DIR and BACKUP_DIR are swapped) a remote location. I use this for backing up my files on my server to a remote computer (at work). The script requires the use of an RSA key pair to run on its own, with the private key located on the local computer which initiates the script (in my case, my home server), and the public key located in $HOME/.ssh/authorized_keys of the remote computer (in my case, my computer at work). This script does not check for .identity files in the root directories like the backup_files_snapshots.sh script does. I keep this script located on the drive I am backing up, so it is impossible to have this script run and clear out the remote drive when the primary drive is not mounted.
backup_files_remotely.sh
#!/bin/bash
# Nick Masluk
# last updated 2011-03-29

# To backup multiple source dirs into the backup dir, separate dirs with a space and do not end dir paths with a slash
# To copy the contents of the source dir into the backup dir, end with a slash
SSH_KEY=$HOME"/.ssh/rsa_key"
SOURCE_DIR=$HOME"/files/"
BACKUP_DIR="nick@remote_address:files/"
EXCLUDE="lost+found .identity"
LOG_FILE=$HOME"/logs/backup_files_remotely_log.txt"
KEEP_LOG="2" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
LOCK_FILE=$HOME"/.backup_files_remotely_running"

# Check if a backup is already running.  If not, create file $LOCK_FILE to
# indicate to other instances of this script that a backup is running.
if [ ! -e $LOCK_FILE ]; then
  touch $LOCK_FILE
else
  echo "Backup is already running"
  # exit with error code 2 if a backup is already running
  exit 2
fi

if [ $KEEP_LOG -eq 1 ] || [ $KEEP_LOG -eq 2 ]; then
  # run backup logged
  if [ $KEEP_LOG -eq 2 ] && [ -e $LOG_FILE ]; then
    # if log mode is set to "2", delete old log file before starting (if it exists)
    rm -f $LOG_FILE
  fi
  # set log command to split stdout into a log file and stdout
  LOG_CMD="tee -a $LOG_FILE"
else
  # set log command to only print to stdout
  LOG_CMD="cat"
fi

date +%F\ %T\ %A | $LOG_CMD
echo "" | $LOG_CMD

# generate a list of items to ignore
EXCLUDED=""
for i in $EXCLUDE; do
  EXCLUDED="$EXCLUDED --exclude=$i";
done

# any arguments given to this script ($@) are passed on to rsync
rsync -e "ssh -i $SSH_KEY" $EXCLUDED --delete-after -av $@ $SOURCE_DIR $BACKUP_DIR 2>&1 | $LOG_CMD
# store error code from rsync's exit
ERROR=${PIPESTATUS[0]}

date +%F\ %T\ %A | $LOG_CMD
echo "" | $LOG_CMD
echo "--------------------------------------------------------------------------------" | $LOG_CMD

# remove $LOCK_FILE to indicate that backup is no longer running
rm -f $LOCK_FILE

# exit with the error code left by rsync
exit $ERROR
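Creating the key pair that the script above relies on might look roughly like this. It is a sketch under a few assumptions: the key has an empty passphrase so that cron can use it unattended, ssh-keygen is available, and the key is written to a throwaway location unless KEY is set (the script expects $HOME/.ssh/rsa_key).

```shell
#!/bin/bash
# Sketch: create a passphrase-less RSA key pair for unattended rsync over ssh.
# KEY defaults to a throwaway location; the script above expects
# $HOME/.ssh/rsa_key, so a real setup would set KEY to that path.
KEY="${KEY:-$(mktemp -d)/rsa_key}"

if command -v ssh-keygen >/dev/null 2>&1; then
  # -N "" sets an empty passphrase so cron can use the key without prompting
  ssh-keygen -q -t rsa -N "" -f "$KEY"
  echo "private key: $KEY"
  echo "public key:  $KEY.pub"
else
  echo "ssh-keygen not found"
fi

# The public key then has to be appended to $HOME/.ssh/authorized_keys on the
# remote machine, for example (nick@remote_address is the placeholder used above):
#   ssh-copy-id -i "$KEY.pub" nick@remote_address
```

A key without a passphrase is only as safe as the file holding it, so the private key should stay readable by its owner alone.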
Backing up laptops and computers:
To automatically back up the contents of a laptop/workstation to a server (or remote computer), the above script backup_files_remotely.sh may be used, with the private key on the laptop/workstation, and the public key in the server's $HOME/.ssh/authorized_keys. I keep a backup directory on the desktop of my laptops where I store any files I actively work with, and have a cron job synchronize the directory to the server every 15 minutes.
Checking for errors between primary and latest snapshot backup:
While rsync generally works well for keeping everything backed up, by default it decides which files need copying by comparing only their modification times and file sizes. After transferring a file, rsync verifies it with a 128-bit whole-file checksum (MD4 in older versions; rsync 3.0 and later use MD5 or another negotiated algorithm) to make sure it copied correctly. I have noticed instances where this seemed to fail, but it could be that the data in one of the files mutated sometime between the rsync transfer and the file comparison which I performed later. As a rough estimate, in my own experience I'll have about one file differ for every 500GB copied. If any program modifies the contents of a file, but keeps the modified time the same, and the resulting file size happens to stay fixed, rsync will not catch any difference between the original and updated files. To get around this problem, we can force rsync to do file comparisons based on file size and checksums with the -c switch. With this option, a checksum of all files on the sending side will be generated, and checksums will be generated on the receiving side only for files whose file size is the same as that on the sending side. This results in a much slower backup, but at least it will catch some of the holes in the typical but much faster method of archiving.
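The blind spot is easy to reproduce by hand. This toy example does not call rsync at all; it builds two files with the same size and the same modification time but different contents, and shows that a size-and-time comparison (written here as a small helper of my own) calls them identical while a content comparison does not.

```shell
#!/bin/bash
# Toy demonstration: two files with equal size and equal modification time
# but different contents, the case a size-and-time comparison cannot see.
DEMO=$(mktemp -d)
printf 'AAAA' > "$DEMO/original"
printf 'AAAB' > "$DEMO/copy"
touch -r "$DEMO/original" "$DEMO/copy"   # copy the timestamp across

# hypothetical helper: true if both files have the same size and mtime
same_size_and_mtime() {
  [ "$(wc -c < "$1")" -eq "$(wc -c < "$2")" ] &&
    [ ! "$1" -nt "$2" ] && [ ! "$1" -ot "$2" ]
}

if same_size_and_mtime "$DEMO/original" "$DEMO/copy"; then
  echo "quick check: files look identical"
fi
if ! cmp -s "$DEMO/original" "$DEMO/copy"; then
  echo "content check: files differ"
fi
```

Running rsync with -c replaces the quick check with a checksum comparison, which is what lets it notice a pair of files like this.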
Another option is to directly compare the files with diff. Twice a month I have a cron job run the script below to compare the primary directory and latest snapshot backup with diff. If a file differs it will let me know, and I will compare the file with another backup (like my backup at work, using md5sum to compare checksums if the files are large), and recopy the file to the drive with the altered copy (and re-check that file with diff to make sure it took).
backup_diff_snapshots.sh
#!/bin/bash
# Nick Masluk
# last updated 2011-03-29

FILES_PATH=$HOME"/files/"
BACKUP_FILES_PATH=$HOME"/files_backup/"
LOG_FILE=$HOME"/logs/backup_diff.txt"
KEEP_LOG="1" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session

if [ $KEEP_LOG -eq 1 ] || [ $KEEP_LOG -eq 2 ]; then
  if [ $KEEP_LOG -eq 2 ] && [ -e $LOG_FILE ]; then
    # if log mode is set to "2", delete old log file before starting (if it exists)
    rm -f $LOG_FILE
  fi
  # set log command to split stdout into a log file and stdout
  LOG_CMD="tee -a $LOG_FILE"
else
  # set log command to only print to stdout
  LOG_CMD="cat"
fi

date +%F\ %T\ %A | $LOG_CMD

# directory where last complete backup is located
LAST_BACKUP_DIR=`ls $BACKUP_FILES_PATH | sort | grep ^"....-..-.._..\...\..."$ | tail -1`
echo "Last completed backup is located in" $LAST_BACKUP_DIR | $LOG_CMD

echo 'Starting backup diff between' `cat $FILES_PATH/.identity` 'and' `cat $BACKUP_FILES_PATH/.identity` | $LOG_CMD
diff -rq $FILES_PATH $BACKUP_FILES_PATH/$LAST_BACKUP_DIR 2>&1 | $LOG_CMD
# record error code left by diff
ERROR=${PIPESTATUS[0]}

date +%F\ %T\ %A | $LOG_CMD
echo "--------------------------------------------------------------------------------" | $LOG_CMD

# exit with error code left by diff
exit $ERROR
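To get a feel for what the diff step reports, here is a toy run on two tiny directory trees with one deliberately altered file. The directory and file names are made up for the demonstration.

```shell
#!/bin/bash
# Toy demonstration of "diff -rq" on two small directory trees.
DEMO=$(mktemp -d)
mkdir -p "$DEMO/files/docs" "$DEMO/snapshot/docs"
echo "same" > "$DEMO/files/docs/a.txt"
echo "same" > "$DEMO/snapshot/docs/a.txt"
echo "good data" > "$DEMO/files/docs/b.txt"
echo "bad  data" > "$DEMO/snapshot/docs/b.txt"   # simulated silent corruption

# -r recurses into subdirectories, -q reports only which files differ
STATUS=0
diff -rq "$DEMO/files" "$DEMO/snapshot" || STATUS=$?
# diff exits 0 when everything matches, 1 when something differs, 2 on trouble
echo "diff exit status: $STATUS"
```

Since the script above exits with diff's status, a non-zero exit from the cron job is a quick signal that the log is worth reading.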
Junk file cleanup:
Before I back up my files, I like to first clean out any metadata junk files left on the primary files directory. The script below searches the directory passed to the script on the command line for certain metadata files, and then removes them. These include "Thumbs.db" (from Windows when thumbnail caching is not disabled), ".DS_Store" and files prefixed with "._" (metadata files Mac OS X loves to spew all over anything it touches), a directory named ".TemporaryItems" in the path passed to the script (Mac OS X will leave this in the root directory of a drive after deleting files), and files suffixed with "~" (backup files from VIM and a few other text editors). This script may be used to clean the directory /home/nick/files with the command "./delete_metadata.sh /home/nick/files". If no argument is passed to the script, it will search the working directory.
delete_metadata.sh
#!/bin/sh
# directory to clean; defaults to the working directory if no argument is given
# (without this default, the .TemporaryItems check below would look in /)
DIR="${1:-.}"

echo "Searching for Thumbs.db Windows thumbnail metadata"
find "$DIR" -name 'Thumbs.db' -exec rm -vf {} \;

echo "Searching for .DS_Store Macintosh metadata"
find "$DIR" -name '.DS_Store' -exec rm -vf {} \;

echo "Searching for ._* Macintosh metadata"
find "$DIR" -name '._*' -exec rm -vf {} \;

echo "Searching for *~ backups"
find "$DIR" -name '*~' -exec rm -vf {} \;

echo "Checking for .TemporaryItems/ in" "$DIR"
if [ -e "$DIR/.TemporaryItems/" ]; then
  rm -rfv "$DIR/.TemporaryItems/"
fi
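Before letting delete_metadata.sh loose on a directory for the first time, it can be reassuring to see what it would match. The following sketch only lists candidates and deletes nothing; it folds the four name patterns into a single find pass, which is a variation of my own on the script above, and uses a temporary directory with made-up file names.

```shell
#!/bin/bash
# Sketch: list the junk files delete_metadata.sh targets, in one find pass,
# without deleting anything. A temp dir stands in for the files directory.
TARGET=$(mktemp -d)
mkdir "$TARGET/photos"
touch "$TARGET/photos/Thumbs.db" "$TARGET/.DS_Store" "$TARGET/._report.doc" \
      "$TARGET/notes.txt~" "$TARGET/notes.txt"

list_junk() {
  # -type f restricts matches to regular files; the patterns are quoted so
  # the shell passes them to find unexpanded
  find "$1" -type f \( -name 'Thumbs.db' -o -name '.DS_Store' \
       -o -name '._*' -o -name '*~' \) -print
}

list_junk "$TARGET"
```

Here four of the five files are listed, and the ordinary notes.txt is left out.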
Restoring snapshot backups:
If it is found that a file or group of files has gone missing or become corrupted, the backed-up copies can be immediately accessed in one of the snapshot backup directories on the backup drive. However, in the case the primary directory has completely failed, been corrupted or been deleted, or a new and empty primary drive is installed, the following script will restore the most recent completed backup in BACKUP_FILES_PATH to FILES_PATH. Note that the script still checks for .identity files, so a new primary drive needs its own .identity file before the restore will run.
backup_files_restore_snapshot.sh
#!/bin/bash
# Nick Masluk
# last updated 2011-03-29
#
# Restores the latest backup in $BACKUP_FILES_PATH to $FILES_PATH

# Note: FILES_PATH and BACKUP_FILES_PATH must both contain a file with filename .identity for this script to run
FILES_PATH=$HOME"/files/"
BACKUP_FILES_PATH=$HOME"/files_backup/"
EXCLUDE="lost+found .identity"
LOG_FILE=$HOME"/logs/backup_files_log.txt"
KEEP_LOG="1" # set to 0 to disable, 1 to keep a running log, 2 to delete the log and record only current session
LOCK_FILE=$HOME"/.backup_files_running"

# Check if a backup is already running.  If not, create file $LOCK_FILE to
# indicate to other instances of this script that a backup is running.
if [ ! -e $LOCK_FILE ]; then
  touch $LOCK_FILE
else
  echo "Backup is already running"
  # exit with error code 2 if a backup is already running
  exit 2
fi

# check that .identity exists on the root of the files and backup directories
check_identity() {
  if [ ! -e $FILES_PATH/.identity ] || [ ! -e $BACKUP_FILES_PATH/.identity ]; then
    date +%F\ %T\ %A | $LOG_CMD
    echo "" | $LOG_CMD
    if [ ! -e $FILES_PATH/.identity ]; then
      echo $FILES_PATH "is missing .identity file" | $LOG_CMD
    fi
    if [ ! -e $BACKUP_FILES_PATH/.identity ]; then
      echo $BACKUP_FILES_PATH "is missing .identity file" | $LOG_CMD
    fi
    # remove $LOCK_FILE to indicate the script is done running
    rm -f $LOCK_FILE
    echo "" | $LOG_CMD
    echo "--------------------------------------------------------------------------------" | $LOG_CMD
    exit 3
  fi
}

set_dirs() {
  date +%F\ %T\ %A | $LOG_CMD
  echo "" | $LOG_CMD
  # directory where last complete backup is located
  LAST_BACKUP_DIR=`ls $BACKUP_FILES_PATH | sort | grep ^"....-..-.._..\...\..."$ | tail -1`
  # for some reason -n never returned false when the string was ""
  # if [ -n $LAST_BACKUP_DIR ]; then
  if [[ $LAST_BACKUP_DIR != "" ]]; then
    echo "Last completed backup is located in" $LAST_BACKUP_DIR | $LOG_CMD
  else
    echo "No backups found!" | $LOG_CMD
    # remove $LOCK_FILE to indicate the script is done running
    rm -f $LOCK_FILE
    echo "" | $LOG_CMD
    echo "--------------------------------------------------------------------------------" | $LOG_CMD
    exit 4
  fi
}

# run rsync backup
run_backup() {
  # generate a list of items to ignore
  EXCLUDED=""
  for i in $EXCLUDE; do
    EXCLUDED="$EXCLUDED --exclude=$i";
  done
  ID_FILES=`cat $FILES_PATH/.identity`
  ID_BACKUP_FILES=`cat $BACKUP_FILES_PATH/.identity`
  echo "" | $LOG_CMD
  echo "Starting rsync backup restore, from" $ID_BACKUP_FILES "to" $ID_FILES | $LOG_CMD
  rsync $EXCLUDED --delete-after -av $BACKUP_FILES_PATH/$LAST_BACKUP_DIR/ $FILES_PATH 2>&1 | $LOG_CMD
  # store error code from rsync's exit
  ERROR=${PIPESTATUS[0]}
}

if [ $KEEP_LOG -eq 1 ] || [ $KEEP_LOG -eq 2 ]; then
  # run backup logged
  if [ $KEEP_LOG -eq 2 ] && [ -e $LOG_FILE ]; then
    # if log mode is set to "2", delete old log file before starting (if it exists)
    rm -f $LOG_FILE
  fi
  # set log command to split stdout into a log file and stdout
  LOG_CMD="tee -a $LOG_FILE"
else
  # set log command to only print to stdout
  LOG_CMD="cat"
fi

# check that .identity files exist in files and backup directories
check_identity
# set directory locations of backup
set_dirs
# run rsync backup
run_backup

# remove $LOCK_FILE to indicate the script is done running
rm -f $LOCK_FILE

# exit with the error code left by rsync
exit $ERROR
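For the more common case of recovering just one file, a full restore is overkill. A minimal sketch, using temporary directories and a made-up file name (docs/thesis.tex) in place of real data, and picking the snapshot the same way the scripts above do:

```shell
#!/bin/bash
# Sketch: pull one file back from the newest completed snapshot.
# Temp dirs and the file name are stand-ins for illustration.
FILES_PATH=$(mktemp -d)
BACKUP_FILES_PATH=$(mktemp -d)
mkdir -p "$BACKUP_FILES_PATH/2011-03-28_02.30.00/docs" \
         "$BACKUP_FILES_PATH/2011-03-29_02.30.00/docs" \
         "$BACKUP_FILES_PATH/tmp.2011-03-30_02.30.00/docs"
echo "older"  > "$BACKUP_FILES_PATH/2011-03-28_02.30.00/docs/thesis.tex"
echo "newest" > "$BACKUP_FILES_PATH/2011-03-29_02.30.00/docs/thesis.tex"

# same selection rule as the scripts: newest name of the form
# YYYY-MM-DD_hh.mm.ss, which skips unfinished "tmp." directories
LAST_BACKUP_DIR=$(ls "$BACKUP_FILES_PATH" | grep '^....-..-.._..\...\...$' | sort | tail -n 1)

WANTED="docs/thesis.tex"
mkdir -p "$FILES_PATH/$(dirname "$WANTED")"
# -p keeps the modification time and permissions of the backed-up copy
cp -p "$BACKUP_FILES_PATH/$LAST_BACKUP_DIR/$WANTED" "$FILES_PATH/$WANTED"
echo "Restored $WANTED from $LAST_BACKUP_DIR"
```

Because cp writes a new file rather than a hard link, later edits to the restored copy do not touch the snapshot.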