 /\
/  \    (C) Copyright 2006 Parliament Hill Computers Ltd.
\  /    All rights reserved.
 \/
  .     Author: Alain Williams, April 2006
  .
  .     SCCS: @(#)Documentation 1.5 08/25/07 15:51:47
  .

Overview
********

RsyncBackup is a script that backs machines up over a network using rsync.
It maintains one directory tree per backed up machine per day - this makes it
easy to find/restore a consistent set of files from a particular day.

By use of hard links, files that have not changed are shared between different
daily backup trees. This saves disk space on the archive machine and also
reduces the time taken to back up. This works since, typically, most files do
not change from day to day.

The idea is that one central archive server initiates backups on several other
machines.

Configuration on the machine being backed up
********************************************

Rsync needs to allow the archive machine read access to the directories that
need to be backed up.

Sample: /etc/rsyncd.conf

    use chroot = no

    # Something so that the backup server can read the entire machine:
    [backup]
        comment = Backup export
        path = /
        uid = 0
        hosts allow = archive.example.co.uk

Don't forget that you ought to keep a copy of the archive machine's own
configuration files. So arrange to run RsyncBackup on some other machine and
have it back up your archive machine. These config files will probably not use
a lot of disk space.
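
The hard link sharing described in the Overview can be seen with a small
experiment. The sketch below is only an illustration: it works in a temporary
directory, the file name 'motd' is made up, and it uses plain 'ln' rather than
RsyncBackup itself. It shows that two daily trees can hold the same file while
the data is stored only once.

```shell
#!/bin/sh
# Illustration only: all paths live in a temporary directory and the
# file name 'motd' is hypothetical. This is not code from RsyncBackup.
demo=$(mktemp -d)
mkdir "$demo/20070811" "$demo/20070812"

# Day 1: the file is copied into the first daily tree.
echo "unchanged content" > "$demo/20070811/motd"

# Day 2: the file has not changed, so the new tree gets a hard link
# to the existing copy instead of a second copy.
ln "$demo/20070811/motd" "$demo/20070812/motd"

# Both names refer to the same inode, so the data is on disk once.
inode1=$(ls -i "$demo/20070811/motd" | awk '{print $1}')
inode2=$(ls -i "$demo/20070812/motd" | awk '{print $1}')
echo "inode in 20070811: $inode1"
echo "inode in 20070812: $inode2"

# Removing the older tree leaves the newer one intact, since the data
# is only freed when the last link to it goes.
rm -r "$demo/20070811"
survivor=$(cat "$demo/20070812/motd")
echo "still readable: $survivor"

rm -r "$demo"
```

rsync has a --link-dest option that builds this kind of tree during a
transfer; whether RsyncBackup relies on it is not stated in this document.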
Running the backup
******************

This will typically be done via root's cron on the archive machine, here is an
example:

    30 0 * * * /usr/local/bin/RsyncBackup -c -1 -r server -d 'etc home usr/local root'

The remote machine (the one being backed up) is 'server'.
The directories backed up are: 'etc home usr/local root'.

Archive Created
***************

The archive directory (on the archive machine) for a machine 'server' will be:

    /arch/server

That directory will contain directories named after the date of the backup, eg:

    20070812

There will be one such directory for every date on which a backup was done.
The symbolic link LATEST will point to the directory of the last successful
backup.

Below this level will be directories just like on the machine that was backed
up.

Samba share definition
**********************

This is to allow people to recover files from backup using Microsoft shares.
Beware: the definitions below allow anyone to recover any file; this is not
what you would want if the files being archived are confidential.

    [archive]
        comment = Archive Directory for archive
        path = /arch/archive
        browseable = yes
        writable = no
        public = yes

    [server]
        comment = Archive Directory for server
        path = /arch/server
        browseable = yes
        writable = no
        public = yes

    [logs]
        comment = Archive Logs
        path = /var/log/backups
        browseable = yes
        writable = no
        public = yes

Option Explanation
******************

Archive cleaning options

You will eventually fill the space available for the archive, so you need to
clean out archives that are too old to be wanted. It is, however, nice to keep
the occasional snapshot for a long time.

    -c      Clean out (remove) old backups that are more than 14 days old.
            This might not save as much space as you might think. If few
            files change from day to day then most of what you save are
            directories.

    -C days Set the number of days older than which backups will be Cleaned
            to days (default 14).
            This works by looking at the archive directory name (eg 20070825)
            and deciding how old it is.

    -1      Don't clean a backup made on the 1st of the month, eg leave
            20070801. This can be useful to keep the occasional snapshot.

    -y yrs  Clean out 1st of the month backups that are more than yrs years
            old. A year is deemed 366 days long - to be on the safe side.

What to back up options

    -r name Back up the remote machine 'name'. Name is a DNS name, you could
            probably also use an IP address. This acts as the default name
            for local storage of the archive (see -R).

    -s src  The name of the rsync source (module name) on the machine that
            is being backed up. This module name will be specified in
            /etc/rsyncd.conf on the machine being backed up. The default
            name is 'backup'. This module should permit at least read access
            to the files that are to be backed up. The phrase 'module name'
            is an rsync concept.

    -d dirs List of directories to back up is 'dirs'. These should be
            specified without the leading '/', all the directories will be
            with respect to '/'. These directories will be copied into
            directories under the date directory (eg 20070825/home).

    -D dirs List of snapshot directories to back up is 'dirs'. These should
            be specified without the leading '/', all the directories will
            be with respect to '/'. These directories will be copied into
            directories under the snapshot directory (eg SNAPSHOT/home).
            The purpose of this is to have one copy, ie not to store a
            separate copy for each backup date. You probably want to use the
            '-d' option in preference to '-D'.

    -F      Do not cross mount points onto other File systems. This is
            especially useful with directories that contain chrooted
            environments, as these will contain a mount of /proc, eg
            var/named.

Where to store the back up

Ie where things go on the machine that is performing the back up.

    -l dir  Where backups are stored on the local machine. The default is
            '/arch'. The backup for the machine 'server' will be kept under
            '/arch/server'.
            This directory must exist before the program is run.

    -R name Name to use for local storage of the archive (default is the
            value of -r). This is used for the directory names under '/arch'
            and under '/var/log/backups'. You generally do not need to use
            this option, the remote name (-r) is probably good enough;
            however if the remote name is very long or an IP address you may
            want to specify another name for local use.

    -S file Touch a file with the start time of a successful backup. The
            point is that files may change or be created while a backup is
            taking place. Such a file with the name RsyncBackupSuccessTime
            will be used by the WriteRsyncBackupToTape program.
            If the name 'file' is not absolute (does not start with a '/')
            it will be created in the top level archive directory (eg
            /arch/server). If the name starts with a '_' it will be created
            in the directory for a particular day (eg /arch/server/20070812).

Miscellaneous options

    -b kbps Set average bandwidth to kbps (Kilobytes per second). Default no
            limit. You probably only need this if backing up over the
            Internet.

    -o      Where does logging output go to ?
            By default output will be sent to
            /var/log/backups/RemoteMachineName/YYYYMMDD-HHMM.
            If the program is being run interactively (stderr is a terminal)
            output will only go to the terminal. If the '-o' option is given
            and the program is interactive then output will also go to the
            log file. If the log directory does not exist it will be created.

    -x      eXplain - provide a brief help message.

    -Z      Don't compress rsync data transfer. Use this if you are backing
            up over a fast local network and/or the machine CPUs are not
            that fast.

Installation
************

You will probably designate one machine as the archive server, ie the machine
that stores the backups of all the other machines. You should probably store
the backups on a nice big disk partition all of its own - using a mirrored
disk would be a good idea.

1)  Install rsync on all the machines.
    Install the RsyncBackup script into /usr/local/bin and make it executable.
    NOTE that the script uses the Korn Shell (ksh), you may need to install
    this on your machine.

2)  Create a directory (default /arch) on the archive server.

3)  Create a /etc/rsyncd.conf on each machine to be backed up.

4)  Test that rsync works. The following command will do for the 'server' to
    be backed up:

        rsync rsync://server/backup/

    This should give you a directory listing of the module name 'backup' on
    the machine 'server'. If it doesn't work, check:
    * /var/log/messages on the server (and similar)
    * the permissions in /etc/rsyncd.conf
    * What does the server think the archive machine is called when a
      network connection is made - ie is your reverse DNS correct ?
    * Is rsync running on the server ?

5)  Create a directory to store each machine's backup, eg /arch/server

6)  Run RsyncBackup on one smallish directory, eg /etc:

        /usr/local/bin/RsyncBackup -r server -d etc

    Check that this works, check the files in /arch/server/LATEST/etc, check
    the log file in /var/log/backups/server/

7)  Install an appropriate entry into root's crontab. If you are backing
    many machines up, stagger the start times.
    If you don't run this as root then you will not be able to preserve file
    ownership, this will make it more difficult to restore a remote machine
    properly.

8)  Note that the first backup will take a long time since all files need to
    be copied over the network. Once this has been done, later backups will
    be much quicker. You should avoid installing the crontab entry until one
    successful copy has been made - ie do the first one by hand or as an
    'at' job.

9)  Don't forget to back up the config files on the server itself, probably
    you will want to do this both on the archive machine itself (ie get it
    to back itself up) and back the archive server up onto some other
    machine.

10) Send the author an email saying that you are using the program along
    with any suggestions, bug fixes or beer tokens.
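
Several of the steps above (1, 2 and 5) are simple prerequisites that can be
checked before the first run. The following is a minimal sketch of such a
check for the archive server. The function names check_cmd, check_exec and
check_dir are made up for this illustration and are not part of RsyncBackup;
the paths are the defaults and the example machine name 'server' used in this
document.

```shell
#!/bin/sh
# Hypothetical pre-flight check for the archive server.
# Not part of RsyncBackup; paths are the defaults from this document.

check_cmd() {
    # Report whether a command can be found on the PATH.
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: $1 found"
    else
        echo "MISSING: $1"
    fi
}

check_exec() {
    # Report whether a file exists and is executable.
    if [ -x "$1" ]; then
        echo "ok: $1 is executable"
    else
        echo "MISSING: $1"
    fi
}

check_dir() {
    # Report whether a directory exists.
    if [ -d "$1" ]; then
        echo "ok: $1 exists"
    else
        echo "MISSING: $1"
    fi
}

check_cmd rsync                          # step 1
check_cmd ksh                            # step 1, the script needs ksh
check_exec /usr/local/bin/RsyncBackup    # step 1
check_dir /arch                          # step 2, default for -l
check_dir /arch/server                   # step 5, one per backed up machine
```

Any line starting with MISSING points at a step that still needs doing. The
sketch only looks at the archive server, it does not test the rsync
connection itself; use the command in step 4 for that.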

You will also want to arrange to take backups off site occasionally. How often
is up to you - the less frequent, the more you lose if your entire building
is destroyed.

It is a good idea to keep your archive machine as far away as possible from
the machines that it is backing up - preferably in another building. Be
careful about physical security, if someone steals your archive machine then
they have a copy of all your company secrets.