Backups, rsync and rsnapshot

I've just taken delivery of a spanking new 500GB LaCie FireWire drive for work. I've partitioned it in two, with the bulk of the space allocated for video data, but I've also made a backup partition for my laptop. I've wanted to try rsync for ages, but I never had a spare hard drive large enough to try it on.

The great thing about rsync is that you can do frequent incremental backups (keeping snapshots of the state of your drive at various time intervals) without using much more space than a single backup, and you can still restore a whole directory structure easily, without having to piece together backup sets.

It manages this feat by using hard links and rotating backup sets. In the latest set, only the files which have changed are actually copied; for the unchanged files, hard links to the copies in the previous set are created instead, so the latest backup appears to be (and to all intents and purposes is) a full copy of the source.
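To make the hard-link trick concrete, here is a minimal Python sketch of the idea. It is not how rsync itself is implemented: the function name `snapshot` and the directory arguments are made up for illustration, and it decides whether a file has changed by comparing contents, which is an assumption of this sketch rather than rsync's own check.

```python
import filecmp
import os
import shutil


def snapshot(source, previous, target):
    """Build `target` from `source`, hard-linking files unchanged since `previous`.

    `previous` is the prior snapshot directory, or None for the first run.
    All three names are hypothetical; this only illustrates the technique.
    """
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(target, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(target, rel, name))
            old = (
                os.path.normpath(os.path.join(previous, rel, name))
                if previous
                else None
            )
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                # Unchanged: add a second name for the data already on disk.
                os.link(old, new)
            else:
                # New or changed: store a fresh copy in this snapshot.
                shutil.copy2(src, new)
```

Every snapshot directory then looks like a complete copy, but an unchanged file only occupies disk space once, however many snapshots refer to it.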

rsync is included in Tiger, but it can be a little tricky to set up a script to back everything up, particularly if you want to back up several separate directories to a single destination. However, rsnapshot provides a nice, easily configured wrapper around rsync. You can configure the number of hourly, daily, weekly and monthly backups that get rotated, and you can set multiple source paths to back up.
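The rotation itself is easy to picture with a small Python sketch. This is not rsnapshot's own code, only an illustration of the scheme: the function `rotate` and its parameters are hypothetical, and the `hourly.0`, `hourly.1`, ... directory naming is borrowed from the way rsnapshot lays out its snapshots.

```python
import os
import shutil


def rotate(root, level, keep):
    """Shift level.0 .. level.(keep-2) up by one and drop the oldest snapshot.

    After this runs, level.0 is free for the next backup to be written into.
    """
    oldest = os.path.join(root, f"{level}.{keep - 1}")
    if os.path.isdir(oldest):
        shutil.rmtree(oldest)  # the oldest snapshot falls off the end
    # Walk from the oldest remaining snapshot to the newest so nothing is overwritten.
    for i in range(keep - 2, -1, -1):
        src = os.path.join(root, f"{level}.{i}")
        if os.path.isdir(src):
            os.rename(src, os.path.join(root, f"{level}.{i + 1}"))
```

For example, `rotate("/Volumes/Backup/snapshots", "hourly", 6)` would keep at most six hourly snapshots (the path here is made up). Because unchanged files are hard links, deleting the oldest snapshot only frees the data that no newer snapshot still points to.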

It works very nicely, but I've been trying to think of a good way to automate starting it. Now that cron is deprecated in Tiger, I'm looking into using launchd to run the backups. The tricky part is that I only want it to run if my FireWire drive is mounted, which is more or less only during office hours. rsnapshot does come with a little script that checks whether the snapshot path is mounted before it runs the process itself, but it would be nice to do it within launchd. There's the WatchPaths property, which is supposed to start the process only when a path changes (which would happen when my drive is mounted). The other snag is that I don't think launchd yet supports the '*/4' style of time intervals (i.e. every 4 hours).
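One way this might fit together, as an untested sketch: a small Python guard that only calls rsnapshot when the backup volume is mounted, plus a helper that builds a launchd job description using the `StartInterval` key (a number of seconds) and `WatchPaths`. The volume name, the rsnapshot and Python paths, the job label and the script path are all placeholders, and whether watching `/Volumes` really fires on mount is an assumption to check, not something confirmed here.

```python
import os
import plistlib
import subprocess

BACKUP_VOLUME = "/Volumes/Backup"        # hypothetical mount point
RSNAPSHOT = "/usr/local/bin/rsnapshot"   # hypothetical install path


def drive_is_mounted(path=BACKUP_VOLUME):
    """Return True only if `path` exists and is a mount point."""
    return os.path.ismount(path)


def run_backup_if_mounted(path=BACKUP_VOLUME, level="hourly"):
    """Run `rsnapshot <level>` if the drive is there; otherwise do nothing."""
    if not drive_is_mounted(path):
        return False
    subprocess.run([RSNAPSHOT, level], check=True)
    return True


def build_launchd_job(script_path, hours=4):
    """Describe a launchd job that runs this guard script periodically."""
    return {
        "Label": "org.example.rsnapshot-hourly",            # placeholder label
        "ProgramArguments": ["/usr/bin/python3", script_path],
        "StartInterval": hours * 60 * 60,                     # seconds between runs
        "WatchPaths": ["/Volumes"],                           # assumed to change on mount
    }


if __name__ == "__main__":
    # Print the job as XML, ready to be saved as a .plist file.
    job = build_launchd_job("/path/to/this_script.py")        # placeholder path
    print(plistlib.dumps(job).decode())
```

The design choice is to keep the mount check in the script rather than rely on launchd alone, so a run triggered by the timer while the drive is unplugged simply exits without touching anything.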

I'll try to construct something over the weekend, and test it on Monday. If anyone has any tips for using launchd as a cron replacement, let me know!