HTCondor Repository Backups

As of 2007-12-18, we do regular backups of the following directories (CVS, Git and externals):

We have had occasional CVS corruption over the years, including an incident in early December 2007. Some of the corruption is believed to be caused by AFS. For the December 2007 incident, we were lucky to have a recent backup of the corrupted file which hadn't changed, so swapping in the backup allowed us to quickly recover. However, we might not notice corruption for years if the corrupted version isn't in use. Efforts to experimentally import our respository into Git and Subversion have revealed corruption going far into the past. So we're going to have regular backups so we can more easily recover if something important is damaged.

Backups are done from cron as the user cndrutil on They are stored in /scratch.1/cvs-backup.

We're using rsnapshot, a simple differential backup system built on top of rsync. The CVS directories are copied over, using hardlinks for files that haven't changed. rsnapshot is installed in /p/condor/home/bin/cvs-backup/rsnapshot (which is a symlink to the exact version in the same directory). It currently relies on rsync in /s/std/bin/.

To access the cndrutil account, you'll need ksu access on Ken Hahn can get you ksu access on tonic. The lab updates ksu access nightly.

As of 2007-12-18, the repository is about 1.6GB. A nightly different is estimated to be about 0.6GB. That may seem large, but rsync needs to copy any file with the slightest change, and the nightly build tags touch every file currently in use on any branch we're building.

The exact backup policy is in /p/condor/home/bin/cvs-backup/rsnapshot.conf. As of 2007-12-18, we have 7 daily backups, 4 weekly backups, and 12 monthly backups. This is estimated to use about 14GB

Why rsync/rsnapshot?

Other options were considered. Given the 2007-12-18 assumptions, some tests were done, revealing:

Technique  Per Image   Total     Time    Notes
             Size      Size
cp -r       1.6 GB     34 GB     6 min
tar czf     1.0 GB     22 GB     8 min	 z = gzip
tar cjf     0.9 GB     21 GB    22 min	 j = bzip2
rsync       0.6 GB     14 GB     2 min	 +1.6GB for the first image

rsync was chosen because:

2008-01-09 update: After a few weeks of live runs, it looks like a daily backup is about 750 MB, the weekly is about 1.3 GB, plus 1.6 GB for the original backup. Assuming the monthly is similar to the weekly, this means a total file usage of about 27 GB. While not as good as hoped, we're sticking with it for the other reasons given above. For anyone curious about disk usage, use the following run as cndrutil on tonic:

/s/std/bin/runauth /p/condor/home/bin/cvs-backup/rsnapshot/bin/rsnapshot du

2008-03-26 update: With the switch to Git as our system, we're now backing up the Git repositories in addition. This has just changed, so the disk usage of a snapshot isn't yet known. Prior to this change, individual snapshots were running about 760MB. The entire set of backups at the moment, which only goes back two months, is about 13.5G.

2009-02-27 update: With the new handling of externals (eliminating the CONDOR_EXT, CONDOR_TEST_LRG, and CONDOR_TEST_CNFDTL git repositories from active usage), we're now backing up the directories of external tarballs as well. The affect on disk usage for backups isn't known yet.