Page History
- 2020-Jan-22 11:53 tim
- 2015-Sep-28 17:55 johnkn
- 2015-Sep-28 17:54 johnkn
- 2012-Nov-13 15:31 adesmet
- 2012-Aug-15 16:14 kronenfe
- 2012-Aug-15 15:44 kronenfe
- 2012-Aug-15 15:38 kronenfe
- 2012-Aug-15 12:40 kronenfe
- 2012-Aug-15 12:39 kronenfe
- 2011-Apr-07 14:42 kronenfe
- 2011-Apr-07 14:42 kronenfe
- 2011-Apr-07 14:34 gthain
- 2011-Apr-07 14:24 kronenfe
Overview
In addition to our nightly build and test, we run a "continuous" build and test on a subset of platforms. The goals of these builds:
- Provide quick feedback if a commit breaks the builds or tests (without waiting until the next day to view the nightly results)
- Help spot race conditions in tests that fail sporadically, especially when subsequent runs of the test produce different results against the same code
- Eventually the desire is to run a build and test for each commit to Git. We don't currently have the CPU cycles available, largely because the tests take so long to run.
Current implementation
The continuous runs are submitted from nmi-s006.cs.wisc.edu in the /home/cndrauto/continuous/ directory. A single crondor job exists for each platform. This job simply calls condor_nmi_submit to do the dirty work. Occasionally, someone will run condor_rm cndrauto, forgetting about these jobs. When that happens, the crondor jobs are removed too, which stops the continuous builds.
The crondor schedule is set up in the submit files to run about once an hour, and we try to bias the hours toward those when developers are most likely to be committing. Note that condor_nmi_submit sets the JobPrio such that the most recently submitted build (and test) has the highest priority. Because tests are slow, they usually fall behind over the course of a day, but the most recently submitted ones are started first. The goal is to set the schedule so that all the tests "catch up" overnight.
Even if a test run takes overnight to finish, it can still be very valuable, because it is often hard to track backwards from a broken test to the commit that broke it. The more frequent the test runs are, the smaller the window of commits in which to look for the guilty one, even if each run takes a long time to finish.
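As a rough illustration of how an hourly schedule biased toward working hours might be expressed, the sketch below writes a hypothetical crondor submit file. The cron_minute, cron_hour and on_exit_remove submit commands are standard HTCondor; the 8-18 hour range, the local universe, and the function name write_cron_submit are assumptions made for illustration and are not taken from the real template in nmi_tools/continuous.

```shell
#!/bin/sh
# Sketch only: write a hypothetical crondor submit file for one platform.
# The hour range and file contents are assumptions, not the real template.
write_cron_submit() {
    platform="$1"   # NMI platform name, e.g. x86_64_rhap_5
    outfile="$2"    # path of the submit description file to write
    cat > "$outfile" <<EOF
# Hypothetical continuous-build submit file for $platform
universe       = local
executable     = run.sh
# Fire at the top of every hour from 08:00 through 18:00 (assumed busy hours)
cron_minute    = 0
cron_hour      = 8-18
# Keep the job in the queue after each run so it fires again at the next slot
on_exit_remove = false
queue
EOF
}
```

Calling write_cron_submit x86_64_rhap_5 submit would produce a file that could then be compared against the real template before any use.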
To add a new continuous run:
- Log in to nmi-s006 and sudo to the cndrauto user
- Copy the template directory from Git. It is in the CONDOR_SRC/nmi_tools/continuous directory. Put the directory into /home/cndrauto/continuous/PLATFORM, where PLATFORM is the name of the platform you are testing against. It should match the NMI platform name, e.g. x86_64_rhap_5
- Replace "PLATFORM" in the files run.sh and submit
- Submit the Condor-Cron job: condor_submit submit
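The steps for adding a run can be sketched as a pair of shell functions. This is a minimal sketch, not the procedure the team actually scripts: the function names prepare_continuous_dir and submit_continuous_run and their argument order are hypothetical, and it assumes the destination directory does not exist yet and that the platform name contains no "/" characters.

```shell
#!/bin/sh
# Sketch only: prepare a continuous-run directory from the Git template.
# Function names and argument order are illustrative assumptions.

# Copy the template to DEST and replace the literal string PLATFORM
# in run.sh and submit with the given NMI platform name.
prepare_continuous_dir() {
    template="$1"   # e.g. CONDOR_SRC/nmi_tools/continuous
    dest="$2"       # e.g. /home/cndrauto/continuous/x86_64_rhap_5
    platform="$3"   # e.g. x86_64_rhap_5

    # Refuse to overwrite an existing run directory.
    [ -e "$dest" ] && return 1
    cp -R "$template" "$dest" || return 1
    for f in run.sh submit; do
        # Write through a temp file, then copy the bytes back so the
        # file keeps its permissions (run.sh stays executable).
        sed "s/PLATFORM/$platform/g" "$dest/$f" > "$dest/$f.tmp" || return 1
        cat "$dest/$f.tmp" > "$dest/$f" && rm -f "$dest/$f.tmp" || return 1
    done
}

# Submit the Condor-Cron job from inside the prepared directory.
submit_continuous_run() {
    ( cd "$1" && condor_submit submit )
}
```

Run as the cndrauto user on nmi-s006, a call such as prepare_continuous_dir CONDOR_SRC/nmi_tools/continuous /home/cndrauto/continuous/x86_64_rhap_5 x86_64_rhap_5 followed by submit_continuous_run /home/cndrauto/continuous/x86_64_rhap_5 would mirror the manual steps.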
Modify a continuous run
- Log in to nmi-s006 and switch to the cndrauto user
- Stop the appropriate job using condor_rm. You can run condor_q | grep run.sh to figure out which job to remove.
- Modify run.sh or submit as appropriate and then re-submit to Condor: condor_submit submit
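To make the "find the job, remove it, resubmit" sequence concrete, here is a small sketch of a filter that pulls job IDs out of condor_q output. The function name continuous_job_ids is hypothetical, and the filter assumes the classic layout with one job per line and the job ID in the first column; newer HTCondor releases group jobs into batches by default, in which case condor_q -nobatch should give that layout.

```shell
#!/bin/sh
# Sketch only: print the job IDs of queue entries that mention run.sh.
# Reads condor_q-style text on stdin; assumes the ID is the first column.
continuous_job_ids() {
    awk '/run\.sh/ { print $1 }'
}

# Typical use as the cndrauto user (not executed here):
#   condor_q | continuous_job_ids    # review the candidate IDs first
#   condor_rm ID                     # remove only the job you mean to change
#   condor_submit submit             # after editing run.sh or submit
```

Printing the IDs rather than piping them straight into condor_rm keeps a human check in the loop, since every platform's crondor job matches run.sh.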
