Page History
- 2020-Jan-22 11:53 tim
- 2015-Sep-28 17:55 johnkn
- 2015-Sep-28 17:54 johnkn
- 2012-Nov-13 15:31 adesmet
- 2012-Aug-15 16:14 kronenfe
- 2012-Aug-15 15:44 kronenfe
- 2012-Aug-15 15:38 kronenfe
- 2012-Aug-15 12:40 kronenfe
- 2012-Aug-15 12:39 kronenfe
- 2011-Apr-07 14:42 kronenfe
- 2011-Apr-07 14:42 kronenfe
- 2011-Apr-07 14:34 gthain
- 2011-Apr-07 14:24 kronenfe
Overview
In addition to our nightly build and test, we run a "continuous" build and test on a subset of platforms. The goals of these builds:
- Provide quick feedback if a commit breaks the builds or tests (without waiting until the next day to view the nightly results)
- Provide historical coverage to help trace back and find which commit broke tests/builds. It is often hard to track backwards from a broken test to the commit that broke it. The more frequent the tests are, the smaller the commit window in which to find the guilty commit.
- Help spot race conditions in tests that fail sporadically, especially when subsequent runs of the test produce different results against the same code
- Eventually the desire is to run a build and test for each commit to Git. We don't currently have the CPU cycles available, largely because the tests take so long to run.
Current implementation
The continuous runs are submitted from nmi-s006.cs.wisc.edu in the /home/cndrauto/continuous/ directory. A single crondor job exists for each platform. This job simply calls condor_nmi_submit to do the dirty work. Occasionally, someone will run condor_rm cndrauto, forgetting about these jobs. When that happens, the crondor jobs are removed too, which stops the continuous builds.
The crondor schedule is set up in the submit files to run about once an hour, and we try to bias the hours toward those when developers are most likely to be committing. Note that condor_nmi_submit sets the JobPrio such that the most recently submitted build (and test) has the highest priority. Because tests are slow, they usually fall behind in the course of a day, but the most recently submitted ones are started first. The goal is to set the schedule so that all the tests "catch up" overnight.
Even if a test run takes overnight to finish, it can still be very valuable, because it is often hard to track backwards from a broken test to the commit that broke it. The more frequent the tests are, the smaller the commit window in which to find the guilty commit, even if it takes a long time for the test to finally finish.
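As a rough illustration of an hourly schedule biased toward working hours, the sketch below writes a crondor-style submit description. It is a hypothetical example, not the template checked into Git: the function name write_example_submit, the local universe, and the default 8-18 hour range are all assumptions; only cron_minute, cron_hour, and on_exit_remove are standard HTCondor submit commands.

```shell
#!/bin/sh
# Hypothetical sketch of a crondor-style submit description.
# The real template lives in CONDOR_SRC/nmi_tools/continuous and may differ.
# $1 = output file, $2 = cron_hour spec (assumed working hours, default 8-18)
write_example_submit() {
    out=$1
    hours=${2:-8-18}
    cat > "$out" <<EOF
universe       = local
executable     = run.sh
# Start at minute 0 of each listed hour, i.e. about once an hour.
cron_minute    = 0
cron_hour      = $hours
# Keep the job in the queue after each run so it fires again.
on_exit_remove = false
queue
EOF
}
```

Calling write_example_submit /tmp/submit 9-17 would produce a file that starts run.sh at the top of each hour from 9 through 17.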
Add a new continuous run
- Log in to nmi-s006 and sudo to the cndrauto user
- Copy the template directory from Git. It is in the CONDOR_SRC/nmi_tools/continuous directory. Put the directory into /home/cndrauto/continuous/PLATFORM where PLATFORM is the name of the platform you are testing against. It should match the NMI platform name, e.g. x86_64_rhap_5
- Replace "PLATFORM" in the files run.sh and submit
- Submit the Condor-Cron job: condor_submit submit
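The copy-and-replace steps above could be scripted roughly as follows. This is a minimal sketch: the function name prepare_continuous_run is made up for illustration, and it assumes that every occurrence of the literal string PLATFORM in run.sh and submit should be replaced.

```shell
#!/bin/sh
# Sketch: copy the template directory and substitute the PLATFORM placeholder.
# $1 = template directory, $2 = destination root, $3 = NMI platform name
# Prints the new directory on success.
prepare_continuous_run() {
    template=$1
    dest_root=$2
    platform=$3
    dest="$dest_root/$platform"
    if [ -e "$dest" ]; then
        echo "refusing to overwrite $dest" >&2
        return 1
    fi
    cp -r "$template" "$dest" || return 1
    for f in run.sh submit; do
        # Write back through cat so the file keeps its mode (run.sh stays executable).
        sed "s/PLATFORM/$platform/g" "$dest/$f" > "$dest/$f.tmp" &&
            cat "$dest/$f.tmp" > "$dest/$f" &&
            rm "$dest/$f.tmp" || return 1
    done
    echo "$dest"
}

# Example (as cndrauto on nmi-s006, with CONDOR_SRC pointing at a Git checkout):
#   dest=$(prepare_continuous_run "$CONDOR_SRC/nmi_tools/continuous" \
#          /home/cndrauto/continuous x86_64_rhap_5)
#   cd "$dest" && condor_submit submit
```

The function refuses to touch an existing platform directory, so an already configured run is not overwritten by accident.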
Modify a continuous run
- Log in to nmi-s006 and switch to the cndrauto user
- Stop the appropriate job using condor_rm. You can run condor_q | grep run.sh to figure out which job to remove.
- Modify run.sh or submit as appropriate and then re-submit to Condor: condor_submit submit
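To make the lookup step a little less manual, a small filter can pull out the job IDs from the condor_q listing. This is a sketch only: continuous_job_ids is a hypothetical helper, and it assumes a one-job-per-line listing with the job ID in the first column (newer HTCondor releases may need condor_q -nobatch to get that layout).

```shell
#!/bin/sh
# Print the job IDs (first column) of condor_q lines that mention run.sh.
# Reads the listing on stdin, so it can be tried without a running pool.
continuous_job_ids() {
    awk '/run\.sh/ { print $1 }'
}

# Example (as cndrauto on nmi-s006); inspect the list before removing anything:
#   condor_q | continuous_job_ids
#   condor_rm <job id of the platform being changed>
#   cd /home/cndrauto/continuous/PLATFORM && condor_submit submit
```

Because every platform's job runs a script named run.sh, the filter lists all continuous jobs; picking the right one for the platform is still a manual check.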