The New System for Building Condor

This document describes the new system for building Condor, which makes Condor significantly easier to build at other sites. This is a major step towards making our source publicly available, and also enables us to easily do nightly builds using machines not on AFS. This new build system was originally merged back into the main Condor source on 10/23/03 into the V6_6-branch in anticipation of the 6.6.0 stable release.

The Big Picture

The Condor build system now does a much better job of including all the external packages we need to use. Instead of relying on pre-installed versions of Kerberos, Zlib, Globus, etc., the Condor build is now aware of these packages in a way that is not specific to our local site and build environment. Moreover, there is a clear way to deal with different versions of each external package for different branches of the Condor source. (The list of these external packages has been referred to as the "List of Shame", hence the acronym "LoS" has been used among the staff and people working on this.)

There is a new top-level directory in the Condor repository, "externals", which lives next to "src", "config", and so on. This contains separate bundles for all the different external packages we link against. Now, to build Condor, you must first build a version of the externals directory tree. However, you can still use a pre-built version of the externals tree, and at our site, we've got one for the major development platforms (i386_rh72 and sun4x_58) in /p/condor/workspaces/externals, so most of you won't have to wait through the long task of building all of Globus, Glibc, etc., every time you want to build a new version of Condor. :)

To aid in all this, and to make Condor even less dependent on site-specific installation tricks, the Condor build now uses a configure script automatically generated via autoconf. The configure script tests for the existence of various programs required to build Condor in your PATH, and ensures that it finds versions that support the functionality Condor requires. For example, the configure script will tell you if the version of perl in your PATH is too old for certain perl scripts used to build and bundle Condor. Many software packages use autoconf, so many of you are familiar with how this works. In principle, once you have a source workspace all checked out, you can just do "./configure; make" and Condor should build successfully.

Of course, there's more to it than that, which brings us to...

The Gory Details

  1. Preparing Your Environment to Build Condor
  2. Checking Out the Source
  3. Setting Up a Local Build Workspace (Optional)
  4. Running autoconf
  5. Running configure
  6. Running make

1) Preparing Your Environment to Build Condor

Because of the autoconf-generated configure script mentioned above, the Condor build system is now a lot more careful about testing for the programs that Condor needs before you do the build. This avoids nasty surprises and often-confusing error messages deep inside the build when a tool we depend on isn't available or isn't behaving how we expect. There are a number of tools that the configure script tests for. In the outside world, people will either read through a list of these programs and install them, or run the configure script a few times, following its directions about what it needs until it passes successfully.

However, to make life easier for us at UWCS, I've added more symlinks to /p/condor/workspaces/build/bin for the main platforms we care about (i386_rh72 and sun4x_58). It'll be easy to add more, so if someone's trying to build locally on a different platform than either of those, let me know.

As always, you should make sure /p/condor/workspaces/build/bin is at the front of your PATH (at the very least, before /s/std/bin and /usr/bin, since neither of those always has the right versions of everything on CSL-supported machines) before you try to build Condor.
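
If your PATH isn't already set up that way, prepending the directory in your shell startup file is enough. For example (this is just a sketch; the exact syntax depends on which shell you use):

    # sh/bash-style shells
    export PATH=/p/condor/workspaces/build/bin:$PATH

    # csh/tcsh-style shells
    set path = (/p/condor/workspaces/build/bin $path)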

2) Checking Out the Source

If you want to use the pre-built externals directory, checking out the source is totally unaffected by this change. The only thing you need to worry about for existing V6_6-branch workspaces is that a bunch of files were modified in config, some added, and some removed. Also, there are some new files in the top-level src directory. So, the best thing to do is to go to the root of your V6_6-branch source workspace (where "src" and "config" live), and perform:

    cvs co -r V6_6-branch V6_6_ALL

That will update your workspace and will ensure that all the new files are added (which a regular cvs update won't necessarily do).

However, if you want to build your own version of externals for some reason, you need to check out your source in two steps. The reason for this is that the "externals" directory only lives on the trunk. It is not branched and merged. You cannot check out the Condor source with a single command and get different files from different branches. Instead, you must do something like this:

    cvs co -r V6_6-branch V6_6_ALL
    cvs co V6_6_EXT

Notice that you specify "-r <branch-name>" for the V6_6_ALL module (which gives you src, config, imake, and NTconfig), but not for V6_6_EXT (which just gives you the externals directory from the trunk).

"But I thought you just said the 'externals' directory always lives on the trunk. Why do I need to use 'V6_6_EXT?" ... you ask?

The reason is that while the external packages always live on the trunk, different branches of Condor might depend on different versions of each package. We'll use these <BRANCH-PREFIX>_EXT modules to define what versions of each package you actually need, so you don't check out packages you're not going to use. Remember, cvs checkout usually takes two main arguments: the revision you care about (what branch you're checking out, or the trunk if not defined), and the modules you want (what files you want, either specified by directory names directly, or the magic cvs modules we define in the CVSROOT/modules file). However, other than the wasted cycles and disk space, there's no harm in just grabbing the entire externals tree:

    cvs co externals

The build system knows to only build the versions of each package it needs on any given branch or platform.

The build system requires the presence of generated nroff Condor man pages; however, these are not stored in the CVS repository. So, if you are building your own externals and you are building release tarballs via make public, you must generate the Condor man pages at this point:

    cd doc; make nroff; tar zcf man-current.tar.gz man
    ln -s `pwd`/man-current.tar.gz ../externals/bundles/man/current/man-current.tar.gz

3) Setting Up a Local Build Workspace (Optional)

In the past, we required that people build Condor in a separate directory from where their source lives. In general, this is a good idea, since you can leave your source workspace on AFS, where it:

    * Is backed up
    * Can be shared across multiple build machines

It has the added benefit that the build workspace can be on a local file system, and therefore, the build will go faster than trying to build into AFS. However, this step is no longer required. So, if you want, once you check out the source, you can go directly to step 4 (running autoconf) and build Condor directly in the source workspace. However, if you want to maintain local build workspaces, read on.

The procedure for this is unchanged. However, some of the side effects are new, so some added explanation is required.

The Procedure:

  1. mkdir /scratch/<username>/<dir>
  2. cd /scratch/<username>/<dir>
  3. new_workspace /full/path/to/your/AFS/source/workspace

The Explanation:

new_workspace is a perl script that lives in /p/condor/home/bin (there's also a symlink in /p/condor/workspaces/build/bin). It takes one argument, the full path to your source workspace (the directory that has "src" and "config" as sub-directories). The new, improved version then figures out whether your source workspace is an old build (e.g. you're building on the V6_7-branch before we merge the V6_6-branch there) or a new build.

Next, it makes sub-directories in your local build workspace for "src", "config", and "imake". In each one, it adds symlinks back to your shared source workspace for the files it needs (everything in "imake" and "config", and a few files in "src"). If it's an old build, the "config" directory itself is a symlink, since everything in there would be shared across multiple build workspaces. However, in the new build, it's a real directory (full of symlinks), since the configure script (described below) will create a platform-specific file in the config directory when it runs. In the "src" directory, if new_workspace finds the new files required for autoconf ("aclocal.m4" and "configure.ac"), it will add symlinks for those, in addition to the "Imakefile" and "condor_imake".

After this, new_workspace will go into the source directory, run autoconf, and, if that worked, execute the configure script for you. However, since you may want to run those yourself, I'm going to describe them in their own sections.

4) Running autoconf

Once you have a workspace to build condor in (either a complete source workspace checked out from cvs, or a local build workspace setup with new_workspace), you must generate a configure script, and run that script to create the Makefiles needed to build Condor.

First, make sure you've got at least autoconf version 2.57 (which is what we've got a symlink to from /p/condor/workspaces/build/bin). Other than that, you just need to be in the "src" directory and execute autoconf. You don't need to give it any arguments, and it usually won't produce any messages. However, it will take the "configure.ac" and "aclocal.m4" input files and generate a configure script as output. This step is not platform dependent. You can run autoconf once when you first check out your source workspace, and the resulting configure script will be portable across all build platforms. In fact, if you do this, the new_workspace script will just add a symlink to the configure script for you, and won't try to re-run autoconf.
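
For example, generating the configure script by hand looks something like this (assuming you start at the root of your workspace):

    cd src
    autoconf --version     # make sure this reports 2.57 or newer
    autoconf               # reads configure.ac and aclocal.m4, writes ./configure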

5) Running configure

Our configure script is fairly standardized. I have tried, wherever possible, to follow the generally accepted practices and behaviors of configure scripts. However, since we're not (yet) using automake, a lot of the standard configure features are not really supported. For example, the --prefix option doesn't do anything for the Condor builds. Hopefully we'll be able to standardize our make rules and configure script more and more as time goes on, but for now, it just does the essentials of what we need.

In general, all you have to do is "./configure". It will print out a bunch of messages telling you what it's looking for and whether it found it (and usually, if it did find it, that it behaves the way we need). If anything goes wrong, it will print a message about the problem and exit immediately. If all the tests pass, the configure script will generate ../config/externals.cf (from ../config/externals.cf.in), and then attempt to run the ./condor_imake script to generate a Makefile. The first time you run configure or condor_imake in a new build workspace, condor_imake will have to build a valid copy of imake from the source living in ../imake. Someday, we'll probably move our imake package to the externals tree, since ultimately, that's where it belongs. However, I didn't have time (or an urgent reason) to do this in the initial phase.

If you see any errors while running configure, the very first thing to check is to make sure "/p/condor/workspaces/build/bin" is at the front of your PATH. I expect that the majority of problems anyone on the Condor Team will have running configure will come from not doing that, since it means the configure script will either not find some programs it needs, or will find incompatible versions. In fact, to prevent you from getting very far with a broken PATH, the configure script is now so smart that if it notices you're running it on a machine in the "cs.wisc.edu" domain that has the directory "/s/std/bin", it assumes you're on a CSL-supported machine and verifies your PATH as the very first step. If "/p/condor/workspaces/build/bin" is not found, or if it comes after "/s/std/bin", "/usr/bin", "/bin", and/or "/usr/local/bin", the configure script now bails out with a verbose message explaining that you messed up.

Our configure script does support --help and prints a verbose message with all the options you can use. Not all of them are used by the Condor build, so I'll just mention the ones here that people might want to specify:

    * -C

      Perform caching. If you're going to be running configure a lot in a given workspace (for example, while you're testing something), you probably want to enable configure's caching mechanism. It will store the results of nearly all of its tests, and if you re-run configure, it'll just read the cached answers (which is very cheap), instead of re-performing all the tests (which can be quite expensive).

    * --with-externals=DIR

      Root of directory tree for external programs needed for building Condor (default is to search in ../externals)

      As a hidden feature for our ease of use, if --with-externals is not specified, and ../externals doesn't exist, the configure script will also look for /p/condor/workspaces/externals and see if that is valid for the platform you're building on. If that works, it uses it, otherwise, configure will exit with an error explaining that you need to give it the path to where you've checked out the externals tree.

    * --with-purecachedir=DIR

      Cache directory for objects instrumented with Purify (default is /tmp/<user>/.pcache)

      This is only useful on Solaris (or other platforms where we have Purify). This is needed so that different developers can run Purify at the same time without screwing up each other's instrumented system libraries. The default will prevent users from stepping on each other's toes. However, if you want to keep your Purify cache somewhere other than /tmp, you might want to specify this one when building on Solaris.

In addition, the configure script pays attention to certain shell and environment variables. Obviously, PATH is important. :) Certain other standard environment variables are used to specify things we need while building Condor. The most obvious of these is TMPDIR, a standard variable honored by many other programs. If it's set, Condor will use it during the build for creating certain tmp files; if it's not set, the default is /tmp.
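
Putting those pieces together, a typical hand-run invocation at our site might look something like this (the scratch paths are just examples; substitute your own):

    cd src
    TMPDIR=/scratch/<username>/tmp ./configure -C \
        --with-externals=/p/condor/workspaces/externals \
        --with-purecachedir=/scratch/<username>/.pcache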

6) Running make

After configure has generated externals.cf and run condor_imake, you'll have a "Makefile" sitting in your src directory. This is very similar to the old Condor Makefile.

The big difference is that if the externals tree you're using has not been built yet, one of the very first things make depends on is building the externals tree (i.e. make externals). If you're using /p/condor/workspaces/externals (the default), you won't see this at all, since it will already be pre-built. However, if you checked out your own version of the externals tree for some reason, you'll see make output referring to building the externals. The logs of the individual builds for each external are written to separate files (the filenames are printed to stdout from make), so if you want to follow the progress in another window, you can. Be warned, even on a fast machine, it can take an hour or so to build all the externals. :(
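
If you'd rather build (or rebuild) your own externals tree up front, instead of letting the main build trigger it, you can invoke the target directly (the per-package log filenames will be printed as it goes):

    cd src
    make externals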

Otherwise, the Makefile is pretty much identical to how it always worked. If you've got a local build workspace, the Makefile knows how to create symlinks back to the source workspace for all the files it needs, and to run condor_imake in each sub-directory to generate a Makefile in each one.

One new make target of interest:

make public now performs all the packaging needed for the Condor binaries we ship to the web. After building a complete dynamically-linked version of Condor (and a statically-linked one, if supported on a given platform), it calls make_final_tarballs for you to package up everything. This has the additional effect of creating a "public" directory where all these files end up, along with a tarball containing the entire contents of release_dir. This make rule is what we now use for the nightly Condor builds.
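
So a complete release-style build now boils down to something like this (just a sketch of the sequence; the exact set of tarballs varies by platform):

    cd src
    make public
    ls public        # the web-ready tarballs, plus a tarball of release_dir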

If you want the old behavior of make public (create all the dynamic and static binaries, then package up all the release and contrib tarballs in static_dir and strip_dir, but not the final tarballs to put on the web), you can use make all_tarballs. Sorry the names are still somewhat weird. I just spent enough time with this to get it to work for our needs, not to clean up all the old cruft from our build system.