Installing and Maintaining HTCondor-CE
NOTE: If you are installing an HTCondor-CE for the Open Science Grid, consult the OSG-specific installation guide.
The HTCondor-CE software is a job gateway for a grid Compute Element (CE). As such, HTCondor-CE is the entry point for jobs coming from the grid: it authorizes incoming jobs and delegates them to your local batch system. See the OSG HTCondor-CE Overview for a more detailed introduction.
Use this page to learn how to install, configure, run, test, and troubleshoot HTCondor-CE from the HTCondor Yum repositories.
Before starting the installation process, consider the following points (consulting the Reference section below as needed):
- User IDs: If they do not exist already, the installation will create the Linux user
- SSL certificate: The HTCondor-CE service uses a host certificate at
/etc/grid-security/hostcert.pem and an accompanying key at
- DNS entries: Forward and reverse DNS must resolve for the HTCondor-CE host
- Network ports: The pilot factories must be able to contact your HTCondor-CE service on port 9619 (TCP)
- Submit host: HTCondor-CE should be installed on a host that can already submit jobs to your local cluster, which must run a supported batch system (HTCondor, LSF, PBS/Torque, SGE, Slurm)
- File Systems: Non-HTCondor batch systems require a shared file system between the HTCondor-CE host and the batch system worker nodes.
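The certificate and DNS prerequisites above can be checked from a shell before installing anything. The following is a minimal sketch and not part of HTCondor-CE: the function names are made up for illustration, and it assumes `getent` and `awk` are available on the host. File paths are passed as arguments rather than hard-coded.

```shell
#!/bin/sh
# Illustrative pre-installation checks for an HTCondor-CE host.
# All function names here are hypothetical helpers, not HTCondor-CE tools.

# check_file PATH: succeed if PATH is a readable, non-empty regular file.
check_file() {
    [ -f "$1" ] && [ -r "$1" ] && [ -s "$1" ]
}

# forward_lookup NAME: print the first address that NAME resolves to.
forward_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $1 }'
}

# reverse_lookup ADDR: print the first name that ADDR resolves back to.
reverse_lookup() {
    getent hosts "$1" | awk 'NR == 1 { print $2 }'
}

# check_dns NAME: succeed if NAME resolves to an address and that
# address resolves back to some name.
check_dns() {
    addr=$(forward_lookup "$1")
    [ -n "$addr" ] || return 1
    name=$(reverse_lookup "$addr")
    [ -n "$name" ]
}
```

For example, `check_file /etc/grid-security/hostcert.pem && check_dns "$(hostname -f)"` returns success only if the host certificate is present and the host name resolves in both directions. The sketch does not test reachability of port 9619 from outside, which has to be verified from a remote machine.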
There are some one-time (per host) steps to prepare in advance:
- Ensure the host has a supported operating system (Red Hat Enterprise Linux variant version 6 or 7)
- Obtain root access to the host
- Prepare the EPEL and HTCondor Yum repositories
- Install CA certificates and VO data into
- Install the appropriate HTCondor-CE package for your batch system:
|If your batch system is...||Then run the following command...|
- Install the certificate revocation list updater available from EPEL:
# yum install fetch-crl
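Once fetch-crl has run, it can be useful to confirm that the downloaded revocation lists are being refreshed. The sketch below is an illustration only: the function name is hypothetical, the directory is passed as an argument, and it assumes that CRLs are stored as files ending in `.r0`.

```shell
#!/bin/sh
# count_stale_crls DIR DAYS: print how many CRL files (*.r0, an assumed
# naming convention) in DIR were last modified more than DAYS days ago.
count_stale_crls() {
    find "$1" -name '*.r0' -mtime +"$2" | wc -l | tr -d ' '
}
```

A result of `0` for a small number of days suggests the updater is running regularly. A non-zero result is a prompt to check that fetch-crl is scheduled and can reach the CRL distribution points.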
To authorize the local HTCondor-CE daemons and job submission from external users and VOs, edit the contents of
/etc/condor-ce/condor_mapfile, following the comments in that file.
HTCondor-CE version updates may generate condor_mapfile.rpmnew files, whose changes should be merged into
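To see what an update changed before merging by hand, the packaged `.rpmnew` file can be compared with the active one. This is a minimal sketch using the standard `diff -u`; the function name is hypothetical and the script only reports differences, it does not modify either file.

```shell
#!/bin/sh
# show_rpmnew_diff FILE: compare FILE with FILE.rpmnew, if the latter exists.
# Returns 0 when there is nothing to merge, 1 when the files differ
# (the unified diff is printed to standard output).
show_rpmnew_diff() {
    active=$1
    packaged=$1.rpmnew
    if [ ! -f "$packaged" ]; then
        echo "no $packaged found; nothing to merge"
        return 0
    fi
    # diff exits non-zero when the files differ, which is the case of interest.
    if diff -u "$active" "$packaged"; then
        echo "$packaged is identical to $active"
        return 0
    fi
    return 1
}
```

A typical call would be `show_rpmnew_diff /etc/condor-ce/condor_mapfile`. Lines prefixed with `+` in the output come from the newly packaged file and are candidates for merging.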
Configuring Non-HTCondor Batch Systems
HTCondor-CE requires a shared filesystem between the CE and the worker nodes to support file transfer for non-HTCondor batch systems. See the documentation here for details.
HTCondor-CE uses the Batch Language ASCII Helper Protocol (BLAHP) to submit jobs to non-HTCondor batch systems and to track them there.
To work with the HTCondor-CE, modify
/etc/blah.config using the following steps:
- Disable BLAHP handling of certificate proxies:
blah_disable_wn_proxy_renewal=yes
blah_delegate_renewed_proxies=no
blah_disable_limited_proxy=yes
- (Optional) If your batch system tools are installed in a non-standard location (i.e., outside of
/usr/bin/), set the corresponding
*_binpath variable to the directory containing your batch system tools:
|If your batch system is...||Then change the following configuration variable...|
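The proxy-related settings from the steps above can be applied repeatably instead of editing the file by hand. The following is a sketch under stated assumptions: both function names are hypothetical, the file is assumed to use plain `key=value` lines, and `awk` and `mktemp` are assumed to be available. It replaces an existing assignment of a key or appends the key if it is absent.

```shell
#!/bin/sh
# set_blah_option FILE KEY VALUE: ensure FILE contains exactly one
# "KEY=VALUE" line, replacing existing assignments of KEY or appending one.
set_blah_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    awk -v key="$key" -v value="$value" '
        # An existing assignment of this key, with optional whitespace.
        $0 ~ "^[ \t]*" key "[ \t]*=" {
            if (!done) { print key "=" value; done = 1 }
            next
        }
        { print }
        END { if (!done) print key "=" value }
    ' "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps owner and mode
    status=$?
    rm -f "$tmp"
    return $status
}

# apply_ce_proxy_settings FILE: disable BLAHP proxy handling using the
# three settings listed in the text.
apply_ce_proxy_settings() {
    set_blah_option "$1" blah_disable_wn_proxy_renewal yes &&
    set_blah_option "$1" blah_delegate_renewed_proxies no &&
    set_blah_option "$1" blah_disable_limited_proxy yes
}
```

Running `apply_ce_proxy_settings /etc/blah.config` as root applies all three settings, and running it again leaves the file unchanged. The same `set_blah_option` helper could set a `*_binpath` variable when the batch system tools live outside /usr/bin/.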
If your site needs to report to BDII, consult the HTCondor-CE BDII Provider documentation.
See this section of the OSG installation guide for additional optional HTCondor-CE configurations. Note that the "Accounting with multiple CEs or local user jobs" and "HTCondor-CE monitoring web interface" sections are both OSG-specific and can be safely ignored.
See this section of the OSG installation guide for how to use HTCondor-CE, including starting and stopping services. Note that Gratia is accounting software specific to the OSG and can be safely ignored.
See this section of the OSG installation guide for steps to validate your HTCondor-CE.
See this OSG guide for troubleshooting tips and strategies.