{section: How to limit CPU usage of jobs}

_Known to work with HTCondor v8.2 and above_

When submitting a job to HTCondor, users can set =request_cpus= in the job submit description to the expected maximum number of CPU cores required by all the processes in the job. HTCondor will then schedule the job onto a slot that has at least as many cores as requested. This document describes three strategies for configuring HTCondor to prevent jobs from using more CPU cores than are allocated to the slot.  Each strategy has its own pros and cons, which are highlighted below.
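
As an illustration of the request side, a minimal submit description might look like the following sketch. The executable name =my_job.sh= and the value 4 are placeholders chosen for this example, not part of any real setup.
{code}
# Hypothetical submit description file; my_job.sh is a placeholder name.
executable   = my_job.sh
# Ask for a slot with at least 4 CPU cores.
request_cpus = 4
queue
{endcode}
The strategies below govern what happens if such a job tries to use more than the cores it asked for.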

The three strategies are:

1: Use ASSIGN_CPU_AFFINITY
2: Use cgroups
3: Use startd policy expressions to monitor cores used, and place jobs on hold that use more cores than were requested/allocated.

{subsection: Option 1: ASSIGN_CPU_AFFINITY}

To enable, just put the following line in your condor_config file(s) on all your execute nodes
{code}
ASSIGN_CPU_AFFINITY = True
{endcode}
and then issue a condor_reconfig command.  HTCondor will then configure the kernel to prevent your job from using more cores than it requested.  For instance, if your job is scheduled onto a one-core slot and starts 50 threads, all 50 threads will timeshare a single CPU core.

This requires that your execute node is running Linux and that you have only one condor_startd instance running per machine.  The condor_startd does not need to have root access.  One downside of this method is that even if there is no contention for CPU cores on the server, the job cannot use more cores than specified in the Cpus attribute of the slot.  That is, on an otherwise idle eight-core machine with only a single one-core slot running, the job in that slot can consume only one core.

{subsection: Option 2: cgroups}


This configuration requires a relatively recent version of Linux (RHEL 6.5 or above), only works when the HTCondor daemons have been started as root, and requires there only be one condor_startd instance running per machine.

To use this option, configure cgroups as described in section 3.12 of the manual: {link: http://research.cs.wisc.edu/htcondor/manual/current/3_12Setting_Up.html Cgroup-Based Process Tracking}. HTCondor will then configure the kernel to limit your job to the cores it requested.  However, unlike =ASSIGN_CPU_AFFINITY= above, this limit only applies when there is contention for the CPU. That is, on an otherwise idle eight-core machine with only a single one-core slot running, the job in that slot could consume all eight CPU cores concurrently. If, however, all eight slots had running jobs, each configured for one CPU core, CPU time would be divided equally among the jobs, regardless of the number of processes in each job.
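
As a rough sketch of the HTCondor side of this setup, the manual section linked above has the execute node name the cgroup it should use via the =BASE_CGROUP= configuration variable. The cgroup name =htcondor= below is an assumption for illustration, and the cgroup itself must already exist on the machine as the manual describes.
{code}
# Assumes a cgroup named "htcondor" has already been created on this
# execute node, as described in the manual section linked above.
BASE_CGROUP = htcondor
{endcode}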


{subsection: Option 3: startd policy expressions}

This configuration works on any operating system (Linux, Windows, MacOS), does not require root/administrator access, and continues to work even if there is more than one startd running per machine (and thus is applicable in a glide-in scenario).
Append the following to the condor_config file(s) on your execute machines to 1) advertise the number of CPU cores used by the job in the
slot ad as attribute =CpusUsage=, and 2) place jobs on hold that use more cores than requested.

{code}
#
# Publish the number of CPU cores being used by the job into
# the slot ad as attribute "CpusUsage".  This value will
# be the average number of cores used by the job over the
# past minute, sampling every 5 seconds, and truncated to two decimal places (X.XX).
#
CpusUsage = ifthenelse( \
   TotalLoadAvg > 0.0 && Activity!="Idle", \
   int(CondorLoadAvg / TotalLoadAvg * \
       ifthenelse(TotalLoadAvg < $(DETECTED_CORES), TotalLoadAvg, $(DETECTED_CORES)) \
       * 100) / 100.0, \
    0)
STARTD_EXPRS = $(STARTD_EXPRS) CpusUsage
#
# If the number of CPU cores used by the job exceeds the
# number of cores requested by more than 0.8, put the
# job on hold with a helpful message in job attribute "HoldReason",
# and job attributes HoldReasonCode set to 21 and HoldReasonSubCode set to 1.
# Note that to place the job on hold, we first eliminate any
# retirement time and preempt the job.
#
CPU_EXCEEDED = (CpusUsage > Cpus + 0.8)
PREEMPT = ($(PREEMPT:False)) || $(CPU_EXCEEDED)
MAXJOBRETIREMENTTIME = ifthenelse($(CPU_EXCEEDED),0,$(MAXJOBRETIREMENTTIME:0))
WANT_SUSPEND = ($(WANT_SUSPEND:False)) && $(CPU_EXCEEDED) =!= TRUE
WANT_HOLD = ($(WANT_HOLD:False)) || $(CPU_EXCEEDED)
WANT_HOLD_REASON = ifThenElse($(CPU_EXCEEDED), \
     "cpu usage exceeded request_cpus",$(WANT_HOLD_REASON:UNDEFINED))
WANT_HOLD_SUBCODE = ifThenElse($(CPU_EXCEEDED),1,$(WANT_HOLD_SUBCODE:UNDEFINED))
{endcode}
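
To make the threshold arithmetic concrete, the sketch below walks through one set of example values and shows one possible way to make the 0.8 tolerance adjustable. The macro name =CPU_EXCEEDED_TOLERANCE= is hypothetical (it is not a built-in HTCondor setting), and the load values are invented for illustration. If used, the =CPU_EXCEEDED= line here would take the place of the one in the configuration above.
{code}
# Hypothetical macro name; not a built-in HTCondor knob.
CPU_EXCEEDED_TOLERANCE = 0.8
CPU_EXCEEDED = (CpusUsage > Cpus + $(CPU_EXCEEDED_TOLERANCE))
#
# Worked example on an 8-core machine (DETECTED_CORES = 8) with a busy
# slot, TotalLoadAvg = 10.0 and CondorLoadAvg = 5.0:
#   TotalLoadAvg is not below 8, so the inner ifthenelse yields 8
#   CpusUsage = int(5.0 / 10.0 * 8 * 100) / 100.0 = 4.0
# Slot with Cpus = 2:  4.0 > 2 + 0.8 is True, so the job is put on hold.
# Slot with Cpus = 4:  4.0 > 4 + 0.8 is False, so the job keeps running.
{endcode}
A larger tolerance gives jobs more headroom for short bursts above their request, while a smaller one holds them sooner.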