How to configure priorities/quotas for groups of users

By default, HTCondor provides fair sharing between individual users by keeping track of usage and adjusting their relative user priorities. Frequently, however, there is a need to allocate resources at a more abstract level. Suppose you have a single HTCondor pool shared by several groups of users. Your goal is to configure HTCondor so that each group gets its fair share of the computing resources and, within each group, each user gets a fair share relative to other members of the group. What counts as fair depends on circumstances, such as whether some groups own a larger share of the machines than others and whether some groups have a different pattern of usage, such as occasional bursts of computation versus steady demand.

The following recipes can be adapted to a variety of such situations.

How to divide the machines by the machine's RANK

You can give a group of users higher priority on a specific set of machines by configuring the RANK expression. Example:

# This machine belongs to the "biology" group.
MachineOwner = "biology"

# Advertise MachineOwner in the machine ad so that RANK can refer to it.
STARTD_ATTRS = $(STARTD_ATTRS) MachineOwner
# Give high priority on this machine to the group that owns it.
RANK = TARGET.Group =?= MY.MachineOwner

The above example requires that jobs be submitted with an additional custom attribute, called "Group" in this example, that declares which group they belong to. See How to insert custom ClassAd attributes into a job ad for different ways of doing that. The user may set this explicitly in the job's submit description file with

+Group = "biology"
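When every user of a submit machine belongs to the same group, the administrator could set the attribute for them instead. The following is a minimal sketch, assuming the SUBMIT_ATTRS configuration variable (named SUBMIT_EXPRS in older releases) is available on the submit machine; the "biology" value simply mirrors the example above.

```text
# Submit-machine configuration (assumed setup, not from the recipe above).
# Define the attribute value, then ask condor_submit to insert it
# into every job ad created on this machine.
Group = "biology"
SUBMIT_ATTRS = $(SUBMIT_ATTRS) Group
```

A value set this way acts as a default rather than an enforced policy, so it suits cooperative environments best.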

If group membership is not likely to change frequently, or you really don't want users to have to declare their group membership, you could configure the machine RANK expression to look at the built-in TARGET.User attribute, rather than relying on the custom attribute TARGET.Group. An example configuration that implements this:

MachineOwners = "user1@biology.wisc.edu user2@biology.wisc.edu"

STARTD_ATTRS = $(STARTD_ATTRS) MachineOwners
RANK = stringListMember(TARGET.User,MY.MachineOwners)

Since RANK is an arbitrary ClassAd expression, you can customize the policy in a variety of ways. For example, there could be a group with second priority on the machines. Or you could specify that some types of jobs (identified by some other ClassAd attribute in the job) have higher priorities than others. In that case, write the expression so that it produces a higher number for higher priority jobs.
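As an illustration of a second-priority group, the sketch below extends the first example. The attribute name SecondaryOwner and the "chemistry" group are hypothetical, chosen only for this example; the arithmetic relies on ClassAd booleans being usable as 0 or 1 in numeric expressions.

```text
# This machine belongs to "biology"; "chemistry" gets second priority.
# SecondaryOwner is a hypothetical attribute name for this sketch.
MachineOwner = "biology"
SecondaryOwner = "chemistry"
STARTD_ATTRS = $(STARTD_ATTRS) MachineOwner SecondaryOwner

# Owner's jobs rank 2, second-priority group's jobs rank 1, all others 0.
RANK = 2 * (TARGET.Group =?= MY.MachineOwner) + 1 * (TARGET.Group =?= MY.SecondaryOwner)
```

With this expression, a biology job can preempt a chemistry job, and either can preempt a job from any other group.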

The downside of RANK is that it involves preemption. RANK only comes into play when there is an existing job on a machine and the negotiator is considering whether a new job should preempt it. You can control how quickly the new job replaces the lower-priority job using MaxJobRetirementTime, as described in How to disable preemption. By default, the preemption happens immediately. A RANK-based policy is most appropriate in HTCondor pools where groups own specific machines and want guaranteed access to them whenever they need them.
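To soften immediate preemption, a retirement time can be set on the machine. The sketch below assumes a one-hour limit, which is an arbitrary value chosen for illustration; pick a value that matches typical job lengths in your pool.

```text
# Startd configuration: a job that is about to be preempted may keep
# running until it has accumulated 3600 seconds of total run time.
# The default is 0, which means preemption happens immediately.
MAXJOBRETIREMENTTIME = 3600
```

Because the limit counts the job's total run time, a job that has already run for longer than an hour is still preempted right away.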

Given a choice of two machines to run a job on, it is a good idea to steer jobs towards machines that rank them higher so they stand less of a chance of being preempted in the future. Here is an example configuration that preferentially runs jobs where they are most highly ranked and secondarily prefers to run jobs on idle machines rather than claimed machines:

NEGOTIATOR_PRE_JOB_RANK = 10 * (MY.RANK) + 1 * (RemoteOwner =?= UNDEFINED)

Note that preemption by RANK trumps considerations of user priority within the pool. For example, if user A with a high (bad) user priority is competing with user B with a low (good) user priority, user B will be able to claim more idle machines, but user A can then preempt user B on any machine whose RANK is higher for user A's jobs. The relative user priorities therefore only matter when the RANK values are equal or when vying for unclaimed machines.

Group accounting to implement pool shares

Whereas RANK can be used to give groups preemptive priority on specific machines, HTCondor's group accounting system can be used to give groups a share of a pool that is not tied to specific machines. The HTCondor manual sections on Group Accounting and Accounting Groups with Hierarchical Group Quotas cover this topic. Refer to those sections for descriptions and implementation examples.
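For orientation, a group quota setup might look roughly like the sketch below. The group names and the 60/40 split are assumptions made up for this example, and the manual remains the authoritative reference for these settings.

```text
# Negotiator configuration: two hypothetical groups sharing the pool.
GROUP_NAMES = group_biology, group_chemistry

# Dynamic quotas are fractions of the pool rather than fixed slot counts.
GROUP_QUOTA_DYNAMIC_group_biology = 0.6
GROUP_QUOTA_DYNAMIC_group_chemistry = 0.4

# Let a group use slots that another group is not currently using.
GROUP_ACCEPT_SURPLUS = True
```

A job would then declare its group in the submit description file with accounting_group = group_biology, optionally together with accounting_group_user to name the user within the group.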