How to configure priorities/quotas for groups of users

Known to work with Condor version: 7.0

By default, Condor provides fair sharing between individual users by keeping track of usage and adjusting their relative user priorities in the pool. Frequently, however, there is a need to allocate resources at a higher level. Suppose you have a single Condor pool shared by several groups of users. Your goal is to configure Condor so that each group gets its fair share of the computing resources and, within each group, each user gets a fair share relative to other members of the group. What is "fair" depends on circumstances, such as whether some groups own a larger share of the machines than others and whether some groups have a different pattern of usage, such as occasional bursts of computation versus steady demand.

The following recipes can be adapted to a variety of such situations.

How to divide the machines by startd RANK

You can give a group of users higher priority on a specific set of machines by using the startd RANK expression. Example:

# This machine belongs to the "biology" group.
MachineOwner = "biology"

# Give high priority on this machine to the group that owns it.
STARTD_ATTRS = $(STARTD_ATTRS) MachineOwner
RANK = TARGET.Group =?= MY.MachineOwner

The above example requires that jobs be submitted with an additional custom attribute "Group" that declares what group they belong to. See How to insert custom ClassAd attributes into a job ad for different ways of doing that. One way is simply for the user to do it explicitly in the submit file:

+Group = "biology"
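
If every job submitted from a given machine belongs to the same group, one alternative is to have the submit machine's configuration insert the attribute automatically. The following is a minimal sketch using SUBMIT_EXPRS; it assumes that all users of this submit machine are in the "biology" group, and the macro name Group is chosen here only to match the attribute used in the RANK example above.

```
# In the configuration of the submit machine (assumed to serve only
# the "biology" group): add Group = "biology" to every job ad.
Group = "biology"
SUBMIT_EXPRS = $(SUBMIT_EXPRS) Group
```

A user who explicitly sets +Group in the submit file may still end up with a different value, so this is a convenience rather than an enforcement mechanism.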

If group membership is not likely to change frequently, or you really don't want users to have to declare their group membership, you could configure the startd RANK expression to look at the built-in TARGET.User attribute rather than relying on the custom attribute TARGET.Group. Example:

MachineOwners = "user1@biology.wisc.edu user2@biology.wisc.edu"

STARTD_ATTRS = $(STARTD_ATTRS) MachineOwners
RANK = stringListMember(TARGET.User,MY.MachineOwners)

Since RANK is an arbitrary ClassAd expression, you can customize the policy in a number of ways. For example, there could be a group with second priority on the machines. Or you could specify that some types of jobs (identified by some other ClassAd attribute in the job) have higher priorities than others. You just need to write the expression so that it produces a higher number for higher priority jobs.
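
As an illustration of a second-priority group, here is a sketch that extends the first example. The attribute name SecondaryOwner and the choice of "chemistry" as the second group are assumptions made for this example, not built-in names.

```
# This machine belongs to "biology"; "chemistry" gets second priority.
MachineOwner = "biology"
SecondaryOwner = "chemistry"

STARTD_ATTRS = $(STARTD_ATTRS) MachineOwner SecondaryOwner

# Evaluates to 2 for the owning group, 1 for the secondary group,
# and 0 for everyone else.
RANK = 2 * (TARGET.Group =?= MY.MachineOwner) + 1 * (TARGET.Group =?= MY.SecondaryOwner)
```

With this expression, a biology job can preempt a chemistry job, and a chemistry job can preempt a job from any other group.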

The downside of RANK is that it involves preemption. RANK only comes into play when there is an existing job on a machine and the negotiator is considering whether a new job should preempt it. You can control how quickly the new job replaces the lower priority job using MaxJobRetirementTime, as described in How to disable preemption. By default, the preemption will happen immediately. Startd RANK is therefore most appropriate in Condor pools where groups own specific machines and want guaranteed access to them whenever they need them.
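
For example, to give a running job some time to finish before it is preempted, the startd configuration could set a retirement time. This is only a sketch; the one-hour value is an arbitrary assumption and should be tuned to the typical job length in your pool.

```
# Allow a job that is being preempted up to one hour (in seconds)
# of additional run time before it is forced off the machine.
MAXJOBRETIREMENTTIME = 3600
```

A larger value protects running jobs at the cost of making the owning group wait longer for its machines.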

Given a choice of two machines to run a job on, it is a good idea to steer jobs towards machines that rank them higher so they stand less of a chance of being preempted in the future. Here is an example configuration that preferentially runs jobs where they are most highly ranked and secondarily prefers to run jobs on idle machines rather than claimed machines:

NEGOTIATOR_PRE_JOB_RANK = 10 * (MY.RANK) + 1 * (RemoteOwner =?= UNDEFINED)

Note that preemption by startd RANK trumps considerations of user priority in the pool. For example, if user A with a high (bad) user priority is competing with another user B with a low (good) user priority, user B, with the better user priority, will be able to claim more idle machines, but user A can then preempt user B if the startd RANK is higher. The relative user priorities therefore only matter when the startd RANKs are equal or when vying for unclaimed machines.

How to use group accounting to implement pool shares

Whereas startd RANK can be used to give groups preemptive priority on specific machines, Condor's group accounting system can be used to give groups a share of a pool that is not tied to specific machines. The Condor manual section on Group Accounting covers this topic.

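Basically, there are two options: use group membership exclusively for sharing the pool, or combine group quotas with individual user priorities.

Quick examples of these two options are provided below.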

How to use group membership exclusively for sharing the pool

Users must simply declare their membership in a group. This is done in the submit file:

+AccountingGroup = "group_physics"

The administrator can then adjust the priority factor for the various groups using condor_userprio. The following example sets the biology group's priority factor to be twice that of physics, which means the physics group should get twice as big a share of the pool (because a group's share of the pool is inversely proportional to its user priority value).

condor_userprio -setfactor 2.0 group_physics@wisc.edu
condor_userprio -setfactor 4.0 group_biology@wisc.edu

How to have group quotas and individual priorities for sharing the pool

In the negotiator configuration:

# define the groups in your pool
GROUP_NAMES = group_physics, group_biology, group_chemistry

# specify what share of the pool (# of batch slots) each group should have
GROUP_QUOTA_group_physics   = 100
GROUP_QUOTA_group_biology   =  75
GROUP_QUOTA_group_chemistry =  80

# specify that once a group has used its full quota, users from within
# the group can still vie for additional resources as individuals
GROUP_AUTOREGROUP = True

The submit file of the job needs to specify which accounting group and user within the group it belongs to:

+AccountingGroup = "group_physics.newton"
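
Put together, a complete submit file might look like the following sketch. The executable name simulate is hypothetical, and the vanilla universe is assumed only for illustration.

```
# Minimal submit file for user "newton" in the physics group.
universe   = vanilla
executable = simulate
+AccountingGroup = "group_physics.newton"
queue
```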