HTCondorWiki: Folding At Home Dynamic Slots

Audience

This recipe is for administrators of an HTCondor pool that is configured to use partitionable slots and who want to run Folding@home as backfill jobs on those machines.

Strategy

We will set up a set of static backfill slots that share the resources of the primary partitionable slot, but use those resources only when the primary slot is idle or partially idle. The backfill slots are not configured to have GPUs, since there is currently no way in HTCondor to reliably share a GPU between slots.

We will then use HTCondor's "fetch work" configuration to run a script for each backfill slot that periodically generates a job classad for Folding@home work. Those jobs will have a retirement time of 0 so that they can be preempted as soon as the primary slot wants to use the resources. These jobs will all be run by a dedicated local user named backfill and will run in a fixed directory for each slot. This allows the Folding@home software to save progress periodically and then resume that work later if it is preempted.
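
The behavior of such a fetch-work script can be sketched in Python. This is not the script distributed by the HTCondor project, only a minimal illustration that assumes what the configuration comments describe: the slot classad arrives on standard input, the slot id is exported as _CONDOR_BACKFILL_SLOTID, and condor_config_val -macro '$(FAH_JOB)' prints the job classad. The helper names parse_slot_id and hook_environment are hypothetical, and the sketch assumes the slot classad contains a line of the form SlotID = N.

```python
import os
import re
import subprocess
import sys

# Matches a line such as `SlotID = 3` in the slot classad text (assumed format).
SLOT_ID_RE = re.compile(r"^\s*SlotID\s*=\s*(\d+)\s*$", re.IGNORECASE | re.MULTILINE)


def parse_slot_id(slot_ad_text):
    """Return the integer SlotID found in the slot classad text, or None."""
    match = SLOT_ID_RE.search(slot_ad_text)
    return int(match.group(1)) if match else None


def hook_environment(slot_id, base_env):
    """Copy base_env and add the variable that $(BACKFILL_SLOTID) expands from."""
    env = dict(base_env)
    env["_CONDOR_BACKFILL_SLOTID"] = str(slot_id)
    return env


def main():
    slot_id = parse_slot_id(sys.stdin.read())
    if slot_id is None:
        # No SlotID seen: print nothing, so no job classad is offered.
        return 1
    # Expand the FAH_JOB template with the slot id visible to the config system.
    result = subprocess.run(
        ["condor_config_val", "-macro", "$(FAH_JOB)"],
        env=hook_environment(slot_id, os.environ),
        capture_output=True,
        text=True,
        check=True,
    )
    sys.stdout.write(result.stdout)
    return 0
```

To act as the hook, a file like this would end with a call such as sys.exit(main()) and be installed at the path that FAH_HOOK_FETCH_WORK names.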

The configuration fragment below can be customized by modifying the BackfillCpusPerSlot, BackfillMemoryPerSlot, FAH_Owner, FAH_HOME and FAH_Executable configuration variables near the top of the file.
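
As a rough illustration of what these variables turn into, the Python sketch below derives the per-slot work directories and the memory argument passed to the Folding@home client. It assumes that the partitionable slot has slot id 1 and that the backfill slots are numbered from 2 upward, which matches the example paths in the configuration comments. The function names are hypothetical.

```python
from pathlib import PurePosixPath


def backfill_work_dirs(fah_home, num_backfill_slots):
    """Return the slot<SlotID> directories; slot1 is assumed to be the pslot."""
    first_backfill_id = 2
    last_backfill_id = first_backfill_id + num_backfill_slots
    return [
        str(PurePosixPath(fah_home) / f"slot{slot_id}")
        for slot_id in range(first_backfill_id, last_backfill_id)
    ]


def fah_memory_bytes(backfill_memory_per_slot_mb):
    """Mirror FAH_MemArg: the per-slot memory in MB multiplied by 1024*1024."""
    return backfill_memory_per_slot_mb * 1024 * 1024


if __name__ == "__main__":
    for path in backfill_work_dirs("/opt/fah", 3):
        print(path)
    print(fah_memory_bytes(1000))
```

Because each job's Iwd points at one of these directories, they presumably need to exist and be writable by the FAH_Owner user before the first job starts.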

pslot_backfill_folding_at_home.config

#  Configuration to add static backfill slots running Folding@home to an HTCondor execute
#  node that uses partitionable slots
#
# each backfill slot will have the following resources
#
BackfillCpusPerSlot = 1
BackfillMemoryPerSlot = 1000

# The local user to run FAH jobs as, you must create this user
#
FAH_Owner = backfill

# Folding at home settings, change to match your install path and work directory
# Each slot will use a subdirectory under this path named from the slotid of the slot
# e.g.  /opt/fah/slot2, /opt/fah/slot3, etc.
#
FAH_HOME = /opt/fah
FAH_Executable = /usr/bin/FAHClient

# Arguments to the Folding@Home client for each slot
#
# Checkpoint every N minutes, default is 15
FAH_CheckpointArg = 15
FAH_CpusArg       = $(BackfillCpusPerSlot)
FAH_MemArg        = $(BackfillMemoryPerSlot)*1024*1024

FAH_Arguments = --config $(FAH_HOME)/config.xml \
                --data-directory . \
                --checkpoint $INT(FAH_CheckpointArg) \
                --client-threads $INT(FAH_CpusArg) \
                --memory $INT(FAH_MemArg) \
                --cpus $(FAH_CpusArg)

# the script that HTCondor will use to get FAH jobs, copy this script from the HTCondor website
#
# this script extracts slot id from stdin and
# puts it into the environment as _CONDOR_BACKFILL_SLOTID=<SlotId>
# then runs
#    condor_config_val -macro '$(FAH_JOB)'
# to generate a job classad for that slot
#
FAH_HOOK_FETCH_WORK = $(FAH_HOME)/fetch_work

#
# ----- configuration of the backfill slots and workfetch ------
#

#
# Template for the Folding@home job that the FAH_HOOK_FETCH_WORK script will return
#
FAH_JOB @=end
  Cmd = "$(FAH_Executable)"
  Iwd = "$(FAH_HOME)/slot$(BACKFILL_SLOTID)"
  Owner = "$(FAH_Owner)"
  User = "$(FAH_Owner)@$(UID_DOMAIN)"
  JobUniverse = 5
  Arguments = "$(FAH_Arguments)"
  Out = "fah.out"
  Err = "fah.err"
  NiceUser = true
  Requirements = BackfillSlot && (State == "Unclaimed")
  MaxJobRetirementTime = 0
  JobLeaseDuration = 604800
  RequestCpus = $(BackfillCpusPerSlot)
  RequestMemory = $(BackfillMemoryPerSlot)
@end


# First tell the startd we have double the amount of cpus and memory that we actually have
# and advertise some additional information into the slots (such as the amount
# of Cpus and Memory left over in the pslot).
#
num_cpus = 2 * $(DETECTED_CPUS)
memory = 2 * $(DETECTED_MEMORY)
startd_attrs = $(startd_attrs) BackfillSlot
startd_slot_attrs = $(startd_slot_attrs) Cpus Memory

# Decrease the startd polling interval so backfill jobs are killed quickly when
# production jobs arrive.
#
polling_interval = 2

# Set up a pslot for the production jobs.  If you already have a p-slot configuration
# you only need to add the slot_type_1_STARTD_JOB_HOOK_KEYWORD setting below. It prevents
# workfetch jobs from starting on the p-slot.
#
slot_type_1_partitionable = true
slot_type_1 = cpus=$(DETECTED_CPUS) memory=$(DETECTED_MEMORY) gpus=100% disk=50% swap=50%
num_slots_type_1 = 1

# Disable workfetch for the p-slot
slot_type_1_STARTD_JOB_HOOK_KEYWORD =

#
# Set up a set of static slots for backfill jobs.  The number of static slots
# is determined by how many CPUs and how much memory each slot requires.  Set the START
# expression to only accept backfill jobs, and only if the pslot has enough free resources.
# Preempt backfill slots one at a time as unused resources in the pslot disappear.
# Give these slots a custom name of "backfillX@foo", and a custom attribute of
# BackfillSlot=True.  Disable vacate on these slots, so that jobs are immediately killed
# upon preemption (we want the resources freed up asap for the production jobs).
#
NumBackfillSlots = min( { $(DETECTED_CPUS) / $(BackfillCpusPerSlot), $(DETECTED_MEMORY) / $(BackfillMemoryPerSlot) } )
Backfill_Cpus_Exhausted = (slot1_Cpus / $(BackfillCpusPerSlot) < (SlotID - 1))
Backfill_Memory_Exhausted = (slot1_Memory / $(BackfillMemoryPerSlot) < (SlotID - 1))
Backfill_Resources_Exhausted = ( $(Backfill_Cpus_Exhausted) || $(Backfill_Memory_Exhausted) )
slot_type_2_partitionable = false
slot_type_2_name_prefix = backfill
slot_type_2 = cpus=$(BackfillCpusPerSlot) memory=$(BackfillMemoryPerSlot), gpus=0
num_slots_type_2 = $INT(NumBackfillSlots)
slot_type_2_BackfillSlot = True
slot_type_2_start = (Owner == "$(FAH_Owner)") && ($(Backfill_Resources_Exhausted) == False)
slot_type_2_preempt = $(Backfill_Resources_Exhausted)
slot_type_2_want_vacate = False

# enable work fetch for this slot type
slot_type_2_STARTD_JOB_HOOK_KEYWORD = FAH
slot_type_2_FetchWorkDelay = 120
slot_type_2_Rank =
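
The slot-count and exhaustion expressions in the configuration above can be sanity-checked with a small Python model. This is only a sketch under stated assumptions: it treats the classad divisions as integer divisions (the operands here are integers), takes slot1_Cpus and slot1_Memory to be the resources still unallocated in the partitionable slot, and assumes the backfill slots have slot ids starting at 2. The function names are hypothetical.

```python
def num_backfill_slots(detected_cpus, detected_memory_mb,
                       cpus_per_slot=1, memory_per_slot_mb=1000):
    """Model NumBackfillSlots: the smaller of the CPU and memory slot counts."""
    return min(detected_cpus // cpus_per_slot,
               detected_memory_mb // memory_per_slot_mb)


def resources_exhausted(slot_id, pslot_free_cpus, pslot_free_memory_mb,
                        cpus_per_slot=1, memory_per_slot_mb=1000):
    """Model Backfill_Resources_Exhausted for one backfill slot."""
    cpus_exhausted = pslot_free_cpus // cpus_per_slot < slot_id - 1
    memory_exhausted = pslot_free_memory_mb // memory_per_slot_mb < slot_id - 1
    return cpus_exhausted or memory_exhausted


def runnable_backfill_slots(total_backfill_slots, pslot_free_cpus,
                            pslot_free_memory_mb):
    """Slot ids whose START expression would still allow a backfill job."""
    return [
        slot_id
        for slot_id in range(2, 2 + total_backfill_slots)
        if not resources_exhausted(slot_id, pslot_free_cpus, pslot_free_memory_mb)
    ]


if __name__ == "__main__":
    total = num_backfill_slots(8, 16000)
    # Production jobs hold 5 of 8 CPUs, so 3 CPUs remain free in the pslot.
    print(total, runnable_backfill_slots(total, 3, 16000))
```

In this model, each additional CPU claimed by production jobs makes the highest-numbered runnable backfill slot ineligible first, which is how the configuration preempts backfill slots one at a time.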