The techniques described here depend on HTCondor version 8.1.4 or a later version. Those with less current versions may try an older, less flexible technique as
described at HowToManageGpusInSeriesSeven.
-The general technique to manage GPUs is:
+The general technique to manage GPUs has three steps:
-1: Advertise the GPU: configure HTCondor so that execute nodes include information about available GPUs in their machine ClassAd.
-2: Job requests a GPU: jobs identify their request for a GPU and specify any further specific requirements of the GPU, in order to acquire a suitable GPU
-3: Identify the GPU: Jobs modify their arguments or environment to learn which GPU they may use.
+1: Advertise the GPU by configuring HTCondor such that an execute node includes information about available GPUs in its machine ClassAd.
+2: A job requests a GPU, specifying any further requirements it has for the GPU, in order to acquire a suitable one.
+3: The job identifies the GPU through its arguments or environment, to learn which GPU it may use.
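As a rough illustration of step 3, the following Python sketch shows one way a job might read its GPU assignment from the environment. It assumes the identifiers arrive as a comma-separated list in an environment variable. The default name used here, CUDA_VISIBLE_DEVICES, is the variable the CUDA runtime consults and is an assumption for this example, not something this page specifies. The helper name assigned_gpus is hypothetical.

```python
import os


def assigned_gpus(env=None, var="CUDA_VISIBLE_DEVICES"):
    """Return the GPU identifiers listed in an environment variable.

    Assumption: the variable holds a comma-separated list such as "0,1".
    The variable name is a parameter because the name that applies
    depends on how the execute node is configured.
    """
    env = os.environ if env is None else env
    raw = env.get(var, "")
    # Drop empty entries so an unset or blank variable yields an empty list.
    return [item.strip() for item in raw.split(",") if item.strip()]


if __name__ == "__main__":
    gpus = assigned_gpus()
    if gpus:
        print("GPUs available to this job:", ", ".join(gpus))
    else:
        print("No GPU identifiers found in the environment")
```

A job that receives its GPU through arguments instead would parse sys.argv in the same spirit. Which of the two applies depends on the submit description file.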
-{section: Advertise the GPU}
+{section: 1. Advertise the GPU}
The availability of GPU resources must be advertised in the machine's ClassAd, in order for jobs that need GPUs to be matched with machines that have GPUs.
@@ -79,7 +79,7 @@
When using Partitionable slots, by default the Partitionable slot will be assigned all GPUs. Dynamic slots created from the Partitionable slot will be assigned GPUs when the job requests them.
-{section: Job requests a GPU}
+{section: 2. A job requests a GPU}
User jobs that require a GPU must specify this requirement. In a job's submit description file, the simple request is