Ticket #1992: Implement First Come, First Served for dagman jobs

Implement First Come, First Served for dagman jobs, where an entire dagman job can be prioritized over another dagman job. Currently, jobs within different DAGs are prioritized by qdate and individual prios. There is no way to prioritize an entire DAG job over another.

Remarks:

2011-Mar-22 10:48:55 by matt:
Initial comments -

  1. Need patch against V7_6-branch (or master, where c++_utils is now just utils)
  2. Avoid whitespace changes (in qmgmt.cpp)
  3. The sort comparator is in two places (!)
  4. Must avoid creating attributes that are required on a job ad, e.g. put a default for PreJobPrio at its evaluation point (get_job_prio instead of CreateJobAd)


2011-Mar-29 07:30:54 by jrt:
Note on use.

The new prios should be positive integers. -1 is used for GetAttributeInt and LookupInteger to indicate the prios are not defined. If a user submits a negative prio, the code will consider it not to be defined. If job A has [Pre,Post]JobPrio[1,2] defined and job B does not, the code will skip the [Pre,Post]JobPrio[1,2] comparison.


2011-Mar-29 11:33:56 by matt:
dagman7.6-v2.patch -

  1. clusterSortByPrioAndDate() compares against INT_MIN, but should compare against -1
  2. Comment in get_job_prio should change -1 to INT_MIN

Can you add the test to src/condor_tests?


2011-Mar-29 13:58:37 by matt:
Here's a different testing strategy.

# Using a personal condor configuration with NUM_CPUS=1
$ condor_master
$ condor_off -negotiator
$ ./submit.sh
...spam...
$ condor_on -negotiator
...wait about 5 minutes...
$ condor_history -format "%d " JobStartDate -format "%d " PreJobPrio1 -format "%d " PreJobPrio2 -format "%d " JobPrio -format "%d " PostJobPrio1 -format "%d " PostJobPrio2 -format "%s\n" GlobalJobId | sort -n -k1

submit.sh:

#!/bin/sh

for pre1 in $(seq -1 1 1); do
   for pre2 in $(seq 1 -1 -1); do
      for prio in $(seq -1 1 1); do
         for post1 in $(seq 1 -1 -1); do
            for post2 in $(seq -1 1 1); do
               condor_submit -a pre1=$pre1 \
                             -a pre2=$pre2 \
                             -a prio=$prio \
                             -a post1=$post1 \
                             -a post2=$post2 \
                  job.sub
            done
         done
      done
   done
done

job.sub:

cmd = /bin/sleep
args = 1

log = job.log

+PreJobPrio1 = $(pre1)
+PreJobPrio2 = $(pre2)
priority = $(prio)
+PostJobPrio1 = $(post1)
+PostJobPrio2 = $(post2)

queue

Successful output will look like:

1301424089 1 1 1 1 1 matt@eeyore.local#183.0#1301424079
1301424091 1 1 1 1 0 matt@eeyore.local#182.0#1301424079
1301424093 1 1 1 1 -1 matt@eeyore.local#181.0#1301424079
1301424095 1 1 1 0 1 matt@eeyore.local#186.0#1301424079
1301424098 1 1 1 0 0 matt@eeyore.local#185.0#1301424079
1301424100 1 1 1 0 -1 matt@eeyore.local#184.0#1301424079
1301424102 1 1 1 -1 1 matt@eeyore.local#189.0#1301424079
1301424105 1 1 1 -1 0 matt@eeyore.local#188.0#1301424079
1301424107 1 1 1 -1 -1 matt@eeyore.local#187.0#1301424079
1301424109 1 1 0 1 1 matt@eeyore.local#174.0#1301424079
1301424112 1 1 0 1 0 matt@eeyore.local#173.0#1301424079
1301424114 1 1 0 1 -1 matt@eeyore.local#172.0#1301424079
...
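
The expected ordering can also be generated independently and compared with the condor_history output (minus the JobStartDate and GlobalJobId columns). This is a minimal Python sketch, not part of the patch. It assumes all 243 jobs from submit.sh ran one at a time and that -1, 0 and 1 are all treated as defined values. The names VALUES and expected_lines are illustrative only.

```python
from itertools import product

# Values submit.sh assigns to each attribute, highest priority first.
VALUES = (1, 0, -1)

# Column order matches the condor_history command above:
# PreJobPrio1 PreJobPrio2 JobPrio PostJobPrio1 PostJobPrio2.
# product() varies the last column fastest, which yields descending
# lexicographic order when VALUES itself is descending.
expected_lines = [" ".join(str(v) for v in combo)
                  for combo in product(VALUES, repeat=5)]

if __name__ == "__main__":
    for line in expected_lines:
        print(line)
```

The first lines printed should correspond to the priority columns of the sample output above.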


2011-Mar-29 14:15:33 by matt:
A successful run's md5 is...

$ condor_history -format "%d\t" JobStartDate -format "%d\t" PreJobPrio1 -format "%d\t" PreJobPrio2 -format "%d\t" JobPrio -format "%d\t" PostJobPrio1 -format "%d\n" PostJobPrio2 | sort -n -k1 | cut -f1 --complement | md5sum -
2365d29aa83308493a0387e6038e9cd5  -


2011-Mar-29 15:03:55 by matt:
Final notes on use -

New attributes: PreJobPrio1, PreJobPrio2, PostJobPrio1, PostJobPrio2

These new attributes are only accessible via explicit addition to a job ad, e.g. +PreJobPrio1 in a submit file.

The new Prios can range from INT_MIN+1 to INT_MAX. Higher priorities run before lower priorities, e.g. 1 then 0 then -1. If job A has {Pre,Post}JobPrio{1,2} defined and job B does not, the code skips the comparison.

Order of comparison: PreJobPrio1, PreJobPrio2, JobPrio, PostJobPrio1, PostJobPrio2
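
The comparison rules above can be sketched as follows. This is an illustrative Python model, not the C++ comparator from the patch. It assumes INT_MIN is the "not defined" sentinel (suggested by the INT_MIN+1 lower bound), that JobPrio defaults to 0, and that ties fall back to QDate with older jobs first. The names compare_jobs, ORDER and the dictionary-based job ads are hypothetical.

```python
from functools import cmp_to_key

INT_MIN = -2**31  # assumed sentinel meaning "attribute not set on the job ad"

ORDER = ("PreJobPrio1", "PreJobPrio2", "JobPrio", "PostJobPrio1", "PostJobPrio2")

def compare_jobs(a, b):
    """Return -1 if job a should run before job b, 1 if after, 0 if tied."""
    for attr in ORDER:
        # The default is applied at lookup time, so the attribute is
        # never required to exist on the job ad.
        default = 0 if attr == "JobPrio" else INT_MIN
        pa = a.get(attr, default)
        pb = b.get(attr, default)
        if pa == INT_MIN or pb == INT_MIN:
            continue  # one side is undefined: skip this comparison
        if pa != pb:
            return -1 if pa > pb else 1  # higher priority runs first
    # Assumption: remaining ties are broken by submission time, oldest first.
    qa, qb = a.get("QDate", 0), b.get("QDate", 0)
    return (qa > qb) - (qa < qb)

jobs = [
    {"Id": "A", "PreJobPrio1": 0, "JobPrio": 5, "QDate": 100},
    {"Id": "B", "PreJobPrio1": 1, "JobPrio": -1, "QDate": 200},
    {"Id": "C", "PreJobPrio1": 1, "JobPrio": 0, "QDate": 300},
]
run_order = [j["Id"] for j in sorted(jobs, key=cmp_to_key(compare_jobs))]
```

Here B and C outrank A on PreJobPrio1 regardless of A's higher JobPrio, and C precedes B because PreJobPrio2 is undefined on both and JobPrio decides.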


2011-Mar-31 15:54:08 by psilord:
Pardon me if I am brash, but that's a crappy interface.

Why not make JobPrio an array of integers and then stable sort them in a particular order on the columns going left to right, filling in anything you don't know about with zero when there is a length mismatch?

job1:
JobPrio = 1, 0

job2:
JobPrio = 2

job3:
JobPrio = 1, 3

Now, the jobs sort left to right in descending order (with canonicalized job prio arrays):

job2   [2 0]
job3   [1 3]
job1   [1 0]

This is sufficient to produce any fine grained ordering you want with macro level behavior depending on how many columns you put into the ordering.

Likely, we use a crappy strtol on the JobPrio attribute for versions of Condor which don't have this feature. This means that if you downgrade a pool with array-based JobPrios, it'll convert the first number, up to the comma, into the JobPrio, and things function the way they are supposed to.
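
The array proposal above can be modeled in a few lines. This is a hedged Python sketch of the idea only; nothing like it exists in the patch, and parse_prio, canonicalize and legacy_prio are hypothetical names. legacy_prio mimics the strtol-style fallback by reading only the first number before the comma.

```python
def parse_prio(text):
    """Parse a comma-separated JobPrio string such as "1, 0" into a list."""
    return [int(part) for part in text.split(",")]

def canonicalize(prios, width):
    """Pad a priority array with zeros so all arrays have the same length."""
    return prios + [0] * (width - len(prios))

def legacy_prio(text):
    """What a version without array support would see: the first number only."""
    return int(text.split(",")[0])

submitted = {"job1": "1, 0", "job2": "2", "job3": "1, 3"}

parsed = {name: parse_prio(text) for name, text in submitted.items()}
width = max(len(p) for p in parsed.values())
padded = {name: canonicalize(p, width) for name, p in parsed.items()}

# Lists compare column by column, left to right. sorted() is stable,
# including with reverse=True, so equal arrays keep their original order.
ordered = sorted(padded, key=lambda name: padded[name], reverse=True)
```

With the three example jobs, ordered comes out as job2, job3, job1, matching the listing above.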

Properties:

Type: enhance                                 Last Change: 2011-Mar-31 16:29
Status: resolved                              Created: 2011-Mar-22 09:39
Fixed Version: v070700                        Broken Version: v070506
Priority:                                     Subsystem: Daemons
Assigned To: jrt                              Derived From: #804
Creator: jrt                                  Rust:
Customer Group: other                         Visibility: public
Notify: matt@cs.wisc.edu jthomas@redhat.com   Due Date:

Related Check-ins:

2011-Mar-29 15:19   Check-in [21189]: Added {Pre,Post}JobPrio{1,2} job ad attributes, #1992 ===GT=== #1992 ===VersionHistory=== New attributes: PreJobPrio1, PreJobPrio2, PostJobPrio1, PostJobPrio2 [...] (By Jon Thomas )

Attachments: