HTCondorWiki: Job Ordering

 
 Once a job is in the ready queue, it will eventually be submitted by =Dag::SubmitReadyJobs()=, which is called from =condor_event_timer()= in _dagman_main.cpp_; =condor_event_timer()= is in turn called by DaemonCore (every 5 seconds by default).  Note that =Dag::SubmitReadyJobs()= submits at most a configurable number of jobs each time it is called.  If the attempt to submit a job fails, =Dag::SubmitReadyJobs()= calls =Dag::ProcessFailedSubmit()=, which puts the job back into the ready queue.
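The capped submit cycle described above can be sketched roughly as follows. This is an illustration only, not DAGMan's actual code: =MiniDag=, =max_submits_per_call=, and the =try_submit= callback are hypothetical names, and the sketch assumes that the cap counts submit attempts and that a failed job goes to the back of the ready queue.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <string>
#include <vector>

// Hypothetical stand-in for the DAG's ready queue; not DAGMan's real class.
struct MiniDag {
    std::deque<std::string> ready;         // nodes waiting to be submitted
    std::vector<std::string> submitted;    // nodes whose submit succeeded
    std::size_t max_submits_per_call = 5;  // assumed name for the configurable cap

    // Try to submit up to max_submits_per_call ready nodes.
    // Returns the number of successful submits.
    std::size_t SubmitReadyJobs(
            const std::function<bool(const std::string&)>& try_submit) {
        const std::size_t total_before = ready.size() + submitted.size();
        std::size_t attempts = 0;
        std::size_t succeeded = 0;
        std::vector<std::string> failed;

        while (!ready.empty() && attempts < max_submits_per_call) {
            std::string node = ready.front();
            ready.pop_front();
            ++attempts;
            if (try_submit(node)) {
                submitted.push_back(node);
                ++succeeded;
            } else {
                // Plays the role of Dag::ProcessFailedSubmit(): remember the
                // node so it can be put back into the ready queue.
                failed.push_back(node);
            }
        }

        // Requeue failures after the loop so one call never retries a node
        // twice (an assumption of this sketch).
        for (const std::string& node : failed) {
            ready.push_back(node);
        }

        // No node is lost: every node is either still ready or submitted.
        assert(ready.size() + submitted.size() == total_before);
        return succeeded;
    }
};
```

With a cap of 2 and a ready queue of A, B, C where the submit of A fails, one call leaves B submitted and the ready queue as C, A; the next timer tick picks up from there.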
 
+When a job finishes, DAGMan sees the job's terminated event in the appropriate log file, and calls =Dag::ProcessTerminatedEvent()=.  If the job failed, =Dag::ProcessTerminatedEvent()= calls =Job::TerminateFailure()=, which marks the job as failed.  =Dag::ProcessTerminatedEvent()= then calls =Dag::ProcessJobProcEnd()=, whether the job succeeded or failed.  =Dag::ProcessJobProcEnd()= takes a number of possible actions, such as initiating a retry for the node, starting the node's POST script, waiting for other job procs to finish if the cluster contains more than one proc, or marking the node as successful.
+
+TEMPTEMP -- talk about post script
+
+When a node finishes, we call =Dag::TerminateJob()= on it; that method goes through the list of this node's children and removes the just-finished node from the children's waiting queues.  For each child whose waiting queue becomes empty, it calls =Dag::StartNode()=, and the cycle continues.
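The waiting-queue bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the real implementation: =MiniGraph=, =MiniNode=, =AddEdge()=, and the =started= list are hypothetical, and =StartNode()= here only records the node name rather than doing any real work.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical node record; not DAGMan's real Job class.
struct MiniNode {
    std::vector<std::string> children;  // nodes that depend on this one
    std::set<std::string> waiting;      // parents that have not finished yet
};

struct MiniGraph {
    std::map<std::string, MiniNode> nodes;
    std::vector<std::string> started;   // order in which nodes were started

    // Record that child must wait for parent (duplicate edges are ignored).
    void AddEdge(const std::string& parent, const std::string& child) {
        if (nodes[child].waiting.insert(parent).second) {
            nodes[parent].children.push_back(child);
        }
    }

    // Stand-in for Dag::StartNode(): just remember that the node started.
    void StartNode(const std::string& name) { started.push_back(name); }

    // Stand-in for Dag::TerminateJob(): remove the finished node from each
    // child's waiting set and start every child that no longer waits.
    void TerminateJob(const std::string& finished) {
        auto it = nodes.find(finished);
        assert(it != nodes.end());
        for (const std::string& child : it->second.children) {
            std::set<std::string>& waiting = nodes[child].waiting;
            waiting.erase(finished);
            if (waiting.empty()) {
                StartNode(child);
            }
        }
    }
};
```

For a diamond-shaped DAG (A is the parent of B and C, which are both parents of D), finishing A starts B and C, finishing B starts nothing because D still waits on C, and finishing C then starts D.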
+
+
 
 
 =condor_event_timer()= in _dagman_main.cpp_ gets called every five seconds by default.  In that function, we call =Dag::SubmitReadyJobs()= to submit any jobs that are ready; read any new node job events (see ???); output the status of the DAG; and check whether the DAG is finished.