Ticket #2398: Setting hold=true when spooling a job's files causes it to fail

If you put 'hold=true' in your submit file and then submit your job with the -spool or -remote option to condor_submit, the schedd doesn't modify the file paths in the job ad to point at the spool directory. The schedd will then be unable to run the job.

The schedd looks at the HoldReasonCode of a newly-submitted job to determine whether to expect spooling of the job's files. It's expecting the value to be CONDOR_HOLD_CODE_SpoolingInput. But when 'hold=true' appears in the submit file, condor_submit sets HoldReasonCode to CONDOR_HOLD_CODE_SubmittedOnHold. The subsequent spooling of files releases the job, negating the effect of the hold attribute.
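The mismatch described above can be modeled in a few lines of Python. This is only an illustrative sketch, not the actual condor_submit or schedd code: the function names are hypothetical, and the numeric values are the hold reason codes listed in the HTCondor manual (15 for SubmittedOnHold, 16 for SpoolingInput).

```python
from enum import IntEnum
from typing import Optional


class HoldCode(IntEnum):
    # Values as documented for HoldReasonCode in the HTCondor manual.
    SUBMITTED_ON_HOLD = 15  # CONDOR_HOLD_CODE_SubmittedOnHold
    SPOOLING_INPUT = 16     # CONDOR_HOLD_CODE_SpoolingInput


def submit_hold_code(spool: bool, hold: bool) -> Optional[HoldCode]:
    """Hypothetical model of the HoldReasonCode condor_submit sets (pre-fix)."""
    if hold:
        # hold=true in the submit file wins, even when -spool/-remote is used.
        return HoldCode.SUBMITTED_ON_HOLD
    if spool:
        return HoldCode.SPOOLING_INPUT
    return None  # job is not submitted on hold


def schedd_expects_spooling(hold_reason_code: Optional[HoldCode]) -> bool:
    """Hypothetical model of the schedd's check on a newly submitted job."""
    return hold_reason_code == HoldCode.SPOOLING_INPUT


# Spooling alone: the schedd rewrites paths to the spool directory.
print(schedd_expects_spooling(submit_hold_code(spool=True, hold=False)))
# Spooling plus hold=true: the schedd does not expect spooling (the bug).
print(schedd_expects_spooling(submit_hold_code(spool=True, hold=True)))
```

In this model the second call returns False, which corresponds to the schedd leaving the job ad's file paths untouched.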


Remarks:

2011-Aug-17 15:03:16 by jfrey:
Fix committed to V7_6-branch for 7.6.3.


2011-Aug-17 15:17:36 by zmiller:
REVIEW DONE, CODE OKAY FOR STABLE SERIES

however, i would suggest making a child ticket to fix the actual problem in the development branch. i agree however that that's a larger change than we would want to make in the stable series.


2011-Aug-17 21:21:12 by matt:
Same as #1005?


2011-Aug-18 08:46:28 by jfrey:
Closely related to #1005. The central bug in this ticket is that the job can't run when hold=true and -spool are used together. The patch in this ticket addresses both problems by not allowing this combination of options. As Zach notes, we should consider a better fix that allows both options to be used together.

Properties:

Type:           defect        Last Change:     2011-Aug-23 12:25
Status:         resolved      Created:         2011-Aug-17 13:55
Fixed Version:  v070603       Broken Version:  v070602
Priority:                     Subsystem:       Unknown
Assigned To:    jfrey         Derived From:
Creator:        jfrey         Rust:
Customer Group: other         Visibility:      public
Notify:                       Due Date:

Derived Tickets:

#2399   Make hold=true work properly when spooling job files

Related Check-ins:

2011-Aug-17 14:43   Check-in [26827]: Disallow hold=true when spooling job files. #2398 hold=true conflicts with job-file spooling. Both want to set a HoldReason. If the wrong HoldReason is set, the schedd doesn't modify the paths in the job ad to refer to the spool area. Also, the job ends up in the idle state once spooling completes. Now, [...] (By Jaime Frey)
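
The approach taken in this check-in, rejecting the conflicting combination at submit time, might look roughly like the following. This is a sketch of the idea only: the function name and error message are hypothetical and are not taken from the actual patch.

```python
def check_submit_options(hold: bool, spooling: bool) -> None:
    """Hypothetical submit-time check: refuse hold=true with job-file spooling.

    `spooling` stands for submission with -spool or -remote.
    """
    if hold and spooling:
        # Both features want to set the job's hold reason, so reject the
        # combination instead of submitting a job that cannot run.
        raise ValueError(
            "hold=true cannot be combined with job-file spooling "
            "(-spool or -remote)"
        )


# Either option alone is accepted; only the combination is refused.
check_submit_options(hold=True, spooling=False)
check_submit_options(hold=False, spooling=True)
```

Failing early keeps the stable-series change small; making both options work together is left to the derived ticket #2399.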