Ticket #1786: review treatment of undefined policy settings in startd

Miron suggested that we review the treatment of undefined policy expressions, starting with the startd. I have summarized below the current treatment as of 7.5.4.

Meaning of the columns in the following table:

Meaning of the values in the following table:

Variable              Unconfigured  Undefined     Error
START                 Error         False (S)     False (S)
PREEMPT               Error         Error         Error
SUSPEND               Error         Error         Error
CONTINUE              Error         Error         Error
WANT_HOLD             False (S)     False         Error
WANT_SUSPEND          Error         False (S)     False (S)
WANT_VACATE           Error         Error         Error
KILL                  Error         Error         Error
PERIODIC_CHECKPOINT   False (S)     False (S)     False (S)
PREEMPT_VANILLA       Ignore (S)    Ignore (S)    Ignore (S)
SUSPEND_VANILLA       Ignore (S)    Ignore (S)    Ignore (S)
CONTINUE_VANILLA      Ignore (S)    Ignore (S)    Ignore (S)
WANT_SUSPEND_VANILLA  Ignore (S)    Ignore (S)    Ignore (S)
WANT_VACATE_VANILLA   Ignore (S)    Ignore (S)    Ignore (S)
KILL_VANILLA          Ignore (S)    Ignore (S)    Ignore (S)

Prior to 7.4.2, WANT_SUSPEND evaluating to undefined was treated as an error. This was changed to treat undefined as false because admins were repeatedly surprised to have their startd exit when a job showed up lacking some attribute that the WANT_SUSPEND expression referenced. In discussion, it was concluded that in every known case of this problem the desired behavior was for WANT_SUSPEND to be treated as false, so the code was changed to implement this behavior (#1001).

Proposal

Treat 'unconfigured' and 'undefined' and 'error' as semantically equivalent. For all the cases mentioned in the table above where 'undefined' leads to an error, it should instead be equivalent to 'false'. We should, however, provide useful diagnostics. For example, it should not log that SUSPEND is false when it is actually 'undefined' or 'error'.
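
To make the proposal concrete, here is a minimal sketch in Python. It is not startd code. It assumes a policy expression can be modelled as a callable that takes a job ad (a dict), that a missing attribute surfaces as KeyError, and that any other failure or non-boolean result counts as 'error'. The names Outcome, evaluate and policy_is_true are hypothetical, chosen for illustration only.

```python
import logging
from enum import Enum

log = logging.getLogger("startd_policy_sketch")  # hypothetical logger name


class Outcome(Enum):
    UNCONFIGURED = "unconfigured"
    UNDEFINED = "undefined"
    ERROR = "error"
    TRUE = "true"
    FALSE = "false"


def evaluate(config, name, job_ad):
    """Classify the result of a policy expression.

    Assumption: expressions are plain callables over the job ad; this
    stands in for real ClassAd evaluation, which is not modelled here.
    """
    expr = config.get(name)
    if expr is None:
        return Outcome.UNCONFIGURED
    try:
        value = expr(job_ad)
    except KeyError:
        # The expression referenced an attribute the job ad lacks.
        return Outcome.UNDEFINED
    except Exception:
        return Outcome.ERROR
    if isinstance(value, bool):
        return Outcome.TRUE if value else Outcome.FALSE
    # A non-boolean result is classified as an error in this sketch.
    return Outcome.ERROR


def policy_is_true(config, name, job_ad):
    """Treat unconfigured, undefined and error as false, but say which it was."""
    outcome = evaluate(config, name, job_ad)
    if outcome in (Outcome.UNCONFIGURED, Outcome.UNDEFINED, Outcome.ERROR):
        log.warning("%s is %s; treating it as false", name, outcome.value)
        return False
    return outcome is Outcome.TRUE
```

With config = {"SUSPEND": lambda ad: ad["LoadAvg"] > 0.5}, calling policy_is_true(config, "SUSPEND", {}) returns False and logs that SUSPEND is undefined rather than reporting it as false, which is the diagnostic distinction the proposal asks for.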


Remarks:

2010-Nov-30 23:41:49 by matt:
The proposal sounds good, but what about making CONTINUE default to True? Also, please make a policy object to create a central place where code managing these parameters can be documented and handled consistently.


2010-Dec-01 17:08:01 by danb:
I notice that the <x>_VANILLA knobs are barely documented. I propose that we phase them out (warning in 7.5, remove in 7.7). We can ask the users first to see if anybody cares. I would be surprised if anybody does. In my opinion, they just make things more complicated, and it's not clear why knobs exist for just the vanilla universe. I see half-implemented support for similar VM universe knobs.


2011-Jan-27 14:50:47 by danb:
Bulk change of target version from v070505 to v070506 using ./ticket-target-mover.


2011-Feb-01 16:20:02 by tannenba:
Bulk change of target version from v070506 to NULL using ./ticket-target-mover.

Properties:

Type: enhance           Last Change: 2011-Feb-01 16:21
Status: new          Created: 2010-Nov-29 18:20
Fixed Version:            Broken Version: v070504 
Priority:          Subsystem: Daemons 
Assigned To:            Derived From:  
Creator: danb  Rust:  
Customer Group: other  Visibility: public 
Notify: dan@hep.wisc.edu, miron@cs.wisc.edu, tannenba@cs.wisc.edu, matt@cs.wisc.edu  Due Date: