Ticket #2536: make upper bound on time spent in Preempting/Vacate explicit
To support draining partitionable slots (#2330), the upper bound on time spent in the Preempting/Vacating state should be made explicit. Currently, the boolean KILL expression is used to terminate the Preempting/Vacating state. This makes it awkward to compute how long it could take to drain a slot, because the upper bound on time, if any, is embedded in the boolean expression. Also, there is no standard way for the job to express its requirements or preferences.
The following is proposed:
MachineMaxVacateTime: this is the maximum time in seconds that the machine will wait before escalating from a graceful shutdown of the job to a fast shutdown. This is evaluated when the job starts. This configuration setting is required. It defaults to 10 minutes, like the default for KILL used to be.
- In addition, the timing of the beginning of eviction will be changed. Previously, eviction was initiated at the end of the promised MaxJobRetirementTime. Now, if there is sufficient retirement time left, the goal is to begin evicting the job before the end of retirement so that if the job uses the full eviction time, it will finish by the end of retirement. If there is not sufficient retirement time left, eviction will be initiated without delay.
JobMaxVacateTime: this is an optional expression supplied by the job. If it yields a number smaller than MachineMaxVacateTime, the job's value is used in place of the machine's. Alternatively, if there is sufficient retirement time to accommodate it, JobMaxVacateTime may increase the effective vacate time so that eviction is initiated earlier during retirement than it would otherwise have been.
- WANT_VACATE will still work as it does today. This means that even if MachineMaxVacateTime is non-zero, if WANT_VACATE evaluates to false when the job is being evicted, the graceful shutdown stage will be skipped.
- KILL will still work as it does today, except that it cannot be used to grant more time than offered by
MachineMaxVacateTime. Normally, KILL should just be set to
MachineMaxVacateTime should be used to control how long to wait. However, in unusual cases, KILL could be used to abort the graceful shutdown of a job. The new default will be
- In addition, a change to the job attributes RmKillSig and KillSig is proposed. Currently, on Unix for vanilla jobs, the fast shutdown mode of the starter sends RmKillSig if it is defined, otherwise KillSig, and if neither is defined, SIGKILL (see #1198). This means it is possible for a job to ignore the fast shutdown signal, so the starter has some logic to escalate to SIGKILL if a custom signal was defined. One issue with this implementation is that the starter doesn't really know that the job is being removed; it only knows that fast shutdown is ensuing, so the semantics of KillSig are a bit muddy. Another issue is that the signal escalation seems to duplicate functionality of graceful shutdown mode, so there is added complexity that may not really be necessary. A third issue is that the feature is tied to Unix and to vanilla jobs. What if one wants graceful removal of Windows jobs? We are still gathering requirements to find out what exactly is needed. Based on the feedback we get, we plan to select one of the following options. [We have not received any indication that the second option is needed, so we have implemented just the first option. The second option can be added later if needed.]
- [DONE] Make fast shutdown of the job always use a hard kill. When requested by the user, have the shadow initiate a graceful shutdown of the job rather than a fast one. For condor_rm, the user could request this behavior by putting WantGracefulRemoval=True in the job ad. This is platform and job universe independent. This is the preferred option unless users need the greater flexibility provided by the second option.
- [NOT DONE] Everything as in the first option but additionally, when the shadow requests graceful shutdown of the job, it specifies a reason. The reason is used to select a custom kill signal, which may be specific to the cause of the shutdown: job removal, hold, condor_vacate, condor_vacate_job, PREEMPT, and so on.
Allow old signal escalation semantics and to use remove_kill_sig
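The interaction between MachineMaxVacateTime, JobMaxVacateTime, and the remaining retirement time described above can be sketched in a few lines of Python. This is only an illustration of the proposal as worded here, not the startd's actual code. The function names and the `retirement_left` parameter are hypothetical. The sketch also assumes that a JobMaxVacateTime larger than the machine's value is honored only when it fits entirely within the remaining retirement time, and that the machine's value applies otherwise.

```python
from typing import Optional


def effective_vacate_time(machine_max: int,
                          job_max: Optional[int],
                          retirement_left: int) -> int:
    """Seconds allowed for graceful shutdown before fast shutdown (illustrative)."""
    if job_max is None:
        # The job supplied no JobMaxVacateTime: use the machine's limit.
        return machine_max
    if job_max <= machine_max:
        # The job asks for no more than the machine offers: use the job's value.
        return job_max
    if job_max <= retirement_left:
        # Assumption: a larger request is honored only if the remaining
        # retirement time can accommodate it in full.
        return job_max
    return machine_max


def eviction_delay(retirement_left: int, vacate_time: int) -> int:
    """Seconds to wait before starting eviction (illustrative).

    Eviction starts early enough that a job using its full vacate time
    finishes by the end of retirement; with too little retirement time
    left, eviction starts without delay.
    """
    return max(0, retirement_left - vacate_time)
```

For example, with the 10 minute machine default (600 seconds), a job asking for 1800 seconds and one hour of retirement left, the sketch gives a vacate time of 1800 seconds and starts eviction 1800 seconds from now. With only 300 seconds of retirement left and no job preference, eviction starts immediately and the vacate time is 600 seconds.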