Originally written by Alan De Smet. Expanded by Jaime Frey.
Condor
Universe
JobUniverse in job ClassAds
Value | Name | Description |
0 | Min | A placeholder, not a universe |
1 | Standard | Single process relinked jobs |
2 | Pipe | A placeholder, no longer used |
3 | Linda | A placeholder, no longer used |
4 | PVM | Parallel Virtual Machine apps |
5 | Vanilla | Single process non-relinked jobs |
6 | PVMD | PVM daemon process |
7 | Scheduler | A job run under the schedd |
8 | MPI | Message Passing Interface jobs |
9 | Grid / Globus | Jobs managed by condor_gridmanager (V6.6: always Globus, V6.7: grid_type=gt2, gt3, gt4, condor, oracle, nordugrid...) |
10 | Java | Jobs for the Java Virtual Machine |
11 | Parallel | Generalized parallel jobs |
12 | Local | A job run under the schedd using a starter (advanced form of Scheduler) |
13 | Max | A placeholder, not a universe |
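Because Min and Max are placeholders rather than universes, they can be read as sentinel bounds around the usable values. The following is a minimal sketch of a lookup built on that reading; the names `UNIVERSE_NAMES` and `universe_name` are hypothetical helpers for illustration, not part of Condor.

```python
# Values copied from the JobUniverse table above.
# UNIVERSE_NAMES and universe_name are hypothetical names used for illustration.
UNIVERSE_NAMES = {
    1: "Standard", 2: "Pipe", 3: "Linda", 4: "PVM",
    5: "Vanilla", 6: "PVMD", 7: "Scheduler", 8: "MPI",
    9: "Grid", 10: "Java", 11: "Parallel", 12: "Local",
}
UNIVERSE_MIN = 0    # placeholder, not a universe
UNIVERSE_MAX = 13   # placeholder, not a universe


def universe_name(value):
    """Return the universe name, or None for a placeholder or unknown value."""
    if not (UNIVERSE_MIN < value < UNIVERSE_MAX):
        return None
    return UNIVERSE_NAMES.get(value)


print(universe_name(5))    # Vanilla
print(universe_name(13))   # None
```

Note that Pipe and Linda are still returned by name here even though the table marks them as no longer used; a caller that cares would need to filter them separately.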
Job Status
JobStatus in job ClassAds
Value | Name | Letter |
0 | Unexpanded | U |
1 | Idle | I |
2 | Running | R |
3 | Removed | X |
4 | Completed | C |
5 | Held | H |
6 | Submission_err | E |
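As an illustration of how the status letters might be used, the sketch below tallies a batch of jobs by letter. It assumes each job ClassAd has already been loaded into a plain Python dict with a `JobStatus` key; that representation, the `"?"` fallback, and the function name `count_by_status` are assumptions made for the example.

```python
from collections import Counter

# Letters copied from the JobStatus table above.
STATUS_LETTERS = {0: "U", 1: "I", 2: "R", 3: "X", 4: "C", 5: "H", 6: "E"}


def count_by_status(job_ads):
    """Tally jobs per status letter; missing or unknown values count as '?'."""
    return Counter(
        STATUS_LETTERS.get(ad.get("JobStatus"), "?") for ad in job_ads
    )


# Hypothetical job ads, reduced to the one attribute that matters here.
ads = [{"JobStatus": 2}, {"JobStatus": 2}, {"JobStatus": 5}, {"JobStatus": 1}]
counts = count_by_status(ads)
print(counts["R"], counts["H"], counts["I"])
```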
Notification
JobNotification in job ClassAds
Value | Name |
0 | Never |
1 | Always |
2 | Complete |
3 | Error |
Shadow exit status
(Source: h/exit.h)
Value | Name | Description |
4 | JOB_EXCEPTION | The job exited with an exception |
44 | DPRINTF_ERROR | There is a fatal error with dprintf() |
100 | JOB_EXITED | The job exited (not killed) |
101 | JOB_CKPTED | The job was checkpointed |
102 | JOB_KILLED | The job was killed |
103 | JOB_COREDUMPED | The job was killed and a core file produced |
105 | JOB_NO_MEM | Not enough memory to start the shadow |
106 | JOB_SHADOW_USAGE | Incorrect arguments to condor_shadow |
107 | JOB_NOT_CKPTED | The job was kicked off without a checkpoint |
107 | JOB_SHOULD_REQUEUE | (!) Deliberately the same value as JOB_NOT_CKPTED, because the same behavior is wanted: the job is put back in the job queue and run again. The name JOB_NOT_CKPTED means little for a job outside the standard universe. |
108 | JOB_NOT_STARTED | Can't connect to startd or request refused |
109 | JOB_BAD_STATUS | Job status != RUNNING on startup |
110 | JOB_EXEC_FAILED | Exec failed for some reason other than ENOMEM |
111 | JOB_NO_CKPT_FILE | There is no checkpoint file (lost) |
112 | JOB_SHOULD_HOLD | The job should be put on hold |
113 | JOB_SHOULD_REMOVE | The job should be removed |
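The shared value 107 is easy to trip over when mapping a number back to a name. A minimal sketch using Python's `enum.IntEnum` shows the effect: the second name becomes an alias of the first, so a reverse lookup always yields JOB_NOT_CKPTED. The class name `ShadowExit` and the helper `should_requeue` are hypothetical; the values are copied from the table above.

```python
from enum import IntEnum


class ShadowExit(IntEnum):
    JOB_EXCEPTION = 4
    DPRINTF_ERROR = 44
    JOB_EXITED = 100
    JOB_CKPTED = 101
    JOB_KILLED = 102
    JOB_COREDUMPED = 103
    JOB_NO_MEM = 105
    JOB_SHADOW_USAGE = 106
    JOB_NOT_CKPTED = 107
    JOB_SHOULD_REQUEUE = 107   # same value: Python treats this as an alias
    JOB_NOT_STARTED = 108
    JOB_BAD_STATUS = 109
    JOB_EXEC_FAILED = 110
    JOB_NO_CKPT_FILE = 111
    JOB_SHOULD_HOLD = 112
    JOB_SHOULD_REMOVE = 113


def should_requeue(status):
    """True when the exit status asks for the job to be queued again."""
    return status == ShadowExit.JOB_SHOULD_REQUEUE


print(ShadowExit(107).name)   # JOB_NOT_CKPTED
print(should_requeue(107))    # True
```

Code that prints exit status names should therefore expect JOB_NOT_CKPTED even for jobs that are not in the standard universe.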
Job Hold Reason Codes
(Source: src/condor_c++_util/condor_holdcodes.h)
Value | Reason | SubCode Contents |
0 | Unspecified | N/A |
1 | UserRequest | N/A |
2 | GlobusGramError | GRAM error code |
3 | JobPolicy | N/A |
4 | CorruptedCredential | N/A |
5 | JobPolicyUndefined | N/A |
6 | FailedToCreateProcess | Unix errno |
7 | UnableToOpenOutput | Unix errno |
8 | UnableToOpenInput | Unix errno |
9 | UnableToOpenOutputStream | Unix errno |
10 | UnableToOpenInputStream | Unix errno |
11 | InvalidTransferAck | N/A |
12 | DownloadFileError | Unix errno |
13 | UploadFileError | Unix errno |
14 | IwdError | Unix errno |
15 | SubmittedOnHold | N/A |
16 | SpoolingInput | N/A |
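Since several hold reasons carry a Unix errno as their subcode, a reader can turn the pair into a readable message. The sketch below does this with `os.strerror`; the function name `describe_hold`, the `"Unknown"` fallback, and the message format are assumptions for illustration only.

```python
import errno
import os

# Reasons copied from the hold reason table above.
HOLD_REASONS = {
    0: "Unspecified", 1: "UserRequest", 2: "GlobusGramError",
    3: "JobPolicy", 4: "CorruptedCredential", 5: "JobPolicyUndefined",
    6: "FailedToCreateProcess", 7: "UnableToOpenOutput",
    8: "UnableToOpenInput", 9: "UnableToOpenOutputStream",
    10: "UnableToOpenInputStream", 11: "InvalidTransferAck",
    12: "DownloadFileError", 13: "UploadFileError", 14: "IwdError",
    15: "SubmittedOnHold", 16: "SpoolingInput",
}
# Reasons whose subcode the table lists as a Unix errno.
ERRNO_SUBCODE = {6, 7, 8, 9, 10, 12, 13, 14}


def describe_hold(code, subcode=0):
    """Render a hold reason, interpreting the subcode where the table defines it."""
    name = HOLD_REASONS.get(code, "Unknown")
    if code in ERRNO_SUBCODE:
        return f"{name}: {os.strerror(subcode)} (errno {subcode})"
    if code == 2:
        return f"{name}: GRAM error {subcode}"
    return name   # subcode is N/A for the remaining reasons


print(describe_hold(7, errno.ENOENT))
print(describe_hold(1))
```

The errno text comes from the local platform, so the wording of the first message can differ between systems.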
Log event codes
Event | Number |
Submit | 0 |
Execute | 1 |
Executable error | 2 |
Checkpointed | 3 |
Job evicted | 4 |
Job terminated | 5 |
Image size | 6 |
Shadow exception | 7 |
Generic | 8 |
Job aborted | 9 |
Job suspended | 10 |
Job unsuspended | 11 |
Job held | 12 |
Job released | 13 |
Node execute | 14 |
Node terminated | 15 |
Post script terminated | 16 |
Globus submit | 17 |
Globus submit failed | 18 |
Globus resource up | 19 |
Globus resource down | 20 |
Remote error | 21 |
Job disconnected | 22 |
Job reconnected | 23 |
Job reconnect failed | 24 |
Grid resource up | 25 |
Grid resource down | 26 |
Grid submit | 27 |
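The event numbers can be mapped back to names when scanning a job log. The sketch below assumes, and this is an assumption not stated on this page, that each event begins with a header line of the form `NNN (cluster.proc.subproc) ...`, where `NNN` is the zero-padded event number. The regular expression, the function name `parse_event_header`, and the sample line are all made up for illustration.

```python
import re

# Names and numbers copied from the log event table above.
LOG_EVENTS = {
    0: "Submit", 1: "Execute", 2: "Executable error", 3: "Checkpointed",
    4: "Job evicted", 5: "Job terminated", 6: "Image size",
    7: "Shadow exception", 8: "Generic", 9: "Job aborted",
    10: "Job suspended", 11: "Job unsuspended", 12: "Job held",
    13: "Job released", 14: "Node execute", 15: "Node terminated",
    16: "Post script terminated", 17: "Globus submit",
    18: "Globus submit failed", 19: "Globus resource up",
    20: "Globus resource down", 21: "Remote error",
    22: "Job disconnected", 23: "Job reconnected",
    24: "Job reconnect failed", 25: "Grid resource up",
    26: "Grid resource down", 27: "Grid submit",
}

# Assumed header shape: "NNN (cluster.proc.subproc) rest of line".
HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")


def parse_event_header(line):
    """Return (event name, cluster, proc), or None if the line is not a header."""
    match = HEADER.match(line)
    if match is None:
        return None
    number = int(match.group(1))
    name = LOG_EVENTS.get(number, "Unknown")
    return name, int(match.group(2)), int(match.group(3))


# Made-up sample line in the assumed format.
print(parse_event_header("005 (123.000.000) sample text"))
```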
Starters and Shadows
Shadow | Starter | Universe |
jim | jim | PVM |
V6 | V5 | Standard |
V6.1 | V6.1 | Everything else |
Queue-Management Commands
Number | Action |
10001 | InitializeConnection |
10002 | NewCluster |
10003 | NewProc |
10004 | DestroyCluster |
10005 | DestroyProc |
10006 | SetAttribute |
10007 | CloseConnection |
10008 | GetAttributeFloat |
10009 | GetAttributeInt |
10010 | GetAttributeString |
10011 | GetAttributeExpr |
10012 | DeleteAttribute |
10013 | GetNextJob |
10014 | FirstAttribute |
10015 | NextAttribute |
10016 | DestroyClusterByConstraint |
10017 | SendSpoolFile |
10018 | GetJobAd |
10019 | GetJobByConstraint |
10020 | GetNextJobByConstraint |
10021 | SetAttributeByConstraint |
10022 | InitializeReadOnlyConnection |
10023 | BeginTransaction |
10024 | AbortTransaction |
10025 | SetTimerAttribute |
ClassAd Log Commands
Number | Action |
101 | NewClassAd |
102 | DestroyClassAd |
103 | SetAttribute |
104 | DeleteAttribute |
105 | BeginTransaction |
106 | EndTransaction |
DaemonCore Commands and Signals
(Source: condor_includes/condor_commands.h)
Name | Number |
DC_RAISESIGNAL | 60000 |
DC_PROCESSEXIT | 60001 |
DC_CONFIG_PERSIST | 60002 |
DC_CONFIG_RUNTIME | 60003 |
DC_RECONFIG | 60004 |
DC_OFF_GRACEFUL | 60005 |
DC_OFF_FAST | 60006 |
DC_CONFIG_VAL | 60007 |
DC_CHILDALIVE | 60008 |
DC_SERVICEWAITPIDS | 60009 |
DC_AUTHENTICATE | 60010 |
DC_NOP | 60011 |
DC_RECONFIG_FULL | 60012 |
DC_FETCH_LOG | 60013 |
DC_INVALIDATE_KEY | 60014 |
DC_OFF_PEACEFUL | 60015 |
DC_SET_PEACEFUL_SHUTDOWN | 60016 |
DC_TIME_OFFSET | 60017 |
CEDAR Authentication Methods
(Source: condor_includes/condor_auth.h and condor_io/condor_secman.C)
Name | Number |
CLAIMTOBE | 2 |
FS | 4 |
FS_REMOTE | 8 |
NTSSPI | 16 |
GSI | 32 |
KERBEROS | 64 |
ANONYMOUS | 128 |
SSL | 256 |
PASSWORD | 512 |
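Every value in this table is a distinct power of two, which suggests the methods are meant to be combined into a single bitmask. Assuming that is how they are used, a set of methods can be packed and unpacked as sketched below; the names `encode_auth_mask` and `decode_auth_mask` are hypothetical.

```python
# Values copied from the authentication methods table above.
AUTH_METHODS = {
    "CLAIMTOBE": 2, "FS": 4, "FS_REMOTE": 8, "NTSSPI": 16, "GSI": 32,
    "KERBEROS": 64, "ANONYMOUS": 128, "SSL": 256, "PASSWORD": 512,
}


def encode_auth_mask(names):
    """OR together the bits for the given method names."""
    mask = 0
    for name in names:
        mask |= AUTH_METHODS[name]
    return mask


def decode_auth_mask(mask):
    """List the method names whose bit is set, in table order."""
    return [name for name, bit in AUTH_METHODS.items() if mask & bit]


mask = encode_auth_mask(["FS", "KERBEROS"])
print(mask)                      # 68
print(decode_auth_mask(mask))    # ['FS', 'KERBEROS']
```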
Globus
TODO
Misc
TODO