This is a collection of various magic numbers in Condor- and Grid-related software. These are mostly extracted from the source files for the software in question.
Originally written by Alan De Smet. Expanded by Jaime Frey.
Condor
Universe
JobUniverse in job ClassAds
0 | Min | A placeholder, not a universe
1 | Standard | Single process relinked jobs
2 | Pipe | A placeholder, no longer used
3 | Linda | A placeholder, no longer used
4 | PVM | Parallel Virtual Machine apps
5 | Vanilla | Single process non-relinked jobs
6 | PVMD | PVM daemon process
7 | Scheduler | A job run under the schedd
8 | MPI | Message Passing Interface jobs
9 | Grid / Globus | Jobs managed by condor_gridmanager (V6.6: always Globus, V6.7: grid_type=gt2, gt3, gt4, condor, oracle, nordugrid...)
10 | Java | Jobs for the Java Virtual Machine
11 | Parallel | Generalized parallel jobs
12 | Local | A job run under the schedd using a starter (advanced form of Scheduler)
13 | Max | A placeholder, not a universe
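When reading JobUniverse out of a job ClassAd, a small lookup keeps the numbers readable. The following is a minimal Python sketch that assumes only the values listed for JobUniverse; the names `JobUniverse`, `universe_name`, and `in_universe_range` are illustrative and do not come from the Condor sources.

```python
from enum import IntEnum


class JobUniverse(IntEnum):
    """JobUniverse values, copied from the list in this section."""
    MIN = 0        # placeholder, not a universe
    STANDARD = 1
    PIPE = 2       # placeholder, no longer used
    LINDA = 3      # placeholder, no longer used
    PVM = 4
    VANILLA = 5
    PVMD = 6
    SCHEDULER = 7
    MPI = 8
    GRID = 9       # listed as "Grid / Globus"
    JAVA = 10
    PARALLEL = 11
    LOCAL = 12
    MAX = 13       # placeholder, not a universe


def in_universe_range(value: int) -> bool:
    """True if value lies strictly between the Min and Max placeholders."""
    return JobUniverse.MIN < value < JobUniverse.MAX


def universe_name(value: int) -> str:
    """Return the name for a JobUniverse number, or a fallback string."""
    try:
        return JobUniverse(value).name
    except ValueError:
        return f"UNKNOWN({value})"


print(universe_name(5))        # VANILLA
print(in_universe_range(13))   # False
```

Note that `in_universe_range` only checks the numeric bounds; it still accepts the retired Pipe and Linda placeholders.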
Job Status
JobStatus in job ClassAds
0 | Unexpanded | U
1 | Idle | I
2 | Running | R
3 | Removed | X
4 | Completed | C
5 | Held | H
6 | Submission_err | E
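As a rough illustration of how these values might be used, the sketch below tallies a list of JobStatus numbers by name. It assumes only the values listed for JobStatus; `JOB_STATUS` and `summarize` are hypothetical helper names, not part of Condor.

```python
from collections import Counter

# JobStatus value -> (name, one-letter code), copied from this section.
JOB_STATUS = {
    0: ("Unexpanded", "U"),
    1: ("Idle", "I"),
    2: ("Running", "R"),
    3: ("Removed", "X"),
    4: ("Completed", "C"),
    5: ("Held", "H"),
    6: ("Submission_err", "E"),
}


def summarize(status_values):
    """Count jobs per status name; unknown values are grouped under '?'."""
    names = (JOB_STATUS.get(v, ("?", "?"))[0] for v in status_values)
    return dict(Counter(names))


print(summarize([1, 1, 2, 5, 2, 2]))  # {'Idle': 2, 'Running': 3, 'Held': 1}
```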
Notification
JobNotification in job ClassAds
0 | Never
1 | Always
2 | Complete
3 | Error
Shadow exit status
(Source: h/exit.h)
Value | Name | Description
4 | JOB_EXCEPTION | The job exited with an exception
44 | DPRINTF_ERROR | There is a fatal error with dprintf()
100 | JOB_EXITED | The job exited (not killed)
101 | JOB_CKPTED | The job was checkpointed
102 | JOB_KILLED | The job was killed
103 | JOB_COREDUMPED | The job was killed and a core file produced
105 | JOB_NO_MEM | Not enough memory to start the shadow
106 | JOB_SHADOW_USAGE | Incorrect arguments to condor_shadow
107 | JOB_NOT_CKPTED | The job was kicked off without a checkpoint
107 | JOB_SHOULD_REQUEUE | (!) Deliberately defined to the same number as JOB_NOT_CKPTED, since the desired behavior is the same: the job is put back in the job queue and run again. The name JOB_NOT_CKPTED means little for a job that is not in the standard universe.
108 | JOB_NOT_STARTED | Can't connect to startd or request refused
109 | JOB_BAD_STATUS | Job status != RUNNING on startup
110 | JOB_EXEC_FAILED | Exec failed for some reason other than ENOMEM
111 | JOB_NO_CKPT_FILE | There is no checkpoint file (lost)
112 | JOB_SHOULD_HOLD | The job should be put on hold
113 | JOB_SHOULD_REMOVE | The job should be removed
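Because 107 carries two names, any lookup by number has to cope with the alias. The sketch below shows one way to model that in Python, assuming only the values listed for the shadow exit status; `ShadowExit` and `exit_names` are illustrative names rather than anything defined in h/exit.h.

```python
from enum import IntEnum


class ShadowExit(IntEnum):
    """Shadow exit codes, copied from this section."""
    JOB_EXCEPTION = 4
    DPRINTF_ERROR = 44
    JOB_EXITED = 100
    JOB_CKPTED = 101
    JOB_KILLED = 102
    JOB_COREDUMPED = 103
    JOB_NO_MEM = 105
    JOB_SHADOW_USAGE = 106
    JOB_NOT_CKPTED = 107
    JOB_SHOULD_REQUEUE = 107   # same value, so Python treats it as an alias
    JOB_NOT_STARTED = 108
    JOB_BAD_STATUS = 109
    JOB_EXEC_FAILED = 110
    JOB_NO_CKPT_FILE = 111
    JOB_SHOULD_HOLD = 112
    JOB_SHOULD_REMOVE = 113


def exit_names(code: int):
    """Return every name that shares the given exit code (aliases included)."""
    return [name for name, member in ShadowExit.__members__.items()
            if member.value == code]


print(exit_names(107))  # ['JOB_NOT_CKPTED', 'JOB_SHOULD_REQUEUE']
print(exit_names(104))  # [] (104 is not listed)
```

Looking a value up directly, as in `ShadowExit(107)`, returns only the first-defined name, which is why the helper walks `__members__` instead.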
Job Hold Reason Codes
(Source: src/condor_c++_util/condor_holdcodes.h)
Value | Reason | SubCode Contents
0 | Unspecified | N/A
1 | UserRequest | N/A
2 | GlobusGramError | GRAM error code
3 | JobPolicy | N/A
4 | CorruptedCredential | N/A
5 | JobPolicyUndefined | N/A
6 | FailedToCreateProcess | Unix errno
7 | UnableToOpenOutput | Unix errno
8 | UnableToOpenInput | Unix errno
9 | UnableToOpenOutputStream | Unix errno
10 | UnableToOpenInputStream | Unix errno
11 | InvalidTransferAck | N/A
12 | DownloadFileError | Unix errno
13 | UploadFileError | Unix errno
14 | IwdError | Unix errno
15 | SubmittedOnHold | N/A
16 | SpoolingInput | N/A
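The meaning of the subcode depends on the hold reason, so decoding takes two steps. The following Python sketch pairs each reason with the kind of subcode it carries, assuming only the values listed for the hold reason codes; `HOLD_REASONS` and `describe_hold` are hypothetical names. The errno text comes from the local C library via `os.strerror`, so it may not match the wording on the machine that reported the error.

```python
import os

# Hold reason code -> (reason, subcode kind), copied from this section.
# Subcode kind is "errno", "gram", or None when the subcode is N/A.
HOLD_REASONS = {
    0: ("Unspecified", None),
    1: ("UserRequest", None),
    2: ("GlobusGramError", "gram"),
    3: ("JobPolicy", None),
    4: ("CorruptedCredential", None),
    5: ("JobPolicyUndefined", None),
    6: ("FailedToCreateProcess", "errno"),
    7: ("UnableToOpenOutput", "errno"),
    8: ("UnableToOpenInput", "errno"),
    9: ("UnableToOpenOutputStream", "errno"),
    10: ("UnableToOpenInputStream", "errno"),
    11: ("InvalidTransferAck", None),
    12: ("DownloadFileError", "errno"),
    13: ("UploadFileError", "errno"),
    14: ("IwdError", "errno"),
    15: ("SubmittedOnHold", None),
    16: ("SpoolingInput", None),
}


def describe_hold(code: int, subcode: int = 0) -> str:
    """Render a hold reason code and its subcode as readable text."""
    name, kind = HOLD_REASONS.get(code, (f"Unknown({code})", None))
    if kind == "errno":
        return f"{name} (errno {subcode}: {os.strerror(subcode)})"
    if kind == "gram":
        return f"{name} (GRAM error code {subcode})"
    return name


print(describe_hold(1))      # UserRequest
print(describe_hold(2, 12))  # GlobusGramError (GRAM error code 12)
print(describe_hold(7, 2))   # errno text depends on the local platform
```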
Log event codes
Submit | 0
Execute | 1
Executable error | 2
Checkpointed | 3
Job evicted | 4
Job terminated | 5
Image size | 6
Shadow exception | 7
Generic | 8
Job aborted | 9
Job suspended | 10
Job unsuspended | 11
Job held | 12
Job released | 13
Node execute | 14
Node terminated | 15
Post script terminated | 16
Globus submit | 17
Globus submit failed | 18
Globus resource up | 19
Globus resource down | 20
Remote error | 21
Job disconnected | 22
Job reconnected | 23
Job reconnect failed | 24
Grid resource up | 25
Grid resource down | 26
Grid submit | 27
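These codes are most often met when scanning a job's user log. The sketch below maps a code to its name and pulls the code out of an event header line. It rests on an assumption not stated on this page: that each event begins with a three-digit event number followed by `(cluster.proc.subproc)`. The sample line is made up for illustration, and `LOG_EVENTS` and `parse_header` are hypothetical names.

```python
import re

# Index in this list == log event code, copied from this section.
LOG_EVENTS = [
    "Submit", "Execute", "Executable error", "Checkpointed", "Job evicted",
    "Job terminated", "Image size", "Shadow exception", "Generic",
    "Job aborted", "Job suspended", "Job unsuspended", "Job held",
    "Job released", "Node execute", "Node terminated",
    "Post script terminated", "Globus submit", "Globus submit failed",
    "Globus resource up", "Globus resource down", "Remote error",
    "Job disconnected", "Job reconnected", "Job reconnect failed",
    "Grid resource up", "Grid resource down", "Grid submit",
]

# ASSUMPTION: event headers look like "005 (123.000.000) ...".
HEADER = re.compile(r"^(\d{3}) \((\d+)\.(\d+)\.(\d+)\)")


def event_name(code: int) -> str:
    """Return the event name for a code, or a fallback for unlisted codes."""
    if 0 <= code < len(LOG_EVENTS):
        return LOG_EVENTS[code]
    return f"Unknown({code})"


def parse_header(line: str):
    """Return (event name, cluster, proc) or None if the line is no header."""
    m = HEADER.match(line)
    if m is None:
        return None
    return event_name(int(m.group(1))), int(m.group(2)), int(m.group(3))


sample = "005 (123.000.000) 01/02 03:04:05 Job terminated."  # made-up line
print(parse_header(sample))  # ('Job terminated', 123, 0)
```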
Starters and Shadows
Shadow | Starter | Universe
jim | jim | PVM
V6 | V5 | Standard
V6.1 | V6.1 | Everything else
Queue-Management Commands
Number | Action
10001 | InitializeConnection
10002 | NewCluster
10003 | NewProc
10004 | DestroyCluster
10005 | DestroyProc
10006 | SetAttribute
10007 | CloseConnection
10008 | GetAttributeFloat
10009 | GetAttributeInt
10010 | GetAttributeString
10011 | GetAttributeExpr
10012 | DeleteAttribute
10013 | GetNextJob
10014 | FirstAttribute
10015 | NextAttribute
10016 | DestroyClusterByConstraint
10017 | SendSpoolFile
10018 | GetJobAd
10019 | GetJobByConstraint
10020 | GetNextJobByConstraint
10021 | SetAttributeByConstraint
10022 | InitializeReadOnlyConnection
10023 | BeginTransaction
10024 | AbortTransaction
10025 | SetTimerAttribute
ClassAd Log Commands
Number | Action
101 | NewClassAd
102 | DestroyClassAd
103 | SetAttribute
104 | DeleteAttribute
105 | BeginTransaction
106 | EndTransaction
DaemonCore Commands and Signals
From condor_includes/condor_commands.h
Name | Number
DC_RAISESIGNAL | 60000
DC_PROCESSEXIT | 60001
DC_CONFIG_PERSIST | 60002
DC_CONFIG_RUNTIME | 60003
DC_RECONFIG | 60004
DC_OFF_GRACEFUL | 60005
DC_OFF_FAST | 60006
DC_CONFIG_VAL | 60007
DC_CHILDALIVE | 60008
DC_SERVICEWAITPIDS | 60009
DC_AUTHENTICATE | 60010
DC_NOP | 60011
DC_RECONFIG_FULL | 60012
DC_FETCH_LOG | 60013
DC_INVALIDATE_KEY | 60014
DC_OFF_PEACEFUL | 60015
DC_SET_PEACEFUL_SHUTDOWN | 60016
DC_TIME_OFFSET | 60017
CEDAR Authentication Methods
From condor_includes/condor_auth.h and condor_io/condor_secman.C
Name | Number
CLAIMTOBE | 2
FS | 4
FS_REMOTE | 8
NTSSPI | 16
GSI | 32
KERBEROS | 64
ANONYMOUS | 128
SSL | 256
PASSWORD | 512
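Each authentication method is a distinct power of two, which suggests the numbers are meant to be OR-ed together into a bitmask. That reading is an assumption drawn from the values alone. Under that assumption, a mask could be decoded in Python roughly as follows; `AuthMethod` and `decode_auth_mask` are illustrative names, not Condor identifiers.

```python
from enum import IntFlag


class AuthMethod(IntFlag):
    """CEDAR authentication method bits, copied from this section."""
    CLAIMTOBE = 2
    FS = 4
    FS_REMOTE = 8
    NTSSPI = 16
    GSI = 32
    KERBEROS = 64
    ANONYMOUS = 128
    SSL = 256
    PASSWORD = 512


def decode_auth_mask(mask: int):
    """List the method names whose bit is set in mask (ASSUMED bitmask use)."""
    return [method.name for method in AuthMethod if method & mask]


print(decode_auth_mask(4 | 64))  # ['FS', 'KERBEROS']
print(decode_auth_mask(0))       # []
```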
Globus
GRAM Error Codes
From Globus 2.2.4, globus_gram_protocol-5.0/globus_gram_protocol_constants.h and globus_gram_protocol_error.c
Value | GLOBUS_GRAM_PROTOCOL_ERROR_... | Error String
0 | | Success
1 | PARAMETER_NOT_SUPPORTED | one of the RSL parameters is not supported
2 | INVALID_REQUEST | the RSL length is greater than the maximum allowed
3 | NO_RESOURCES | an I/O operation failed
4 | BAD_DIRECTORY | jobmanager unable to set default to the directory requested
5 | EXECUTABLE_NOT_FOUND | the executable does not exist
6 | INSUFFICIENT_FUNDS | of an unused INSUFFICIENT_FUNDS
7 | AUTHORIZATION | authentication with the remote server failed
8 | USER_CANCELLED | the user cancelled the job
9 | SYSTEM_CANCELLED | the system cancelled the job
10 | PROTOCOL_FAILED | data transfer to the server failed
11 | STDIN_NOT_FOUND | the stdin file does not exist
12 | CONNECTION_FAILED | the connection to the server failed (check host and port)
13 | INVALID_MAXTIME | the provided RSL 'maxtime' value is not an integer
14 | INVALID_COUNT | the provided RSL 'count' value is not an integer
15 | NULL_SPECIFICATION_TREE | the job manager received an invalid RSL
16 | JM_FAILED_ALLOW_ATTACH | the job manager failed in allowing others to make contact
17 | JOB_EXECUTION_FAILED | the job failed when the job manager attempted to run it
18 | INVALID_PARADYN | an invalid paradyn was specified
19 | INVALID_JOBTYPE | the provided RSL 'jobtype' value is invalid
20 | INVALID_GRAM_MYJOB | the provided RSL 'myjob' value is invalid
21 | BAD_SCRIPT_ARG_FILE | the job manager failed to locate an internal script argument file
22 | ARG_FILE_CREATION_FAILED | the job manager failed to create an internal script argument file
23 | INVALID_JOBSTATE | the job manager detected an invalid job state
24 | INVALID_SCRIPT_REPLY | the job manager detected an invalid script response
25 | INVALID_SCRIPT_STATUS | the job manager detected an invalid script status
26 | JOBTYPE_NOT_SUPPORTED | the provided RSL 'jobtype' value is not supported by this job manager
27 | UNIMPLEMENTED | unused ERROR_UNIMPLEMENTED
28 | TEMP_SCRIPT_FILE_FAILED | the job manager failed to create an internal script submission file
29 | USER_PROXY_NOT_FOUND | the job manager cannot find the user proxy
30 | OPENING_USER_PROXY | the job manager failed to open the user proxy
31 | JOB_CANCEL_FAILED | the job manager failed to cancel the job as requested
32 | MALLOC_FAILED | system memory allocation failed
33 | DUCT_INIT_FAILED | the interprocess job communication initialization failed
34 | DUCT_LSP_FAILED | the interprocess job communication setup failed
35 | INVALID_HOST_COUNT | the provided RSL 'host count' value is invalid
36 | UNSUPPORTED_PARAMETER | one of the provided RSL parameters is unsupported
37 | INVALID_QUEUE | the provided RSL 'queue' parameter is invalid
38 | INVALID_PROJECT | the provided RSL 'project' parameter is invalid
39 | RSL_EVALUATION_FAILED | the provided RSL string includes variables that could not be identified
40 | BAD_RSL_ENVIRONMENT | the provided RSL 'environment' parameter is invalid
41 | DRYRUN | the provided RSL 'dryrun' parameter is invalid
42 | ZERO_LENGTH_RSL | the provided RSL is invalid (an empty string)
43 | STAGING_EXECUTABLE | the job manager failed to stage the executable
44 | STAGING_STDIN | the job manager failed to stage the stdin file
45 | INVALID_JOB_MANAGER_TYPE | the requested job manager type is invalid
46 | BAD_ARGUMENTS | the provided RSL 'arguments' parameter is invalid
47 | GATEKEEPER_MISCONFIGURED | the gatekeeper failed to run the job manager
48 | BAD_RSL | the provided RSL could not be properly parsed
49 | VERSION_MISMATCH | there is a version mismatch between GRAM components
50 | RSL_ARGUMENTS | the provided RSL 'arguments' parameter is invalid
51 | RSL_COUNT | the provided RSL 'count' parameter is invalid
52 | RSL_DIRECTORY | the provided RSL 'directory' parameter is invalid
53 | RSL_DRYRUN | the provided RSL 'dryrun' parameter is invalid
54 | RSL_ENVIRONMENT | the provided RSL 'environment' parameter is invalid
55 | RSL_EXECUTABLE | the provided RSL 'executable' parameter is invalid
56 | RSL_HOST_COUNT | the provided RSL 'host_count' parameter is invalid
57 | RSL_JOBTYPE | the provided RSL 'jobtype' parameter is invalid
58 | RSL_MAXTIME | the provided RSL 'maxtime' parameter is invalid
59 | RSL_MYJOB | the provided RSL 'myjob' parameter is invalid
60 | RSL_PARADYN | the provided RSL 'paradyn' parameter is invalid
61 | RSL_PROJECT | the provided RSL 'project' parameter is invalid
62 | RSL_QUEUE | the provided RSL 'queue' parameter is invalid
63 | RSL_STDERR | the provided RSL 'stderr' parameter is invalid
64 | RSL_STDIN | the provided RSL 'stdin' parameter is invalid
65 | RSL_STDOUT | the provided RSL 'stdout' parameter is invalid
66 | OPENING_JOBMANAGER_SCRIPT | the job manager failed to locate an internal script
67 | CREATING_PIPE | the job manager failed on the system call pipe()
68 | FCNTL_FAILED | the job manager failed on the system call fcntl()
69 | STDOUT_FILENAME_FAILED | the job manager failed to create the temporary stdout filename
70 | STDERR_FILENAME_FAILED | the job manager failed to create the temporary stderr filename
71 | FORKING_EXECUTABLE | the job manager failed on the system call fork()
72 | EXECUTABLE_PERMISSIONS | the executable file permissions do not allow execution
73 | OPENING_STDOUT | the job manager failed to open stdout
74 | OPENING_STDERR | the job manager failed to open stderr
75 | OPENING_CACHE_USER_PROXY | the cache file could not be opened in order to relocate the user proxy
76 | OPENING_CACHE | cannot access cache files in ~/.globus/.gass_cache, check permissions, quota, and disk space
77 | INSERTING_CLIENT_CONTACT | the job manager failed to insert the contact in the client contact list
78 | CLIENT_CONTACT_NOT_FOUND | the contact was not found in the job manager's client contact list
79 | CONTACTING_JOB_MANAGER | connecting to the job manager failed. Possible reasons: job terminated, invalid job contact, network problems, ...
80 | INVALID_JOB_CONTACT | the syntax of the job contact is invalid
81 | UNDEFINED_EXE | the executable parameter in the RSL is undefined
82 | CONDOR_ARCH | the job manager service is misconfigured. condor arch undefined
83 | CONDOR_OS | the job manager service is misconfigured. condor os undefined
84 | RSL_MIN_MEMORY | the provided RSL 'min_memory' parameter is invalid
85 | RSL_MAX_MEMORY | the provided RSL 'max_memory' parameter is invalid
86 | INVALID_MIN_MEMORY | the RSL 'min_memory' value is not zero or greater
87 | INVALID_MAX_MEMORY | the RSL 'max_memory' value is not zero or greater
88 | HTTP_FRAME_FAILED | the creation of a HTTP message failed
89 | HTTP_UNFRAME_FAILED | parsing incoming HTTP message failed
90 | HTTP_PACK_FAILED | the packing of information into a HTTP message failed
91 | HTTP_UNPACK_FAILED | an incoming HTTP message did not contain the expected information
92 | INVALID_JOB_QUERY | the job manager does not support the service that the client requested
93 | SERVICE_NOT_FOUND | the gatekeeper failed to find the requested service
94 | JOB_QUERY_DENIAL | the jobmanager does not accept any new requests (shutting down)
95 | CALLBACK_NOT_FOUND | the client failed to close the listener associated with the callback URL
96 | BAD_GATEKEEPER_CONTACT | the gatekeeper contact cannot be parsed
97 | POE_NOT_FOUND | the job manager could not find the 'poe' command
98 | MPIRUN_NOT_FOUND | the job manager could not find the 'mpirun' command
99 | RSL_START_TIME | the provided RSL 'start_time' parameter is invalid
100 | RSL_RESERVATION_HANDLE | the provided RSL 'reservation_handle' parameter is invalid
101 | RSL_MAX_WALL_TIME | the provided RSL 'max_wall_time' parameter is invalid
102 | INVALID_MAX_WALL_TIME | the RSL 'max_wall_time' value is not zero or greater
103 | RSL_MAX_CPU_TIME | the provided RSL 'max_cpu_time' parameter is invalid
104 | INVALID_MAX_CPU_TIME | the RSL 'max_cpu_time' value is not zero or greater
105 | JM_SCRIPT_NOT_FOUND | the job manager is misconfigured, a scheduler script is missing
106 | JM_SCRIPT_PERMISSIONS | the job manager is misconfigured, a scheduler script has invalid permissions
107 | SIGNALING_JOB | the job manager failed to signal the job
108 | UNKNOWN_SIGNAL_TYPE | the job manager did not recognize/support the signal type
109 | GETTING_JOBID | the job manager failed to get the job id from the local scheduler
110 | WAITING_FOR_COMMIT | the job manager is waiting for a commit signal
111 | COMMIT_TIMED_OUT | the job manager timed out while waiting for a commit signal
112 | RSL_SAVE_STATE | the provided RSL 'save_state' parameter is invalid
113 | RSL_RESTART | the provided RSL 'restart' parameter is invalid
114 | RSL_TWO_PHASE_COMMIT | the provided RSL 'two_phase' parameter is invalid
115 | INVALID_TWO_PHASE_COMMIT | the RSL 'two_phase' value is not zero or greater
116 | RSL_STDOUT_POSITION | the provided RSL 'stdout_position' parameter is invalid
117 | INVALID_STDOUT_POSITION | the RSL 'stdout_position' value is not zero or greater
118 | RSL_STDERR_POSITION | the provided RSL 'stderr_position' parameter is invalid
119 | INVALID_STDERR_POSITION | the RSL 'stderr_position' value is not zero or greater
120 | RESTART_FAILED | the job manager restart attempt failed
121 | NO_STATE_FILE | the job state file doesn't exist
122 | READING_STATE_FILE | could not read the job state file
123 | WRITING_STATE_FILE | could not write the job state file
124 | OLD_JM_ALIVE | old job manager is still alive
125 | TTL_EXPIRED | job manager state file TTL expired
126 | SUBMIT_UNKNOWN | it is unknown if the job was submitted
127 | RSL_REMOTE_IO_URL | the provided RSL 'remote_io_url' parameter is invalid
128 | WRITING_REMOTE_IO_URL | could not write the remote io url file
129 | STDIO_SIZE | the standard output/error size is different
130 | JM_STOPPED | the job manager was sent a stop signal (job is still running)
131 | USER_PROXY_EXPIRED | the user proxy expired (job is still running)
132 | JOB_UNSUBMITTED | the job was not submitted by original jobmanager
133 | INVALID_COMMIT | the job manager is not waiting for that commit signal
134 | RSL_SCHEDULER_SPECIFIC | the provided RSL scheduler specific parameter is invalid
135 | STAGE_IN_FAILED | the job manager could not stage in a file
136 | INVALID_SCRATCH | the scratch directory could not be created
137 | RSL_CACHE | the provided 'gass_cache' parameter is invalid
138 | INVALID_SUBMIT_ATTRIBUTE | the RSL contains attributes which are not valid for job submission
139 | INVALID_STDIO_UPDATE_ATTRIBUTE | the RSL contains attributes which are not valid for stdio update
140 | INVALID_RESTART_ATTRIBUTE | the RSL contains attributes which are not valid for job restart
141 | RSL_FILE_STAGE_IN | the provided RSL 'file_stage_in' parameter is invalid
142 | RSL_FILE_STAGE_IN_SHARED | the provided RSL 'file_stage_in_shared' parameter is invalid
143 | RSL_FILE_STAGE_OUT | the provided RSL 'file_stage_out' parameter is invalid
144 | RSL_GASS_CACHE | the provided RSL 'gass_cache' parameter is invalid
145 | RSL_FILE_CLEANUP | the provided RSL 'file_cleanup' parameter is invalid
146 | RSL_SCRATCH | the provided RSL 'scratch_dir' parameter is invalid
147 | INVALID_SCHEDULER_SPECIFIC | the provided scheduler-specific RSL parameter is invalid
148 | UNDEFINED_ATTRIBUTE | a required RSL attribute was not defined in the RSL spec
149 | INVALID_CACHE | the gass_cache attribute points to an invalid cache directory
150 | INVALID_SAVE_STATE | the provided RSL 'save_state' parameter has an invalid value
151 | OPENING_VALIDATION_FILE | the job manager could not open the RSL attribute validation file
152 | READING_VALIDATION_FILE | the job manager could not read the RSL attribute validation file
153 | RSL_PROXY_TIMEOUT | the provided RSL 'proxy_timeout' is invalid
154 | INVALID_PROXY_TIMEOUT | the RSL 'proxy_timeout' value is not greater than zero
155 | STAGE_OUT_FAILED | the job manager could not stage out a file
156 | JOB_CONTACT_NOT_FOUND | the job contact string does not match any which the job manager is handling
157 | DELEGATION_FAILED | proxy delegation failed
158 | LOCKING_STATE_LOCK_FILE | the job manager could not lock the state lock file
159 | INVALID_ATTR | an invalid globus_io_clientattr_t was used.
160 | NULL_PARAMETER | an null parameter was passed to the gram library
161 | STILL_STREAMING | the job manager is still streaming output
162 | AUTHORIZATION_DENIED | the authorization system denied the request
163 | AUTHORIZATION_SYSTEM_FAILURE | the authorization system reported a failure
164 | AUTHORIZATION_DENIED_JOB_ID | the authorization system denied the request - invalid job id
165 | AUTHORIZATION_DENIED_EXECUTABLE | the authorization system denied the request - not authorized to run the specified executable
166 | RSL_USER_NAME | the provided RSL 'user_name' parameter is invalid.
167 | INVALID_USER_NAME | the job is not running in the account named by the 'user_name' parameter.
168 | LAST
GRAM Job States
Value | GLOBUS_GRAM_PROTOCOL_JOB_STATE_... | Description
1 | PENDING | The job is waiting for resources to become available to run.
2 | ACTIVE | The job has received resources and the application is executing.
4 | FAILED | The job terminated before completion because of an error, user-triggered cancel, or system-triggered cancel.
8 | DONE | The job completed successfully.
16 | SUSPENDED | The job has been suspended. Resources which were allocated for this job may have been released due to some scheduler-specific reason.
32 | UNSUBMITTED | The job has not been submitted to the scheduler yet, pending the reception of the GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_COMMIT_REQUEST signal from a client.
64 | STAGE_IN | The job manager is staging in files to run the job.
128 | STAGE_OUT | The job manager is staging out files generated by the job.
0xFFFFF | ALL | A mask of all job states.
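Since each job state is a single bit and ALL is described as a mask, a set of states can be represented by OR-ing the values together. The sketch below shows that arithmetic in Python using only the values listed for the GRAM job states; how GRAM clients actually pass such masks is not covered on this page, and `GRAM_JOB_STATES`, `GRAM_JOB_STATE_ALL`, and `states_in_mask` are illustrative names.

```python
# GRAM job state bit -> name suffix, copied from this section.
GRAM_JOB_STATES = {
    1: "PENDING",
    2: "ACTIVE",
    4: "FAILED",
    8: "DONE",
    16: "SUSPENDED",
    32: "UNSUBMITTED",
    64: "STAGE_IN",
    128: "STAGE_OUT",
}

GRAM_JOB_STATE_ALL = 0xFFFFF  # mask of all job states


def states_in_mask(mask: int):
    """Return the names of the listed states whose bit is set in mask."""
    return [name for bit, name in GRAM_JOB_STATES.items() if mask & bit]


print(states_in_mask(4 | 8))                    # ['FAILED', 'DONE']
print(len(states_in_mask(GRAM_JOB_STATE_ALL)))  # 8
```

The ALL mask has more bits set than there are listed states, so it also matches any state value added later in the unused bit positions.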
GRAM Signals
(Type: globus_gram_protocol_job_signal_t)
Value | GLOBUS_GRAM_PROTOCOL_JOB_SIGNAL_... | Description
1 | CANCEL | Cancel a job
2 | SUSPEND | Suspend a job
3 | RESUME | Resume a previously suspended job
4 | PRIORITY | Change the priority of a job
5 | COMMIT_REQUEST | Signal the job manager to commence with a job submission if the job request was accompanied by the (two_phase=yes) RSL attribute.
6 | COMMIT_EXTEND | Signal the job manager to wait an additional number of seconds (specified by an integer value string as the signal's argument) before timing out a two-phase job commit.
7 | STDIO_UPDATE | Signal the job manager to change the way it is currently handling standard output and/or standard error. The argument for this signal is an RSL containing new stdout, stderr, stdout_position, stderr_position, or remote_io_url relations.
8 | STDIO_SIZE | Signal the job manager to verify that streamed I/O has been completely received. The argument to this signal contains the number of bytes of stdout and stderr received, separated by a space. The reply to this signal will be a SUCCESS message if these matched the amount sent by the job manager. Otherwise, an error reply indicating GLOBUS_GRAM_PROTOCOL_ERROR_STDIO_SIZE is returned. If standard output and standard error are merged, only one number should be sent as an argument to this signal. An argument of -1 for either stream size indicates that the client is not interested in the size of that stream.
9 | STOP_MANAGER | Signal the job manager to stop managing the current job and terminate. The job continues to run as normal. The job manager will send a state change callback with the job status being FAILED and the error GLOBUS_GRAM_PROTOCOL_ERROR_JM_STOPPED.
10 | COMMIT_END | Signal the job manager to clean up after the completion of the job if the job RSL contained the (two_phase=yes) relation.
Misc
Well Known Ports
condor negotiator | 9614 | (obsolete, dynamic in 6.7.x)
condor collector | 9618
GT2 gatekeeper | 2119
gridftp | 2811
GT4 web services | 8443