{section: How to Rendezvous in the HTCondor Parallel Universe}

{subsection: Introduction}

In a parallel universe job, one node often needs to send a small amount of data to the other nodes of the same job. This might be an IP address and port, a filename, or some other value to synchronize on. HTCondor's condor_chirp command makes this straightforward. The basic idea is that the sending process writes the information into the job ad with condor_chirp set_job_attr, and the receivers poll the job ad for it with condor_chirp get_job_attr.

{subsection: Example submit file}

Assuming that the dedicated scheduler has been configured, create a submit file like this:

{code}
universe = parallel
executable = run.sh
should_transfer_files = yes
when_to_transfer_output = on_exit
getenv = true
output = out.$(Node)
error = err.$(Node)
log = log
machine_count = 2
+WantIOProxy = true
queue
{endcode}

{subsection: The run.sh script}

The run.sh script first checks whether it is the client or the server. Arbitrarily, we pick node 0 as the server and node 1 as the client. Node 0 picks a string, inserts it into the job ad attribute called JobServerAddress, and sleeps. Node 1 polls for this value, prints it once it is available, then exits. In a real job, this value could be the address and port the server is listening on, or anything else the two processes would like to synchronize on.

{code}
#!/bin/sh

CONDOR_CHIRP="$(condor_config_val LIBEXEC)/condor_chirp"

# Tell chirp where the chirp config file is
export _CONDOR_CHIRP_CONFIG="$_CONDOR_SCRATCH_DIR/.chirp_config"

if [ "$_CONDOR_PROCNO" = "0" ]
then
    # I'm the server: publish the address in the job ad.
    # The inner double quotes make the value a ClassAd string.
    "$CONDOR_CHIRP" set_job_attr JobServerAddress '"This is my address"'

    # Do server things here...
    sleep 3600
    exit 0
fi

if [ "$_CONDOR_PROCNO" = "1" ]
then
    # I'm the client: poll until the server has set the attribute
    until "$CONDOR_CHIRP" get_job_attr JobServerAddress > /dev/null 2>&1
    do
        sleep 2
    done

    JobServerAddress=$("$CONDOR_CHIRP" get_job_attr JobServerAddress)

    # I've got the address, do client things here, like connect to the server...
    echo "JobServerAddress is $JobServerAddress"
    exit 0
fi

exit 0
{endcode}
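
The client loop above polls forever if the server never sets the attribute. A bounded variant might look like the sketch below. The function name wait_for_attr, its arguments, and the CHIRP variable are all hypothetical and not part of HTCondor. Like run.sh, the sketch assumes that condor_chirp get_job_attr exits with a nonzero status while the attribute is not yet set.

```shell
#!/bin/sh
# Hypothetical helper, not part of HTCondor: poll a job ad attribute
# until it appears or a retry limit is reached.

# Command used to talk to chirp. Defaults to condor_chirp on the PATH;
# override it (for example with the LIBEXEC path used in run.sh).
CHIRP=${CHIRP:-condor_chirp}

# wait_for_attr NAME [MAX_TRIES] [DELAY_SECONDS]
# Prints the attribute value and returns 0 on success.
# Returns 1 if the attribute is still unavailable after MAX_TRIES attempts.
wait_for_attr() {
    attr=$1
    max_tries=${2:-30}
    delay=${3:-2}
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        # Assumption: get_job_attr fails (nonzero exit) until the attribute is set
        if value=$($CHIRP get_job_attr "$attr" 2>/dev/null) && [ -n "$value" ]; then
            printf '%s\n' "$value"
            return 0
        fi
        tries=$((tries + 1))
        sleep "$delay"
    done
    return 1
}
```

In run.sh, the client branch could then call something like JobServerAddress=$(wait_for_attr JobServerAddress 60 2) and exit with an error if the call fails, rather than waiting indefinitely for a server that may have died.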