{verbatim} Flow of the TransferDaemon -------------------------- Start of TD daemon: + Globally construct a TransferD object. MAIN() CONTROL: libc calls main(): + Set up daemoncore, call dc_main() which calls main_init() main_init(): + Call g_td.init(); + Parse command line and create object representing command line options. + The TransferD object has a "Features" object called m_features. This holds command line options and all their defaults. + The valid command line options are: --schedd To which schedd should the transferd register itself. --stdin Get TransferRequests from stdin. --id Used by the transferd to identify itself to the schedd. A shared "secret" used only to pair an incoming request of a registering transferd and the schedd's forking of it. This is because the schedd can fork more than one at a time and need to know which registration mmatches which reason the schedd needed it. --timeout If there is no work to do, exit in number of seconds. --shadow [DEMO option, may not be used in production] The transferd must get a transferrequest via stdin, and then will connect to a shadow to perform up/down load as requested. + If using stdin, call TransferD::accept_transfer_request(). + Read the first line of stdin, this represents an "encapsulation method" which dictates how to process the stuff that comes after. The function encap_method() converts this line to an enum. Currently, there is only one: ENCAP_METHOD_OLD_CLASSADS (and ENCAP_METHOD_UNKNOWN which means I don't know what kind to use). + If the encapsulation is validly ENCAP_METHOD_OLD_CLASSADS, then call the function accept_transfer_request_encapsulation_old_classads(). [accept_transfer_request_encapsulation_old_classads()]: + Read the TransferRequest ClassAd from stdin. This details how many additional job classads are coming down the pipe for this transfer. + Create a single TransferRequest object from the above ad. + Read multiple work ads (each representing a jobad which the FileTransfer Object will use as a unit of work for a transfer) and store them into the TransferRequest object. + Since we can only have one transfer request given to the transferd in the stdin mode, we will generate a "capability" that represents the TransferRequest and associate the transfer request with the capability. + Since we have work to do, ensure our inactivity timer will not be fired. + Mark ourselves initialized. + Tell the schedd I'm alive and ready: [g_td.register_to_schedd()] + Determine the location of the schedd from the Features object. + Get my own sinful string. + Create a DCSchedd object to the schedd. + The DCSched object knows how to register the treansferd via the schedd.register_transferd(...) call: [DCSchedd::register_transferd()] + StartCommand the TRANSFERD_REGISTER command to the schedd. + Force authentication + Send a registration request classad that includes: - ATTR_TREQ_TD_SINFUL: sinful string of TransferDaemon - ATTR_TREQ_TD_ID: id string of secret key given by the schedd on the command line. + The reponse ad from the schedd contains: - ATTR_TREQ_INVALID_REQUEST which is either TRUE or FALSE - If FALSE, then ATTR_TREQ_INVALID_REASON is defined. + If the registration is: Not valid: EXCEPT Valid: Store the socket to the schedd as m_update_sock. This is a connection that is used to send updates about TransferRequests from the transferd to the schedd. + Return from g_td.init(); + Call g_td.register_timers(): + Register: TransferD::process_active_requests_timer() This timer is commented out. I apologize because I don't know why, and this is one of the major means by which the TransferD performs its functions. I think it was commented out so all of this stuff could be merged into mainline HTCondor and not affect anything before we could finish it all. + Register: TransferD::exit_due_to_inactivity_timer() This will cause the transferd to exit if ther eis no work to perform. Coupled with the above, it means in mainline HTCondor, if the transferd starts, it will automatically exit because it won't process any of its queued work. + Call g_td.register_handlers(): + Register: DUMP_STATE -> TransferD::dump_state_handler This function will emit information to condor_squawk. + Register: TRANSFERD_CONTROL_CHANNEL -> TransferD::setup_transfer_request_handler() This function manages a permanent connection to the schedd and is the control channel between the schedd and the transferd. There are scalability issues with this architecture which were known about, but we decided to solve later once the architecture of how the schedd communicated to the transferd was more finalized. + Register: TRANSFERD_WRITE_FILES -> TransferD::write_files_handler(); This function uses some mechanism (like the file transfer object itself) to write transfered files into a local directory. It could be the spool, the initialdir, or whatever. + Register: TRANSFERD_READ_FILES -> TransferD::read_files_handler(); This function uses some mechanism (like the file transfer object itself) to read transferring files from a local directory. It could be the spool, the initialdir, or whatever. + Register a Repaer: "Reaper" -> TransferD::reaper_handler() This function handles the exiting of any process which wasn't a process devoted specifically to reading/writing files. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Handler DUMP_STATE: TransferD::dump_state_handler ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; When this handler is invoked, it constructs a classad and sends it back on the wire. The contents of this classad can includes all sorts of status information or other statistics about the transferd. Currently, it will create a classad with: Uid = getuid() OutstandingTransferRequests = and send it back to the client which requested it. Afterwards, the stream is not kept and daemoncore closes the connection. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; Handler TRANSFERD_CONTROL_CHANNEL: TransferD::setup_transfer_request_handler() ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; ASIDE: The transferd registration processes looks like this: 1. Schedd fork/execs transferd with secret key on command line. 2. TransferD connects back to the schedd with the secret key and closes the connection. 3. Schedd validates the key and sinful of the transferd with what it is expecting. 4. The schedd connects back to transferd, this is called the control connection, and the tether stays active until the transferd exits. More transfer requests, among other things, can come down the control connection. This handler manages step 4: + Make sure we're authenticated by calling rsock->triedAuthentication(). + Register_Socket the control socket to handle any future TransferRequests from the schedd: Control Socket Fd -> TransferD::accept_transfer_request_handler(); + We keep the stream alive, and return. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; TRANSFERD_WRITE_FILES -> TransferD::write_files_handler() ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; This handler is called when a client wishes to send a TransferRequest to the TransferD which results in files being written *to* the TransferD. How the actual files themselves are written depends upon the ATTR_TREQ_FTP attribute in the request ad from the client. FTP in this attribute means "File Transfer Protocol", but NOT in the same manner as the ftp program. The values of that attribute dictate HOW the transferd should accept and write those files. Currently, the only supported method is the enumeration FTP_CFTP, which means "File Transfer Protocol: HTCondor File Transfer Protocol". + Make sure we're authenticated by calling rsock->triedAuthentication(). + Get the fully qualified user name fromthe socket. + Grab the ClassAd from the socket, and lookup the capability string. This capability represents a TransferRequest association, not a FileTransfer capability as found in a job ad. + If the capability is unknown, abort the request with a classad and close the connection, otherwise, find the TransferRequest object associated with the capability. + Get the ATTR_TREQ_FTP protocol from the request ad. It can only be FTP_CFTP because I had implemented one protocol so far! + Ensure that the request is ONLY of the "upload" variety. Upload in this context means the transfer is ocurring from the client to the TransferD. + If everything is ok, send a success classad back to the client. [Now we set up a "thread" using DaemonCore CreateThread which _actually_ performs the file transfer in an asynchronous manner using the FTP protocol defined.] + Create a reaper to handle the call to TransferD::write_files_thread() which will enact the actual transfer. + Create a ThreadArg object which holds the FTP protocol enumeration and the TransferRequest. + Create the TransferD::write_files_thread() thread passing in the ThreadArg object (so it knows what to do) and associating it with the aforementioned reaper id. + TODO: If this creation fails, I don't do anything! I should. + Associate the newly created thread with its reaper id in a data structure so the transferd can refer to it later when it finishes. + Return, closing the stream to the client. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; THREAD TransferD::write_files_thread ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; The purpose of this thread, which was created in TransferD::write_files_handler(), is to make asynchronous the act of having a client upload a set of files in a TransferRequest to the TransferD. Otherwise the TransferD would have to block while the transfer happened and that would suck _a lot_. This function is running in either another thread, or another process under the TransferD. [NOTE: This function is barely implemented to just exactly perform the task at hand. It needs obvious improvement before it can be production quality.] + First thing we do is sleep(1), it is a hack that the comments explain. + We pick out the protocol and TransferRequest form the passed in ThreadArg. + TODO: We should have different actions based upon the FTP protocol here, however, we don't and just assume FTP_CFTP in the control flow. This needs to be fixed for it to be production code. + The client is actually waiting to perform the transfer on this socket, so we'll set the timeout to be very large (8 hours) so the transfer can finish without timeouts happening. + Now, we iterate over the tasks in the TransferReqeust object. Each task is a job ad that we can instantiate a FileTransfer object with. For each job ad in the TransferRequest: + Create a FileTransfer object. + SimpleInit() it. + SetPeetVersion() whatever the TranferRequest stated. + DownloadFiles() and store all the files in the local file system wherever the job ad stated it should go. + When we're done will all of them it is end of message to the client. + Send back a classad to the client that everything is ok. + Exit the thread/process with EXIT_SUCCESS. The reaper cares about this. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; REAPER TransferD::write_files_reaper ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; This handler gets called with a write_files_thread image/thread exits. The main act of this handler is to inform the schedd that the TransferRequest has been attempted (and hopefully completed!) and what the results of that was. + Ensure that the just exited transfer thread that I just caught the reaped status of is something that I actually asked to happen. This is a consistency check. + Construct a status ad which contains the capability of the TransferRequest and how the sub-thread/process succeeded or failed. + Send the status ad back to the schedd. + Destroy the TransferRequest. It is completely finished. If it failed, the idea is that the schedd will send another one with the same tasks to be performed. + If there are no more requests, then set the inactivity timer. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; TRANSFERD_READ_FILES -> TransferD::read_files_handler() ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; [This handler is a very similar analogue to TransferD::write_files_handler().] ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; THREAD TransferD::read_files_thread ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; [This thread is a very similar analogue to TransferD::write_files_thread().] The only _real_ difference is that UploadFiles() is called on the FileTransfer object. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; REAPER TransferD::read_files_reaper ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; [This reaper is a very similar analogue to TransferD::write_files_reaper().] ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; TIMER TransferD::exit_due_to_inactivity_timer() ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; + If m_inactivity_timer == 0, the timer is off and we return. + Otherwise, we subtract m_inactivity_timer from now, and see if it is greater then the timeout specified (on the command line, or by default). + If we timeout due to inactivity, we DC_Exit(TD_EXIT_TIMEOUT).from now, and see if it is greater then the timeout specified (on the command line, or by default). + If not, the timer fires again later... ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; TIMER TransferD::process_active_requests_timer() DEMO code! ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; The purpose of this timer is to see if there are any pending pending TransferRequests and to process them. Currently this code path is demo code in that it performs the first active TransferRequest it knows about, and then exits the process. ASIDE: TransferRequests have two main modes, and a third hacked mode. The first mode is ACTIVE. This means the transferd itself initiates the processing of a TransferRequest. The second mode is PASSIVE. This means that a client initiated the processing of a TransferRequest. The hacked one is ACTIVE_SHADOW. This means that the transferd is going to be initiating the handling of a TransferRequest, except it will do so only be hard coded communication with a condor_shadow. WARNING: This timer code is smushy in that it was never fully developed and/or finished. Coupled with that, there is some demo code in here that connects to a running shadow to pull files from wherever the shadow states to get them (which in this case is itself). + Iterate over the known about TransferRequests + [TREQ_MODE_ACTIVE: No behavior implemented, TODO in source] + [TREQ_MODE_PASSIVE: Don't do anything, since a client is managing it] + [TREQ_MODE_ACTIVE_SHADOW: Demo code, talk to a shadow and process it] process_active_shadow_request(): + Figure out how many jobads (AKA transfer ads) to process. However I only will be processing the first jobad in the TransferRequest! + Create a FileTransfer object. Init() it so it uses Daemoncore. + Set up a callback to TransferD::active_shadow_transfer_completed() which will get called by Daemoncore when the file transfer is completed. + Depending upon if the TransferD was invoked with 'upload' or 'download' on the command line, tell the FileTransfer object to UploadFiles() or DownloadFiles() respectively. This is a nasty hack since an active TransferRequest should know if it is uploading or downloading--the actual TransferRequest object can know this information, but who specified it wasn't completed. + If I found any active TransferRequests, return that I found some. ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; CALLBACK TransferD::active_shadow_transfer_completed() ;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;;; This is the control continuation from TransferD::process_active_requests_timer(). + Ensure the file transfer suceeded, except otherwise. + DC_Exit(); {endverbatim}