Parsing is line-by-line. parse() is basically a large dispatch table: it looks at the first token in a line and then passes control to a subroutine that finishes parsing that line. If a line begins with '#', it is ignored. Unfortunately, a '#' comment character that appears later in a line causes a parse error, so trailing comments are not supported.

The hairiest part of parsing is splices. The splice DAG is parsed, along with any sub-splices, and then the splice is "lifted" into its parent DAG. It would be nice if splices, along with nested DAGs, were represented in code as subdags; then we could just have subdags, and a splice could be parsed "lazily".

This code is all in parse.cpp. dagman_main calls parse(); when it returns, the DAG nodes are in memory.

(TODO: this section probably needs something about partial rescue DAGs.)

(Note: parsing has since been changed to run in two passes to allow more flexible command ordering; there's a ticket for this.) Note that this is kind of tricky if INCLUDE commands are used...
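To make the first-token dispatch concrete, here is a minimal sketch of the idea. It is not the code in parse.cpp: SketchDag, sketch_parse, and sketch_parse_job are hypothetical names, only a JOB handler is shown, and that handler is assumed to take exactly a node name and a submit file. Because it rejects extra tokens, a trailing '#' comment becomes a parse error, which mirrors the behavior described above.

```cpp
#include <functional>
#include <istream>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the in-memory DAG; not DAGMan's real class.
struct SketchDag {
    std::vector<std::string> jobs;    // node names taken from JOB lines
    std::vector<std::string> errors;  // one message per rejected line
};

// A handler receives whatever follows the first token on the line.
using LineHandler = std::function<bool(SketchDag&, std::istringstream&)>;

// Hypothetical handler for "JOB <name> <submit-file>".
// Assumption: exactly two arguments; anything extra is an error.
bool sketch_parse_job(SketchDag& dag, std::istringstream& rest) {
    std::string name, submit_file;
    if (!(rest >> name >> submit_file)) return false;
    std::string extra;
    if (rest >> extra) return false;  // e.g. a trailing "# comment"
    dag.jobs.push_back(name);
    return true;
}

// Reads the stream line by line and dispatches on the first token.
// Returns true only if every line parsed cleanly.
bool sketch_parse(std::istream& in, SketchDag& dag) {
    // The "large table": first token -> subroutine for that command.
    static const std::map<std::string, LineHandler> table = {
        {"JOB", sketch_parse_job},
    };

    std::string line;
    int line_no = 0;
    while (std::getline(in, line)) {
        ++line_no;
        // Only whole-line comments are recognized.
        if (!line.empty() && line[0] == '#') continue;

        std::istringstream tokens(line);
        std::string keyword;
        if (!(tokens >> keyword)) continue;  // blank line (assumed to be skipped)

        auto it = table.find(keyword);
        if (it == table.end() || !it->second(dag, tokens)) {
            dag.errors.push_back("line " + std::to_string(line_no) +
                                 ": cannot parse: " + line);
        }
    }
    return dag.errors.empty();
}
```

Feeding this sketch the lines "JOB A a.sub" and "JOB B b.sub # trailing" records node A and reports the second line as an error. Adding a command in this style is just one more table entry and one more handler function.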