Note: This document is still being written (2016-06-14).

This document explains why you might want to use the DAGMan _external sub-DAG_ and _splice_ features, and which one to choose in a particular situation. (See {link: http://research.cs.wisc.edu/htcondor/manual/v8.5/2_10DAGMan_Applications.html#SECTION003108900000000000000} and {link: http://research.cs.wisc.edu/htcondor/manual/v8.5/2_10DAGMan_Applications.html#SECTION0031081000000000000000} for detailed information about external sub-DAGs and splices, respectively.)

*When should I use one of these features?*

Both external sub-DAGs and splices allow you to compose a large workflow from sub-pieces that are defined in individual DAG files. This is the basic motivation for using either feature: you want to create a single workflow from a number of DAG files, either because the smaller DAG files already exist, or because it's easier to deal with sub-parts of the workflow. (One use case might be that you have sub-workflows that you want to combine in different ways to make different overall workflows.)

Some reasons to use external sub-DAGs or splices:

*: Create a workflow from separate sub-workflows
*: Dynamically create parts of the workflow (external sub-DAGs only)
*: Re-try multiple nodes as a unit (external sub-DAGs only)
*: Short-circuit parts of the workflow (external sub-DAGs only)

*Feature comparison*

Here's a table comparing external sub-DAGs and splices. Bold entries mark the option that is advantageous for a given feature.
| *Feature* | *External sub-DAGs* | *Splices* | *Notes* |
| Incorporate separate DAG files | *yes* | *yes* | |
| Rescue DAGs | *yes* | *yes* | |
| DAGMan recovery | *yes* | *yes* | |
| Multiple DAGMan instances | yes | *no* | |
| Possible combinatorial explosion of dependencies | *no* | yes | Until we implement socket nodes for splices |
| Dynamic creation of sub-workflows | *yes* | no | |
| PRE/POST scripts on sub-workflows | *yes* | no | Until we implement socket nodes for splices |
| Retries of sub-workflows | *yes* | no | |
| Workflow-wide throttling | no | *yes* | |
| Per-sub-workflow throttling | yes | no | |
| Workflow-wide node categories | no | *yes* | |
| Node priorities on sub-workflows | *yes* | no | |
| Reduce memory footprint of large workflows | *yes?* | no | If used properly |
| Per-sub-workflow final nodes | *yes* | no | |
| Abort sub-workflows individually | *yes* | no | |
| Variables associated with sub-workflows | *yes* | no | |
| Separate configuration for sub-workflows | yes | no | Can be good or bad |
| One node status file, etc., for entire workflow | no | *yes* | |

*Should I use external sub-DAGs or splices?*

The simple answer is that, unless you need one of the features that's available with external sub-DAGs but not with splices, you should use splices. Splices are generally simpler... (add more stuff here)

*How to use external sub-DAGs to reduce workflow memory footprint*

(Coming soon!)

Note: This document is valid for HTCondor version 8.5.5.
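
*Example: both features in one DAG file*

As a rough illustration of the comparison above, here is a minimal sketch of a top-level DAG file that uses an external sub-DAG for one sub-workflow and a splice for another. All file and node names (top.dag, setup.sub, analysis.dag, cleanup.dag, make_analysis_dag.sh) are hypothetical and chosen only for this example. The RETRY and SCRIPT PRE lines show features that the table lists as available for external sub-DAGs but not for splices.

```
# top.dag -- illustrative only; every file and node name here is made up

# An ordinary node
JOB setup setup.sub

# External sub-DAG: analysis.dag runs under its own DAGMan instance
SUBDAG EXTERNAL analysis analysis.dag

# Re-try the whole sub-workflow as a unit (up to 2 retries)
RETRY analysis 2

# A PRE script on the sub-DAG node; assumed here to generate
# analysis.dag just before the sub-workflow starts
SCRIPT PRE analysis make_analysis_dag.sh

# Splice: the nodes of cleanup.dag are merged into this DAG,
# so they run under the same DAGMan instance as top.dag
SPLICE cleanup cleanup.dag

# Both sub-workflows take part in dependencies like ordinary nodes
PARENT setup CHILD analysis
PARENT analysis CHILD cleanup
```

Assuming the files above exist, the workflow would be submitted with condor_submit_dag top.dag. If the retry and the PRE script were not needed, the analysis sub-workflow could be declared with SPLICE as well, which keeps the whole workflow under a single DAGMan instance.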