This document explains why you might want to use the DAGMan external sub-DAG and splice features, and which one to choose in a particular situation. (See http://research.cs.wisc.edu/htcondor/manual/v8.5/2_10DAGMan_Applications.html#SECTION003108900000000000000 and http://research.cs.wisc.edu/htcondor/manual/v8.5/2_10DAGMan_Applications.html#SECTION0031081000000000000000 for detailed information about external sub-DAGs and splices, respectively.)

When should I use one of these features?

Both external sub-DAGs and splices let you compose a large workflow from sub-pieces that are defined in individual DAG files. That is the basic motivation for using either feature: you want to create a single workflow from a number of DAG files, either because the smaller DAG files already exist or because the workflow is easier to manage in sub-parts. (For example, you might have sub-workflows that you want to combine in different ways to make different overall workflows.)
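
As a rough illustration of this kind of composition, a top-level DAG file might pull in one existing DAG file as a splice and another as an external sub-DAG. This is only a sketch: all file names and node names below are hypothetical.

```
# top.dag (hypothetical file and node names)
JOB setup setup.sub

# Splice: the nodes of preprocess.dag are merged into this DAG,
# so they are run by the same DAGMan instance.
SPLICE preprocess preprocess.dag

# External sub-DAG: analysis.dag is run by its own DAGMan instance,
# which appears as a single node of this DAG.
SUBDAG EXTERNAL analysis analysis.dag

JOB cleanup cleanup.sub

PARENT setup CHILD preprocess
PARENT preprocess CHILD analysis
PARENT analysis CHILD cleanup
```

In both cases the sub-workflow name (preprocess, analysis) can be used in PARENT/CHILD lines like an ordinary node name.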

Some reasons to use external sub-DAGs or splices:

Feature comparison

Here's a table comparing external sub-DAGs and splices. Note that the bold entries are the ones that are advantageous for a given feature.

Feature                                          | External sub-DAGs | Splices | Notes
-------------------------------------------------+-------------------+---------+--------------------------------------------
Incorporate separate DAG files                   | yes               | yes     |
Rescue DAGs                                      | yes               | yes     |
DAGMan recovery                                  | yes               | yes     |
Multiple DAGMan instances                        | yes               | no      |
Possible combinatorial explosion of dependencies | no                | yes     | Until we implement socket nodes for splices
Dynamic creation of sub-workflows                | yes               | no      |
PRE/POST scripts on sub-workflows                | yes               | no      | Until we implement socket nodes for splices
Retries of sub-workflows                         | yes               | no      |
Workflow-wide throttling                         | no                | yes     |
Per-sub-workflow throttling                      | yes               | no      |
Workflow-wide node categories                    | no                | yes     |
Node priorities on sub-workflows                 | yes               | no      |
Reduce memory footprint of large workflows       | yes?              | no      | If used properly
Per-sub-workflow final nodes                     | yes               | no      |
Abort sub-workflows individually                 | yes               | no      |
Variables associated with sub-workflows          | yes               | no      |
Separate configuration for sub-workflows         | yes               | no      | Can be good or bad
One node status file, etc., for entire workflow  | no                | yes     |
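
Several of the sub-DAG-only rows follow from the fact that an external sub-DAG appears in the parent DAG as a single node, so node-level commands can be attached to it. The sketch below shows PRE/POST scripts, retries, and dynamic creation together. The script names and file names are hypothetical, and it assumes the behavior described in the DAGMan manual that a sub-DAG file does not have to exist until its node is about to be submitted.

```
# outer.dag (hypothetical file, node, and script names)
JOB plan plan.sub

# analysis.dag does not need to exist when outer.dag is submitted.
SUBDAG EXTERNAL analysis analysis.dag

# The PRE script writes analysis.dag just before the sub-DAG is submitted.
SCRIPT PRE analysis make_analysis_dag.sh

# The POST script sees the exit status of the sub-DAG as a whole.
SCRIPT POST analysis check_analysis.sh $RETURN

# Retry the entire sub-workflow up to 2 times if it fails.
RETRY analysis 2

PARENT plan CHILD analysis
```

None of these commands can be applied to a splice as a unit, because a splice's nodes are merged directly into the parent DAG.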

Should I use external sub-DAGs or splices?

The simple answer is that, unless you need one of the features that's available with external sub-DAGs but not with splices (see the table above), you should use splices. Splices are generally simpler and have less overhead than external sub-DAGs (unless the workflow is specifically designed to minimize the external sub-DAG overhead). Also, workflow-wide throttling is generally more useful than separate throttles for sub-parts of the workflow.
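
To illustrate workflow-wide throttling with splices, the sketch below puts nodes from two splices into one category and throttles that category from the top-level DAG. It assumes the cross-splice category syntax described in the DAGMan manual (a category name prefixed with +); the file names, node names, and the limit of 5 are hypothetical.

```
# top.dag (hypothetical names)
SPLICE first  first.dag
SPLICE second second.dag

# At most 5 nodes in the +transfer category run at once,
# counted across both splices.
MAXJOBS +transfer 5
```

```
# first.dag (second.dag would assign its own nodes the same way)
JOB fetch fetch.sub
CATEGORY fetch +transfer
```

A simpler workflow-wide limit is the -maxjobs option of condor_submit_dag, which applies to all nodes that a single DAGMan instance runs, including those from splices.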

How to use external sub-DAGs to reduce workflow memory footprint

(Coming soon!)

Note: This document is valid for HTCondor version 8.5.5.