This is a very early draft.


Matlab is an excellent match for Condor. Millions of Matlab jobs, possibly tens of millions, have been successfully run under Condor over the years. The specifics of working with Matlab will vary from site to site, but here are some general guidelines.

This assumes basic familiarity with Matlab and Condor. It assumes that you can set up and maintain at least a small Condor pool.

Licensing

Perhaps the biggest challenge to running Matlab under Condor is licensing. Matlab is proprietary software with strict licensing terms. You might have one or two licenses for your desktop computers, but to run hundreds of simultaneous Matlab jobs you would require hundreds of licenses.

Your institution may already have acquired suitable licensing for Matlab. If you have a fixed number of licenses available, Condor's Concurrency Limit support can help.
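As a rough illustration of the Concurrency Limit approach, the sketch below prints a submit description that claims one unit of a limit. The limit name MATLAB, the wrapper name run_matlab.sh, and the count of 10 are assumptions made for this example; use whatever names and license counts your site has actually defined.

```shell
#!/bin/sh
# Sketch only: the limit name MATLAB, the wrapper name run_matlab.sh, and the
# count of 10 are illustrative assumptions, not values taken from this page.
#
# The pool's negotiator configuration would also need a line such as:
#   MATLAB_LIMIT = 10
# so that at most 10 jobs claiming this limit run at the same time.

# print_limited_submit WRAPPER
# Print a submit description whose job claims one unit of the MATLAB limit.
print_limited_submit() {
    cat <<EOF
executable = $1
concurrency_limits = MATLAB
queue
EOF
}

print_limited_submit run_matlab.sh
```

Jobs that do not list the limit in their submit file are not counted against it, so every Matlab job at the site would need to include the concurrency_limits line for the license count to be respected.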

Another option is the Matlab Compiler, an optional toolbox for Matlab. In general, executables created with the Matlab Compiler are not subject to Matlab's license, so you are free to run as many in parallel as you like. (Check your Matlab licensing to confirm this.)

If you are using third party Matlab add-ons, you will need to check the licensing on them as well.

Executables

You will need to make the Matlab executable and supporting libraries available on every computer on which your jobs might run. You might install Matlab on each computer, install it on a shared filesystem, or bring Matlab along with you.

For executables compiled with the Matlab Compiler, instead of the full Matlab install, you will need the MCR, the Matlab Compiler Runtime, but the general techniques are the same.

If you install Matlab locally on each computer, place it in the same location on every machine so that individual jobs can easily find it. If Matlab will be in different locations on different computers, you can use STARTD_ATTRS to advertise the correct location in each machine's ClassAd, and the job can use a $$ expression to look it up. For example, your configuration file might say:

MATLAB_PATH  = "/opt/matlab/bin"
STARTD_ATTRS = $(STARTD_ATTRS) MATLAB_PATH

A user might specify in their submit file (simplified):

executable = my_task
arguments = $$(MATLAB_PATH)
transfer_input_files= my_task.m
queue

my_task would then be a script that did something like this:

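#! /bin/sh
# $1 is the directory holding the matlab launcher, filled in from $$(MATLAB_PATH).
# -r runs the named script (my_task.m in the current directory), then exits.
exec "$1"/matlab -nodisplay -r "my_task; exit"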

Similar scripts would work on Windows.
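If the advertised path is wrong on some machine, the wrapper above fails with a fairly opaque error. A minimal sketch of a more defensive check follows; the function name resolve_matlab is made up for this example.

```shell
#!/bin/sh
# Sketch only: resolve_matlab is a hypothetical helper name.

# resolve_matlab DIR
# Print DIR/matlab if it is an executable file; otherwise report the problem
# on stderr and return non-zero so the job fails with a clear message.
resolve_matlab() {
    candidate="$1/matlab"
    if [ -x "$candidate" ] && [ ! -d "$candidate" ]; then
        printf '%s\n' "$candidate"
    else
        printf 'matlab not found or not executable in: %s\n' "$1" >&2
        return 1
    fi
}

# In a wrapper script this might be used as:
#   matlab_bin=$(resolve_matlab "$1") || exit 1
# followed by an exec of "$matlab_bin" with the same options as before.
```

The message goes to stderr, so it ends up in the file named by the job's error setting, which makes a misconfigured machine easier to identify.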

From the job's point of view, installing Matlab on a shared filesystem works the same way as installing it on each local computer, but it may be simpler to administer.

To bring Matlab along yourself, you would need to package it up, transfer it with the job (transfer_input_files), then have a script unpack Matlab and start it. Matlab is a large piece of software, so this could be slow. Each job will also temporarily consume a significant amount of disk space for its own copy of Matlab, and if multiple Matlab jobs run on the same computer simultaneously, multiple copies will be unpacked at once. Generally speaking this is not recommended, although it is more practical for the MCR.
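The unpack-and-start step might look roughly like the sketch below. The function name unpack_and_run, the directory name unpacked_runtime, and the RUNTIME_DIR environment variable are all invented for this example, and it assumes the archive is a gzip-compressed tar file that was transferred with the job.

```shell
#!/bin/sh
# Sketch only: unpack_and_run, unpacked_runtime, and RUNTIME_DIR are
# hypothetical names chosen for this example.

# unpack_and_run ARCHIVE COMMAND [ARGS...]
# Unpack ARCHIVE into a scratch directory, run COMMAND with RUNTIME_DIR
# pointing at that directory, then remove the unpacked copy. Returns the
# exit status of COMMAND.
unpack_and_run() {
    archive=$1
    shift
    runtime_dir="$PWD/unpacked_runtime"
    mkdir -p "$runtime_dir" || return 1
    if ! tar -xzf "$archive" -C "$runtime_dir"; then
        rm -rf "$runtime_dir"
        return 1
    fi
    RUNTIME_DIR="$runtime_dir" "$@"
    status=$?
    # Clean up so the unpacked copy does not linger after the job.
    rm -rf "$runtime_dir"
    return "$status"
}

# Example use inside a job wrapper (file names are placeholders):
#   unpack_and_run runtime.tar.gz ./start_my_task.sh
```

Unpacking under the current directory keeps each job's copy inside its own Condor scratch directory, so simultaneous jobs on one machine do not overwrite each other, at the cost of the extra disk space described above.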

Invoking

Matlab may try to create a graphical environment. Condor does not support graphical environments; it doesn't make sense to open up a user interface for a job that won't have a user directly looking at it. You may need some combination of -nosplash, -nodisplay, and possibly -nojvm to stop Matlab from creating a graphical environment. This should not be necessary for Matlab Compiler compiled jobs.

Unless you have special arrangements to use multiple CPU cores, you will want -singleCompThread so that Matlab only uses a single core.

-nosplash          Disable the splash screen. (No GUI support in Condor)
-nodisplay         Disable the GUI. (No GUI support in Condor)
-nojvm             Start without the Java virtual machine, which Matlab's GUI features rely on. (May start faster)
-singleCompThread  Only use one CPU core. (Play nicely when sharing a computer)

Parallelism

In the most common configuration, Condor does not directly support "parallel" jobs, jobs that might use a system like MPI or multiple threads to take advantage of multiple CPUs at once. Condor can launch jobs that use multiple processes or threads, but by default Condor will offer them a single CPU core to run on. (They may get lucky and be able to use more than one CPU core on a computer, but that should not be relied on.)

A local installation may provide additional options for parallel jobs, usually in the form of offering a job two or more CPUs on a single computer. Local administrators should be able to describe the available functionality.

Given the default configuration, it is usually better to break your work down into multiple independent jobs. For example, if you are processing 10,000 images, instead of a single Matlab job that processes them, perhaps you could have 10,000 jobs that each process 1 image. Condor is then able to schedule your jobs across multiple computers or at least multiple cores on a single computer, giving you the speed benefits of parallelism.
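One way to sketch the image example: queue many jobs from a single submit description and let each job pick its image from its process number. The file naming scheme (image_00000.png and so on), the wrapper name process_one.sh, and the log name sweep.log are assumptions for illustration, and the sketch assumes the images are reachable from the execute machines, for instance on a shared filesystem.

```shell
#!/bin/sh
# Sketch only: the image naming scheme, process_one.sh, and sweep.log are
# hypothetical names chosen for this example.

# image_for_job N
# Print the image file name assigned to job number N (zero-padded to 5 digits).
image_for_job() {
    printf 'image_%05d.png\n' "$1"
}

# print_sweep_submit COUNT
# Print a submit description that queues COUNT jobs. Each job receives its own
# process number (0 .. COUNT-1) as its only argument.
print_sweep_submit() {
    cat <<EOF
executable = process_one.sh
arguments = \$(Process)
output = image_\$(Process).out
error = image_\$(Process).err
log = sweep.log
queue $1
EOF
}

print_sweep_submit 10000
```

The wrapper named in the submit description would call image_for_job "$1" to decide which image to hand to Matlab. If each job is very short, processing a small batch of images per job may reduce scheduling overhead.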

To ensure that Matlab only uses one core, put this in your Matlab script:

lastN = maxNumCompThreads(1);

If you're using the Matlab Compiler, but want to use multiple threads when doing development, you could use something like this to limit Matlab to one thread only for compiled versions:

if isdeployed
    lastN = maxNumCompThreads(1);
end

Example

This example assumes that Matlab is available in /opt/Matlab/bin.

COMPLETELY UNTESTED

executable = /opt/Matlab/bin/matlab
arguments = -nodisplay -nojvm -singleCompThread
input = my-script.m
transfer_input_files = my-script.m
output = my-script.output
error = my-script.error
log = my-script.log
environment = "DISPLAY=:0.0 HOME=. MATLAB_PREF=."
queue

Additional Resources

Many other sites are using Matlab under Condor. Here are links to the documentation from just a few.