
Running MATLAB PLS and SPM jobs on the cluster

Gabriel A. Devenyi edited this page Aug 14, 2018 · 5 revisions

Running MATLAB as batch jobs is the best way to use the cluster: compared to qlogin, batch jobs allow flexible scheduling and make it much easier to manage multiple jobs at the same time.

Your goal when writing MATLAB scripts for the cluster is to ensure your job requires no interactivity and no displays, for example by using batch_plsgui.

Furthermore, you should break up your work into the smallest independent pieces you can. For example, if you have a MATLAB script that performs 5 analysis runs, and no run needs the outputs of the others, you should split that script into 5 individual scripts.

Consider a simple PLS batch MATLAB script:

%mybatchjobs.m
%Setup a parallel pool to make matlab faster
matlabpool open local
%Run my analysis
batch_plsgui input1.mat
batch_plsgui input2.mat
batch_plsgui input3.mat
batch_plsgui input4.mat
batch_plsgui input5.mat

If each of the inputN.mat files is an independent run, you should instead create one file per run:

%mybatchjob1.m
%Setup a parallel pool to make matlab faster
number_of_cores=4; %4 for PLS, 3 for SPM
d=tempname(); %get a temporary location;
mkdir(d);
cluster = parallel.cluster.Local('JobStorageLocation',d,'NumWorkers',number_of_cores);
matlabpool(cluster,number_of_cores);
%Run my analysis
batch_plsgui input1.mat

%mybatchjob2.m
%Setup a parallel pool to make matlab faster
number_of_cores=4; %4 for PLS, 3 for SPM
d=tempname(); %get a temporary location;
mkdir(d);
cluster = parallel.cluster.Local('JobStorageLocation',d,'NumWorkers',number_of_cores);
matlabpool(cluster,number_of_cores);
%Run my analysis
batch_plsgui input2.mat

Repeat this for each remaining input. The extra temporary-directory code gives each job its own JobStorageLocation, which prevents MATLAB's parallel features from clobbering other running jobs.
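Writing several near-identical scripts by hand is error-prone, so the per-run files can also be generated with a short shell loop. The following is a minimal sketch, assuming five inputs named input1.mat through input5.mat and a PLS job using 4 cores; the loop is an illustration and not part of the original workflow.

```shell
#!/bin/sh
# Sketch: write one MATLAB script per input (mybatchjob1.m ... mybatchjob5.m).
# Assumption: PLS jobs, so 4 cores; set this to 3 for SPM jobs.
number_of_cores=4

for i in 1 2 3 4 5; do
  # The unquoted heredoc delimiter lets the shell substitute ${i}
  # and ${number_of_cores} into the generated MATLAB code.
  cat > "mybatchjob${i}.m" <<EOF
%mybatchjob${i}.m
%Setup a parallel pool to make matlab faster
number_of_cores=${number_of_cores};
d=tempname(); %get a temporary location
mkdir(d);
cluster = parallel.cluster.Local('JobStorageLocation',d,'NumWorkers',number_of_cores);
matlabpool(cluster,number_of_cores);
%Run my analysis
batch_plsgui input${i}.mat
EOF
done
```

Each generated file has the same contents as the hand-written scripts, differing only in the input file name.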

With the individual MATLAB scripts in place, we can submit them to the cluster using qbatch.

qbatch needs a job list: a file containing one MATLAB command-line call per line, each of which runs one of the scripts.

> cat myjobs
matlab -nodisplay -nosplash -nodesktop -r "run('./mybatchjob1.m')"
matlab -nodisplay -nosplash -nodesktop -r "run('./mybatchjob2.m')"
matlab -nodisplay -nosplash -nodesktop -r "run('./mybatchjob3.m')"
matlab -nodisplay -nosplash -nodesktop -r "run('./mybatchjob4.m')"
matlab -nodisplay -nosplash -nodesktop -r "run('./mybatchjob5.m')"

Each of these lines runs just the script for one analysis.
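The job list can likewise be generated instead of typed by hand. This is a minimal sketch, assuming the scripts are named mybatchjob1.m through mybatchjob5.m and sit in the current directory; it writes the same five lines shown in myjobs.

```shell
#!/bin/sh
# Sketch: build the qbatch job list, one MATLAB invocation per line.
# Assumption: scripts mybatchjob1.m ... mybatchjob5.m are in the current directory.
for i in 1 2 3 4 5; do
  printf '%s\n' "matlab -nodisplay -nosplash -nodesktop -r \"run('./mybatchjob${i}.m')\""
done > myjobs
```

Running `cat myjobs` afterwards is a quick way to check the quoting before submitting.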

Finally, we set up our environment and submit our job:

> module load MATLAB/R2012a
> module load PLS #or module load SPM12
> module load qbatch
> qbatch --options '-l matlab=1' --options ' -R y' --ppj <number of slots to request> myjobs

Please note the space before the -R in the above command, which is needed for qbatch to not interpret the option for itself.

For PLS jobs, the number of slots requested should be "4". For SPM jobs, the number of slots requested should be "3".

Please note that the -l matlab=1 option is essential for the cluster to manage the MATLAB licences. Omitting it can result in your jobs running out of licences and failing.
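To avoid mistyping the slot count, the submission command can be assembled from the job type. This is a small sketch under stated assumptions: the variable names job_type, slots and cmd are hypothetical, and the script only prints the command for review instead of submitting anything.

```shell
#!/bin/sh
# Sketch: pick the slot count for the job type and print the qbatch command.
# job_type, slots and cmd are illustrative names, not part of qbatch.
job_type=PLS   # set to SPM for SPM jobs

case "$job_type" in
  PLS) slots=4 ;;
  SPM) slots=3 ;;
  *)   echo "unknown job type: $job_type" >&2; exit 1 ;;
esac

# Note the space before -R, which keeps qbatch from interpreting the option itself.
cmd="qbatch --options '-l matlab=1' --options ' -R y' --ppj ${slots} myjobs"
echo "$cmd"
```

Copy the printed line into the terminal after loading the modules to submit the jobs.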
