I want to run several instances of MATLAB without using a parfor loop. The structure of my code is the following:
if k == 1
% Set some parameters here
elseif k == 2
% Set some other parameters here
...
elseif k == 10
% Set some other parameters here
end
Is there an efficient way of opening 10 instances of MATLAB, where each instance runs with a given value of k?
I know that on a cluster with Slurm I could use Slurm arrays, i.e. I could add this to the beginning of the MATLAB code:
k = str2num(getenv('SLURM_ARRAY_TASK_ID'));
And then just do a batch submit. Is there anything similar that I could do on a normal computer?
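For illustration, a minimal sketch of what the top of such a script could look like (the fallback value when the variable is unset is an assumption, added so the same file also runs interactively):

% Minimal sketch (assumption): read k from the environment, fall back to a default
task_id = getenv('SLURM_ARRAY_TASK_ID');
if isempty(task_id)
    k = 1;                 % not launched as an array job; pick a default
else
    k = str2num(task_id);  % Slurm (or anything else) has set the variable
end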
In Linux, you could let a bash script write out MATLAB scripts which can then be executed in parallel. You can just use the ampersand (&) for that after each MATLAB call, but the GNU parallel software is better: you can then specify how many jobs will run in parallel.
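As a minimal sketch of the plain-ampersand variant (my_script.m and the way k is passed via -r are assumptions about your setup):

#!/bin/bash
# Sketch: launch one MATLAB per parameter value in the background, all at once
for k in $(seq 1 10); do
    matlab -nodesktop -nosplash -r "k=${k}; my_script; exit" &
done
wait   # block until every background MATLAB instance has finished

With &, all ten instances start at once; the parallel-based script below lets you cap how many run simultaneously.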
This bash script
#!/bin/bash
# command line argument: how many scripts (jobs) in parallel?
if [[ ${1} == "" ]]; then
echo "${0} needs a parameter: N == how many scripts are made 0,1,2 ..."
exit 1
fi
N=${1};
echo "creating and running ${N} scripts ..."
# some constants
c_dir=$(pwd)
ml_ex=$(which matlab)
# create the scripts
for (( i=1; i <= ${N}; i++ )); do
cat << EOF > ${c_dir}/script${i}.m
a = ones(${i}) * ${i}
EOF
done
# list them, then pass this list to parallel
for f in ${c_dir}/script*.m; do
echo "${ml_ex} < $f"
done | parallel -j ${N};
# tidy up
rm -f ${c_dir}/script*.m
makes N MATLAB scripts (N is the command-line parameter) and executes them in MATLAB in parallel. Each script should show an MxM matrix filled with the number M (for M = 1, 2, ..., N). So the command runsN.sh 5 runs 5 copies of MATLAB at the same time.
Using ${ml_ex} -nodesktop -nosplash instead of plain ${ml_ex} in the script shows more clearly what happens. I have an alias so that I always use those options.
This may be worth trying if you have a number of time-consuming, not very resource-demanding, completely independent jobs. I have used it for image processing.
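Adapted to the question, the heredoc could generate scripts that simply set k and then call the existing code; a sketch, where my_code.m stands in for whatever script contains the if k == 1 ... end block:

# Hypothetical variant of the heredoc in the script above
cat << EOF > ${c_dir}/script${i}.m
k = ${i};   % this instance's parameter index
my_code     % placeholder for the script that uses k
EOF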
If you use GNU parallel, you can get a setup similar to using Slurm on a cluster:
parallel -j 4 'export SLURM_ARRAY_TASK_ID={} ; matlab [...] my_script.m' ::: {1..10}
Here,
-j 4 : run a maximum of 4 "jobs" at a time
export SLURM_ARRAY_TASK_ID={} : create a SLURM_ARRAY_TASK_ID variable to fool the script
matlab [...] my_script.m : invoke Matlab with its arguments and the script
::: {1..10} : Bashism to express 1, 2, ..., 10
By setting the SLURM_ARRAY_TASK_ID variable explicitly in the command launched by parallel, you can use the same Matlab script both on the cluster and on your local workstation.
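For the question's use case the full command might look roughly like this (the -r invocation and script name are assumptions; adjust to however you normally start MATLAB non-interactively):

# Sketch: 10 parameter values, at most 4 MATLAB instances at a time
parallel -j 4 'export SLURM_ARRAY_TASK_ID={} ; matlab -nodesktop -nosplash -r "my_script; exit"' ::: {1..10}

where my_script.m begins with k = str2num(getenv('SLURM_ARRAY_TASK_ID'));.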
GNU Parallel offers many options to manage, limit, throttle, or even dispatch "jobs" to other machines.
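For example, a sketch combining a few of those options (check man parallel for the exact semantics on your version):

# Sketch: log each run, pause between launches, and hold back new jobs while the load is high
parallel --joblog runs.log --delay 5 --load 80% -j 4 \
    'export SLURM_ARRAY_TASK_ID={} ; matlab [...] my_script.m' ::: {1..10}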