
Array job examples

Easy example: input files are numbered

Let's say you want to run the program my_computation on a list of input files named input.1 to input.200.

Without array jobs you would submit it in a loop:

for i in `seq 1 200`; do qsub my_computation input.$i; done

With an array job you would do it like this:

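  • you write a job script (test_array.sh in this example) that runs the computation on one file; the scheduler sets $SGE_TASK_ID to the number of the current task: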
#!/bin/bash
#$ -N array_test
my_computation input.$SGE_TASK_ID
  • you submit that as an array job to the cluster:
qsub -t 1-200 test_array.sh
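
Before submitting, it can help to check locally which input file each task would pick up. The following is only a sketch: my_computation is replaced by a stand-in function and run_task is a hypothetical helper, neither is part of the cluster setup. It emulates the scheduler by setting SGE_TASK_ID by hand for a few task numbers.

```shell
#!/bin/bash
# Stand-in for the real program (assumption for illustration only).
my_computation() { echo "would process $1"; }

# Emulate one array task: in a real array job the scheduler sets
# SGE_TASK_ID; here we set it by hand.
run_task() {
    local SGE_TASK_ID=$1
    my_computation "input.$SGE_TASK_ID"
}

# Try the first two task IDs and the last one.
for id in 1 2 200; do
    run_task "$id"
done
```

Each line of output names the file that the task with that ID would process.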

An array job also lets you limit the number of tasks running in parallel. For example, to occupy at most 100 slots and kindly leave the remaining slots available for your colleagues, include the "tc" switch like this:

qsub -t 1-30000 -tc 100 analyze-30k-by-array.sh

Finally, you can also adjust the tc parameter of your array jobs at runtime, using the "qalter" command.
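
As a rough sketch of such a runtime adjustment: the helper below only prints the qalter command so that you can review it before running it. tc_command is a hypothetical helper and the job ID 12345 is a placeholder; the real ID is printed by qsub at submission time. It assumes your Grid Engine version accepts -tc with qalter.

```shell
#!/bin/bash
# Build the qalter command for a given job ID and new task limit.
# Nothing is executed against the cluster; the command is only printed.
tc_command() {
    local job_id=$1 limit=$2
    echo "qalter -tc $limit $job_id"
}

# Placeholder job ID; lower the limit to 50 concurrent tasks.
tc_command 12345 50
```

Once the printed command looks right, run it directly in the shell.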

More difficult: random filenames

If your filenames are not numbered, you have to work a little harder. First, put all filenames in a file (e.g. files_to_workon), one filename per line.
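
One possible way to create such a list is sketched below. The function make_file_list, the directory and the .dat extension are assumptions for illustration; adapt the pattern to your own files.

```shell
#!/bin/bash
# Write the path of every *.dat file in a directory to a list file,
# one per line. Shell globs expand in sorted order, so the mapping
# from line number to file is reproducible.
make_file_list() {
    local dir=$1 list=$2 f
    for f in "$dir"/*.dat; do
        # Skip the unexpanded pattern if no file matches.
        [ -e "$f" ] || continue
        printf '%s\n' "$f"
    done > "$list"
}
```

For example, make_file_list ./data files_to_workon would list all .dat files found in ./data.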

Then you have a script that runs my_computation on one of the files from files_to_workon depending on $SGE_TASK_ID:

#!/bin/bash
#$ -N array_test2
# Pick the line of files_to_workon whose number matches this task's ID.
FILENAME=$(sed -n "${SGE_TASK_ID}p" < files_to_workon)
my_computation "$FILENAME"
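
To see how the line lookup maps task numbers to filenames, you can try it on a small list outside the cluster. This is a sketch only: nth_line is a hypothetical helper and the sample names are made up.

```shell
#!/bin/bash
# Return line N of a list file, the same lookup the job script does.
nth_line() {
    sed -n "${1}p" < "$2"
}

# Small demo list with made-up names.
list=$(mktemp)
printf '%s\n' sample_a.txt sample_b.txt sample_c.txt > "$list"

# The task with ID 2 would get the second line.
nth_line 2 "$list"   # prints sample_b.txt
rm -f "$list"
```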

Now you can submit the array job just like the first one (here assuming files_to_workon lists 200 files):

qsub -t 1-200 test_array2.sh
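
If the number of files is not known in advance, the task range can be derived from the list itself. The helper task_range below is a hypothetical sketch, not something the cluster provides; it counts the lines of the list and prints a matching range.

```shell
#!/bin/bash
# Print a task range "1-N", where N is the number of lines in the list.
task_range() {
    local n
    n=$(wc -l < "$1")
    # Arithmetic expansion strips any padding that wc may add.
    echo "1-$((n))"
}
```

It could then be used as qsub -t "$(task_range files_to_workon)" test_array2.sh, so that every line of the list gets exactly one task.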