For workloads such as Monte Carlo simulations and parameter studies, it is often necessary to run the same program many times, each time with slightly different input parameters. Rather than creating a unique job file for each run and submitting each of them separately, you can use SGE's array job option. When combined with a tailored job script, this allows you to submit multiple similar jobs with a single command.
Array jobs can be submitted by adding the -t directive to the job script, as in the following example:
    #$ -S /bin/bash
    #$ -q serial
    #$ -N myjob
    #$ -t 4-10:2
    source /etc/profile
    echo Job task $SGE_TASK_ID
    ./my_program < input.$SGE_TASK_ID.dat
This submits the job script myjob.com as a number of tasks, each task having its own unique index number. This index number can be used by the job script to perform slightly different actions each time, e.g. reading from a different input file (as in the above example), or passing a different set of parameters to your program for each task.
The number of tasks and the values of the task index numbers are controlled by the arguments following the -t directive. The format is x-y:z, where x is the first index number, y is the last, and the optional :z gives the step increment. The above example submits the job script 4 times, with index numbers of 4, 6, 8 and 10 (i.e. first = 4, last = 10, step = 2).
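To check which index numbers a given range will produce before submitting, the x-y:z arithmetic can be reproduced with seq. The following is a minimal sketch; expand_range is a hypothetical helper written for illustration, not an SGE command.

```shell
#!/bin/bash
# List the task index numbers produced by a "-t first-last:step" range.
# expand_range is a hypothetical helper, not part of SGE.
expand_range() {
    local first=$1 last=$2 step=${3:-1}   # step defaults to 1, as with -t
    # seq prints one number per line; the unquoted substitution joins
    # them with single spaces so they appear on one line.
    echo $(seq "$first" "$step" "$last")
}

expand_range 4 10 2    # the range used in the example above
```

Run on its own, this prints the four index numbers 4 6 8 10, matching the tasks described above.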
Index numbers must always be positive integers.
The index number is available to the job script via the environment variable $SGE_TASK_ID, and can be used by the job script to alter what exactly is run for each task. In the above example, it is used to change the input file sent to the user application my_program. Successive tasks will read input.4.dat, input.6.dat, input.8.dat and input.10.dat.
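The other use mentioned earlier, passing a different set of parameters to each task, can be sketched by keeping one parameter set per line in a text file and selecting the line whose number equals the task index. The helper params_for_task and the file name params.txt are assumptions made for this illustration. With this layout the array range should match the line numbers in the file, for example -t 1-3 for a three-line file.

```shell
#!/bin/bash
# Select the parameter line that matches a task's index number.
# params_for_task and params.txt are hypothetical names for illustration.
params_for_task() {
    local file=$1 task_id=$2
    # sed -n 'Np' prints only line N of the file
    sed -n "${task_id}p" "$file"
}

# Inside a job script one might then write:
#   params=$(params_for_task params.txt "$SGE_TASK_ID")
#   ./my_program $params

# Small demonstration using a throwaway parameter file
demo=$(mktemp)
printf '%s\n' "--alpha 0.1" "--alpha 0.2" "--alpha 0.3" > "$demo"
params_for_task "$demo" 2     # prints the second line of the file
rm -f "$demo"
```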
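The standard output and standard error files for each task will be unique; by default SGE names them using the job name, the job ID, and the task ID, in the form jobname.ojobid.taskid for standard output and jobname.ejobid.taskid for standard error (for example, myjob.o204.4 and myjob.e204.4).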
Once you get the hang of writing flexible job scripts, job arrays make job submission much easier. They also make job management easier. All tasks within the same job are given a different index number, but all have the same job ID. An example of qstat output for an array job is given below. Each task's index number appears in the ja-task-ID field, and queued tasks are shown using the same range format as the -t directive:
    job-ID  prior  name   user      state  submit/start at      queue           slots  ja-task-ID
    ---------------------------------------------------------------------------------------------
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  1
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  2
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  3
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  4
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  5
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  6
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  7
       204  0.500  myjob  testuser  r      07/31/2013 10:48:02  serial@comp005      1  8
       204  0.500  myjob  testuser  qw     07/31/2013 10:48:01                      1  9-100:1
Note that tasks still queued and waiting are listed together on a single line. In the above example, tasks 1 through 8 are running, while tasks 9 through 100 are still waiting to run. If you want to stop all the tasks of job ID 204 at once, you can use the normal qdel command:
    qdel 204
For large job arrays, it may take several minutes to kill all tasks.
If you want to stop individual tasks, you can suffix the job ID with the individual task ID. To stop just task ID 4 of job ID 204 above, we do:
    qdel 204.4
To stop task IDs 1-3 of job ID 204:
    qdel 204.1-3
Care should be taken to avoid very short jobs (on the order of a few seconds to a few minutes), as these make very inefficient use of the cluster. It takes the system several seconds both to start and to finish a job, and the scheduler itself works on 15-second cycles. Very short jobs therefore end up causing a lot of idle time on the system. To avoid this, consider bunching several short tasks together into a single job array element.
The example below gives a template for this type of solution. A job array that originally consisted of 10,000 individual tasks, each of which ran for only a few seconds, has been converted into one containing just 10 tasks. Each task contains a loop that executes the next 1,000 runs in sequence, starting from the task ID it receives:
    #$ -S /bin/bash
    #$ -q serial
    #$ -N myjob
    #$ -t 1-9001:1000
    source /etc/profile
    echo Value received: $SGE_TASK_ID
    x=$SGE_TASK_ID
    y=$(($SGE_TASK_ID+$SGE_TASK_STEPSIZE-1))
    echo Running $x to $y
    for z in $(seq $x $y); do
        echo Running task $z
        myprogram < input.$z.data > output.$z.data
    done
How it works:
The -t directive sets up the job array to run from 1 to 9001 in steps of 1000. This results in 10 separate tasks, with SGE_TASK_ID taking the values 1, 1001, 2001, and so on up to 9001. The shell variable x is set to the SGE_TASK_ID value, and the shell variable y is set to the last value in the set of runs for this task, using simple shell arithmetic and the SGE_TASK_STEPSIZE variable, which SGE automatically sets to the step size of the current array (1000 in this case). The for loop sets the shell variable z to each value between x and y in turn, using the standard Unix tool seq. Each iteration of the loop runs myprogram with its own input and output files, named after the value of z.
Note that while this example still produces 10,000 output files named output.$z.data for all values of z between 1 and 10,000, there are only 10 job tasks, so there will only be 10 stdout and stderr files.
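The template above assumes that the total number of runs is an exact multiple of the step size. When it is not, the last task would loop past the final input file. One possible way to guard against this is to cap y at the total number of runs. The sketch below assumes a hypothetical helper chunk_bounds and an illustrative total of 9500; neither is provided by SGE.

```shell
#!/bin/bash
# Compute the first and last run numbers handled by one array task,
# capping the last chunk at the total number of runs.
# chunk_bounds and its "total" argument are hypothetical, not part of SGE.
chunk_bounds() {
    local start=$1 step=$2 total=$3
    local end=$(( start + step - 1 ))
    if [ "$end" -gt "$total" ]; then
        end=$total              # the final chunk may be shorter than the step
    fi
    echo "$start $end"
}

# Outside SGE these variables are unset, so fall back to demonstration values.
task_id=${SGE_TASK_ID:-9001}
step=${SGE_TASK_STEPSIZE:-1000}

# 9500 is an illustrative total number of runs
read -r x y <<< "$(chunk_bounds "$task_id" "$step" 9500)"
echo "Running $x to $y"
```

In a job script, x and y computed this way could replace the two assignments in the template, leaving the for loop unchanged.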