OpenMP is a set of compiler directives, library routines and environment variables that can be used to specify shared-memory parallelism in C, C++ and Fortran codes.
OpenMP compiler directives can be inserted into source code to indicate to the compiler which sections of code can be readily parallelised, allowing a programmer to highlight the core sections of code that can benefit from parallelism. When compiled with a special compiler flag, the compiler will create a multi-threaded version of the application that automatically distributes the highlighted parallel sections across different CPUs on the same node. On the current HEC, this offers up to sixteen-way parallelism.
It should be stressed that not all codes will benefit from such attempts at parallelism, and not all sections of code can be parallelised. Users should test serial and parallel versions of their code to ensure that the parallel version is making good use of the additional processors.
A detailed explanation of the OpenMP compiler directives can be found in the guides for both the PGI and Intel compiler suites (see the Further advice panel for more information).
To compile OpenMP code using the PGI compiler, compile with your normal set of PGI compiler flags, and add the compiler argument -mp.
To compile OpenMP code using the Intel compiler, compile with your normal set of Intel compiler flags, and add the compiler argument -openmp.
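For example, assuming a single-file program (the file name omptest.c is hypothetical), the two compile lines would look like this:

```shell
# PGI compiler suite: usual flags plus -mp
pgcc -O2 -mp omptest.c -o omptest

# Intel compiler suite: usual flags plus -openmp
icc -O2 -openmp omptest.c -o omptest
```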
Make sure that the correct module for your preferred compiler suite has already been added to your environment.
The following job template will run an 8-core version of the program omptest compiled with the Intel compilers:
#$ -S /bin/bash
#$ -q parallel
#$ -l np=8
#$ -l h_vmem=1G

source /etc/profile
module add intel

./omptest
There are three directives of particular importance when specifying OpenMP jobs:
Queue selection

#$ -q parallel
Standard OpenMP jobs should be submitted to the parallel queue.
Job size selection (number of cores)
#$ -l np=8
This line specifies the number of cores the job requires. As OpenMP codes require all processes to run on the same compute node, the np= syntax ensures that all job slots for the job are on the same node. As a result, the value should never be greater than the maximum number of cores on the largest compute node – this is currently 16. In the above example, 8 cores have been selected. The scheduler will launch the job with the OMP_NUM_THREADS environment variable set to the correct value (8), allowing the omptest OpenMP application to launch the correct number of threads.
The number of cores requested can be smaller than the node's total core count; the remaining (unused) cores on the node will be made available to other jobs.
Memory size selection
#$ -l h_vmem=1G
This specifies a memory resource request for the job. Note that the requested amount is per core, so the above example requests a total of 8G of memory (8 cores x 1G per core).
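As a worked example of the per-core accounting (values hypothetical): a job needing roughly 16G in total across 8 cores would request 2G per core, not 16G:

```shell
# 8 cores x 2G per core = 16G of memory in total on the node
#$ -l np=8
#$ -l h_vmem=2G
```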