The HEC currently hosts versions of the COMSOL Multiphysics modelling suite, with licenses supplied by the Engineering (for version 5.2a) and Physics (for version 5.1) departments. Please check with the relevant department's IT Liaison to ensure you have access to these externally hosted licenses.
Serial COMSOL batch jobs can be run by creating a batch job control script (for example, comsol_job.com) like the following:
    #$ -S /bin/bash
    #$ -q serial
    #$ -l node_type=10Geth*
    #$ -l h_vmem=25G

    source /etc/profile
    module add comsol/5.2a

    comsol batch -prefsdir $TMPDIR -recoverydir `pwd` \
      -tmpdir $TMPDIR \
      -inputfile micromixer_cluster.mph -outputfile out.mph
For an introduction to serial batch jobs on the HEC, please refer to Submitting jobs on the HEC.
The above job takes the model described in micromixer_cluster.mph and places the results in out.mph. The model used in the above template is the cluster test model provided with the COMSOL installation.
The -prefsdir and -tmpdir options are set to point to a temporary directory created during the job run. This is recommended because, by default, COMSOL writes a large number of files to the user's home area, which can quickly exceed the home area quota.
The -recoverydir option sets the recovery directory to the current working directory from which the job was submitted.
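The directory options described above can be assembled by a small helper, sketched below. The function name comsol_batch_cmd is hypothetical (it is not part of COMSOL or the HEC environment), and the fallback to /tmp when TMPDIR is unset is an assumption for dry runs outside the batch system. The helper only prints the command line, so that it can be checked before it is placed in a job script.

```shell
#!/bin/bash
# Hypothetical helper (not part of COMSOL or the HEC): prints the batch
# command line so that it can be inspected before use in a job script.
comsol_batch_cmd() {
    local input="$1" output="$2"
    # Inside a job the scheduler provides TMPDIR; the /tmp fallback is an
    # assumption for dry runs outside the batch system.
    local scratch="${TMPDIR:-/tmp}"
    printf 'comsol batch -prefsdir %s -recoverydir %s -tmpdir %s -inputfile %s -outputfile %s\n' \
        "$scratch" "$(pwd)" "$scratch" "$input" "$output"
}

comsol_batch_cmd micromixer_cluster.mph out.mph
```

Only the options already used in the serial template appear here; the preferences and temporary files go to the scratch directory, while the recovery directory is the current working directory.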
For models suitable for parallel running, the following parallel job template can be used:
    #$ -S /bin/bash
    #$ -q parallel
    #$ -l node_type=10Geth*
    #$ -l nodes=4,ppn=2,tpp=8

    source /etc/profile
    module add comsol/5.2a

    comsol batch -nn $ARC_SGE_NP -np $ARC_SGE_TPP \
      -prefsdir $TMPDIR -recoverydir `pwd` -tmpdir $TMPDIR \
      -inputfile micromixer_cluster.mph -outputfile out.mph
For an introduction to parallel MPI jobs on the HEC, please see Using the Message Passing Interface (MPI) on the HEC.
It is recommended to run COMSOL in hybrid parallel mode, with one MPI process per CPU socket and the number of threads per process matching the number of CPU cores per socket. The above example requests four compute nodes of node_type 10Geth*. Each of these nodes has two sockets with 8 cores per socket, giving a job resource request string of nodes=4,ppn=2,tpp=8. For larger jobs, increase the number of nodes requested while keeping the ppn (processes per node) and tpp (threads per process) values the same.
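The sizing rule can be sketched as a short calculation. The helper names resource_request and total_cores are hypothetical and exist only for illustration; the ppn and tpp values are those given above for the 10Geth* nodes and would need to change for other node types.

```shell
#!/bin/bash
# Values for the 10Geth* nodes described above: two sockets per node and
# eight cores per socket.
PPN=2   # MPI processes per node (one per socket)
TPP=8   # threads per process (one per core in a socket)

# Hypothetical helper: prints the request string for a given node count.
resource_request() {
    local nodes="$1"
    echo "nodes=${nodes},ppn=${PPN},tpp=${TPP}"
}

# Hypothetical helper: total cores used = nodes x ppn x tpp.
total_cores() {
    local nodes="$1"
    echo $(( nodes * PPN * TPP ))
}

resource_request 4   # prints nodes=4,ppn=2,tpp=8
total_cores 4        # prints 64
```

Doubling the job size to eight nodes changes only the first value in the request string, and the total core count rises to 128.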
The additional -nn and -np arguments used when invoking comsol take their values from $ARC_SGE_NP and $ARC_SGE_TPP, which reflect the job's resource request, and pass them to COMSOL's parallel job launcher.
There are two important points to bear in mind when running COMSOL in parallel, to ensure that resources are not wasted:
- Significant speedup is only seen on models with several million degrees of freedom; and
- The meshing algorithm runs mostly in serial. For parallel runs, the mesh should be pre-computed in serial mode.