The High End Computing (HEC) Cluster is a centrally-run service to support researchers and research students at Lancaster who require high performance and high throughput computing. This includes computing workloads with requirements that can't be met by the Interactive Unix Service (IUS) or desktop PCs.
The combined facility offers over 6,500 cores, 28TB of aggregate memory, 70TB of high performance filestore and 1.5PB of medium performance filestore. The service merges two previously separate services: one for local high performance computing (HPC) users and one for the local Particle Physics research group (GridPP).
The cluster operating system is Scientific Linux, with job submission handled by Son of Grid Engine (SGE). The service supports a wide variety of third-party software including numerical packages, libraries and C and Fortran compilers.
Using the HEC
The HEC has three basic components:
a login node, where users log in to submit jobs;
the compute nodes, which run those jobs; and
dedicated file systems, which share user and other files across the cluster.
From the login node, users create a batch job script which describes the tasks their job(s) are to perform, in a format similar to a Unix shell script. The batch job is then submitted to the SGE job scheduler, which allocates user jobs to free compute nodes. Job submission commands can be supplemented with additional information, such as requests for specific amounts of memory (for large memory jobs), or multiple nodes (in the case of parallel jobs).
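As a rough illustration of the batch job script described above, the sketch below shows what a minimal job might look like. The `#$` directive syntax and the `-S`, `-cwd`, `-N`, `-l` and `-pe` options are standard Grid Engine features, but the file name `example_job.sh`, the job name, the memory resource name `h_vmem` and the 8G figure are assumptions made for illustration; the resource names and parallel environments actually configured on the HEC may differ.

```shell
#!/bin/bash
# Hypothetical SGE batch job script (saved as, say, example_job.sh).
# Lines beginning with "#$" are read by the scheduler as submission options;
# to the shell they are ordinary comments, so the script also runs locally.

# Interpret the job script with bash.
#$ -S /bin/bash
# Run the job from the directory it was submitted from.
#$ -cwd
# Job name (hypothetical).
#$ -N example_job
# Memory request for a large memory job. The resource name h_vmem and the
# value 8G are assumptions; the cluster may use a different resource name.
#$ -l h_vmem=8G
# A parallel job would add a line of the form "#$ -pe <pe_name> <slots>",
# where the parallel environment name is site-specific.

# SGE sets JOB_NAME and NSLOTS inside a running job; fall back to
# placeholder values when the script is run outside the scheduler.
TASK_NAME="${JOB_NAME:-example_job}"
SLOTS="${NSLOTS:-1}"

echo "Job ${TASK_NAME} running on $(uname -n) with ${SLOTS} slot(s)"

# Submitting from the login node would then look like:
#   qsub example_job.sh
# and the job's state in the queue can be checked with:
#   qstat
```

Options given in the script can also be supplied on the `qsub` command line instead, which is convenient when the same script is reused with different memory requests.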
Login node: The login node is a 6-core virtual machine emulating Haswell architecture, with 48GB of memory.
Compute nodes: The compute nodes consist of Viglen HX420Ti chassis, each housing four servers, for a total of 440 nodes. Compute nodes are dual-socket hexa-core or octa-core. The memory size for a standard compute node is 4GB per core (48GB on a 12-core node, 64GB on a 16-core node), with a few nodes offering double that in order to support jobs with larger memory requirements.
File store: The primary file storage system is a 70TB Panasas ActiveStor Series 16 storage cluster. A series of Viglen HS424i storage nodes act as a secondary file system, providing 2PB of medium-performance filestore for the local GridPP initiative.
A number of statistical and numerical packages and libraries are installed in addition to Fortran 90, C and C++ compilers. Most software is accessed via environment modules.
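Access through environment modules typically follows the pattern sketched below. The `module avail`, `module load` and `module list` subcommands are standard in the Environment Modules tool, but the module name `example-package` is a hypothetical placeholder; the names actually installed on the HEC are the ones reported by `module avail`.

```shell
#!/bin/bash
# Hypothetical module name, for illustration only; substitute a name
# reported by "module avail" on the cluster.
MODULE_NAME="example-package"

if type module >/dev/null 2>&1; then
    # List the software made available through environment modules.
    module avail
    # Add the chosen package to the current shell environment.
    module load "$MODULE_NAME" || echo "Module $MODULE_NAME not found"
    # Show which modules are currently loaded.
    module list
    MODULES_PRESENT="yes"
else
    # The module command is not defined in this shell (for example,
    # when the script is run on a machine without Environment Modules).
    echo "The module command is not available in this shell"
    MODULES_PRESENT="no"
fi
```

The same `module load` line can be placed inside a batch job script so that the job runs with the same software environment as the interactive session.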