The High End Computing (HEC) Cluster is a centrally-run service to support researchers and research students at Lancaster who require high performance and high throughput computing. This includes computing workloads with requirements that can't be met by the Interactive Unix Service (IUS) or desktop PCs.
The service combines the previously separate services for local high performance computing (HPC) users and the local Particle Physics research group (GridPP). The combined facility offers 8,800 cores, 40 TB of aggregate memory, 70 TB of high-performance filestore for general use and 4 PB of medium-performance filestore for GridPP data.
The cluster operating system is CentOS Linux, with job submission handled by Son of Grid Engine (SGE). The service supports a wide variety of third-party software including numerical packages, libraries and C and Fortran compilers.
The HEC has three basic components:
a login node, where users log in to submit jobs;
the compute nodes, which run those jobs; and
dedicated file systems, which share user and other files across the cluster.
From the login node, users create a batch job script which describes the tasks their job(s) are to perform, in a format similar to a unix shell script. The batch job is then submitted to the SGE job scheduler, which allocates user jobs to free compute nodes. Job submission commands can be supplemented with additional information, such as requests for specific amounts of memory (for large-memory jobs) or for multiple nodes (in the case of parallel jobs).
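A batch job script of the kind described above might look like the following sketch. The directives, resource names and program name are illustrative only; the resource limits and parallel environments actually configured on the HEC may differ, so consult the service's own documentation before submitting.

```shell
# Write a minimal, hypothetical SGE batch job script.
# Lines beginning with "#$" are directives read by the SGE scheduler;
# to the shell they are ordinary comments.
cat > myjob.sh <<'EOF'
#!/bin/bash
#$ -cwd                # run the job in the directory it was submitted from
#$ -l h_vmem=4G        # example request for memory per core (illustrative)
#$ -m ea               # email the user when the job ends or aborts

./my_program input.dat # the work the job performs (hypothetical program)
EOF

# On the login node the script would then be submitted with:
#   qsub myjob.sh
echo "wrote myjob.sh"
```

SGE copies the script, so it can be edited or deleted after submission without affecting queued jobs.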
Login node: The login node is a 6-core virtual machine emulating Haswell architecture, with 48GB of memory.
Compute nodes: The compute nodes consist of 445 servers covering a variety of generations of Intel processor, offering a mixture of 16 cores (Ivy Bridge through to Broadwell architecture) or 40 cores (Skylake). The memory size for a standard compute node is 4 GB per core, with a few nodes offering double that in order to support jobs with larger memory requirements. Compute node network interconnects are 10 Gbit/s low-latency Ethernet.
File store: The primary file storage system is a 70 TB Panasas ActiveStor Series 16 Storage Cluster. A series of Viglen HS424i storage nodes act as a secondary file system, providing 4 PB of medium-performance filestore for the local GridPP initiative.
A number of statistical and numerical packages and libraries are installed in addition to Fortran 90, C and C++ compilers. Most software is accessed via environment modules.
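A typical environment-modules session on the login node is sketched below. The module name used here is an assumption for illustration; `module avail` lists what is actually installed. The guard lets the snippet run unchanged on machines where environment modules are not present.

```shell
# Typical environment-modules workflow (module name "gcc" is illustrative).
if command -v module >/dev/null 2>&1; then
  module avail          # list all software modules available on the system
  module add gcc        # load a module, e.g. a compiler toolchain
  module list           # show the modules currently loaded in this shell
else
  # fall through cleanly on machines without environment modules
  echo "environment modules are not installed on this machine"
fi
```

Loading a module adjusts PATH and related environment variables for the current shell only, so different jobs can use different software versions without conflict.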
Introduction to Linux by Robin Long
Unix Tutorial (generic unix) - from the University of California at Berkeley
Search for Unix books using the Lancaster University Library OneSearch facility