HPC2N - Support: The Batch system

The Batch system

 

Once a parallel program has been compiled successfully, it can be run on multi-processor or multi-core compute nodes either directly or, in a production environment, through a batch system. The batch system keeps track of available system resources and schedules the jobs of multiple users who run their tasks simultaneously. It typically organizes submitted jobs into a three-part priority queue (running, idle, blocked). The batch system is also used to enforce local policies on resource usage and job scheduling.
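To make the three-part queue concrete, here is a minimal Python sketch of how jobs might be sorted into running, idle, and blocked. It is an illustration only, not how any real batch system is implemented: the Job fields, the "held" flag, and the classify function are hypothetical names invented for this example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    cores: int          # cores the job requests
    priority: int       # higher value is considered first
    held: bool = False  # hypothetical flag: job is on hold or violates a policy


def classify(jobs, free_cores):
    """Split jobs into running, idle, and blocked lists (toy model)."""
    queues = {"running": [], "idle": [], "blocked": []}
    # Walk the jobs from highest to lowest priority.
    for job in sorted(jobs, key=lambda j: j.priority, reverse=True):
        if job.held:
            # Blocked jobs are not eligible to start at all.
            queues["blocked"].append(job.name)
        elif job.cores <= free_cores:
            # Enough free cores: the job starts and claims them.
            free_cores -= job.cores
            queues["running"].append(job.name)
        else:
            # Eligible, but must wait for resources.
            queues["idle"].append(job.name)
    return queues


jobs = [
    Job("a", cores=8, priority=10),
    Job("b", cores=8, priority=5),
    Job("c", cores=4, priority=7, held=True),
    Job("d", cores=4, priority=1),
]
result = classify(jobs, free_cores=12)
print(result)
```

In this toy model a small low-priority job ("d") may start ahead of a larger waiting job ("b") when it fits into the remaining cores; whether a real scheduler allows that depends on its configured policy.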

At HPC2N the batch system on Akka consists of two parts: Torque, a system resource manager (it allocates nodes, processors, memory, etc., and enforces limits on them), and Maui, a job scheduler (it handles job scheduling policies). Jobs are scheduled according to a set of policy rules and priorities that give each user access that is as fair as possible with respect to allotted resources.
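The idea of scheduling "as fairly as possible with respect to allotted resources" can be sketched in Python as follows. This is not Maui's actual priority formula; the function fair_share_priority, its weight parameter, and the user names and numbers are all assumptions made up for illustration.

```python
def fair_share_priority(base, used_hours, allotted_hours, weight=100.0):
    """Hypothetical priority: drops as a user consumes more of their allocation."""
    usage_fraction = used_hours / allotted_hours
    return base - weight * usage_fraction


# Two users with the same base priority and allocation but different usage.
queue = [
    ("alice", fair_share_priority(50, used_hours=900, allotted_hours=1000)),
    ("bob", fair_share_priority(50, used_hours=100, allotted_hours=1000)),
]

# Jobs from the user with the higher resulting priority are considered first.
order = [user for user, prio in sorted(queue, key=lambda item: item[1], reverse=True)]
print(order)
```

Here the user who has consumed less of the allocation is placed first, which is the general intent of a fairness policy; the real weighting at a given site is set by its administrators.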

The new cluster, Abisko, runs SLURM. Read more about that here.