Batch System - Akka/Torque

The Batch system - Akka/Torque

 

Once a parallel program has been successfully compiled it can be run on multi-processor/multi-core computing nodes directly or, in production environment, by means of a batch system. Batch systems keeps track of available system resources and takes care of scheduling jobs of multiple users running their tasks simultaneously. It typically organizes submitted jobs into a three-part priority queue (running, idle, blocked). The batch system is also used to enforce local system resource usage and job scheduling policies.

The batch system on Akka is composed of two parts. Torque: a system resource manager (allocates and enforces limits on nodes, processors, memory, etc.), and Maui: a job scheduler (handles job scheduling policies). The jobs are scheduled according to a set of policy rules and priorities which gives the user access as fairly as possible with respect to allotted resources.