documentation

Slurm MPI examples

This example shows a job with 28 tasks and 14 tasks per node. This matches the normal nodes on Kebnekaise.

#!/bin/bash
# Example with 28 MPI tasks and 14 tasks per node.
#
# Project/Account (use your own)
#SBATCH -A hpc2n-1234-56
#
# Number of MPI tasks
#SBATCH -n 28
#
# Number of tasks per node
#SBATCH --tasks-per-node=14
#
# Runtime of this job is less than 12 hours.
#SBATCH --time=12:00:00

# Clear the environment from any previously loaded modules
module purge > /dev/null 2>&1

# Load the module environment suitable for the job
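
The relation between the task count and the tasks per node in the script above can be checked with a little shell arithmetic. This is only a sketch: the helper name nodes_needed is made up for illustration and is not part of Slurm.

```shell
#!/bin/bash
# nodes_needed is a hypothetical helper, not a Slurm command.
# It computes how many nodes a job occupies, given the total
# number of tasks and the number of tasks placed on each node.
nodes_needed() {
    ntasks=$1
    tasks_per_node=$2
    # Integer ceiling division: round up when the tasks
    # do not fill the last node completely.
    echo $(( (ntasks + tasks_per_node - 1) / tasks_per_node ))
}

nodes_needed 28 14   # prints 2
```

With 28 tasks and 14 tasks per node, the job in the example spans two nodes.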

Submit File Design

SLURM Submit File Design

To best use the resources with Slurm you need to have some basic information about the application you want to run.

Slurm will do its best to fit your job into the cluster, but you have to give it some hints about what you want it to do.

The parameters described below can be given directly as arguments to srun and sbatch.

If you don't give SLURM enough information, it will try to fit your job for best throughput (lowest possible queue time). This approach will not always give the best performance for your job.
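
As a rough sketch of giving the parameters as command-line arguments, the helper below assembles an sbatch command using the same options as the #SBATCH lines of a job script. The function name sbatch_command, the project ID and the script name job.sh are placeholders chosen for illustration. The command is only printed, so you can inspect it before running it yourself.

```shell
#!/bin/bash
# sbatch_command is a hypothetical helper that only prints a command line.
# Arguments: project, number of tasks, tasks per node, time limit, job script.
sbatch_command() {
    echo "sbatch -A $1 -n $2 --tasks-per-node=$3 --time=$4 $5"
}

# All values below are placeholders; use your own project and script.
sbatch_command hpc2n-1234-56 28 14 12:00:00 job.sh
# prints: sbatch -A hpc2n-1234-56 -n 28 --tasks-per-node=14 --time=12:00:00 job.sh
```

Options given on the sbatch command line take precedence over the matching #SBATCH lines inside the script.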

Tips, SLURM

Job status

  • To see only your own jobs, run
    $ squeue -u <username>
    

Interactive running

  • Using salloc, you get an interactive shell to run your jobs in once your nodes are allocated. This works like an interactive shell (-I) in PBS, including the fact that you cannot use the window while you wait, perhaps for a long time, for the job to start.
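
A minimal sketch of an interactive session might look like the following. The helper salloc_command, the project ID, the task count, the time limit and the program name ./my_program are all assumptions made for illustration. The helper only prints the request so it can be checked first.

```shell
#!/bin/bash
# salloc_command is a hypothetical helper that only prints a command line.
# Arguments: project, number of tasks, time limit.
salloc_command() {
    echo "salloc -A $1 -n $2 --time=$3"
}

# Placeholder values; use your own project and limits.
salloc_command hpc2n-1234-56 4 00:30:00
# prints: salloc -A hpc2n-1234-56 -n 4 --time=00:30:00

# Once the allocation is granted and the interactive shell starts,
# a program (here the made-up ./my_program) is typically launched with:
#   srun ./my_program
```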

SLURM commands and information

There are many more commands than the ones covered below, but these are the most commonly used. You can find more information on the SLURM homepage: SLURM documentation

You can run programs either by giving all the commands on the command line or by submitting a job script. If you ask for the resources on the command line, you have to wait for the program to finish before you can use the window again (unless you send it to the background with &).
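
The effect of sending a command to the background with & can be sketched without a cluster. Here the function long_task is a stand-in, assumed for illustration, for a blocking command such as an srun call. Its output goes to a temporary log file so it does not mix with the terminal.

```shell
#!/bin/bash
# long_task is a made-up stand-in for a command that blocks the shell.
long_task() {
    sleep 1
    echo "task finished"
}

log=$(mktemp)

# The trailing & starts the task in the background, so the prompt returns at once.
long_task > "$log" 2>&1 &
pid=$!
echo "The shell is free while process $pid runs"

# wait blocks until the background task is done; then show its output.
wait "$pid"
cat "$log"
```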

Batch systems

The Batch system

Once a parallel program has been successfully compiled, it can be run on multi-processor/multi-core computing nodes directly or, in a production environment, by means of a batch system. A batch system keeps track of the available system resources and takes care of scheduling the jobs of multiple users running their tasks simultaneously. It typically organizes submitted jobs into some sort of prioritized queue. The batch system is also used to enforce local policies for system resource usage and job scheduling.
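
The idea of a prioritized queue can be illustrated with a toy model. This is not how any real batch system computes priorities. The job names, the priority numbers and the helper next_job are invented for the example, and the only rule is that the highest number runs first.

```shell
#!/bin/bash
# Toy queue, one job per line as "<priority> <job name>". All values are made up.
queue="10 jobA
30 jobB
20 jobC"

# next_job is a hypothetical helper: sort the queue numerically on the
# priority column, highest first, and print the name of the first job.
next_job() {
    printf '%s\n' "$1" | sort -k1,1nr | head -n 1 | cut -d' ' -f2
}

next_job "$queue"   # prints jobB
```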

Updated: 2024-04-17, 14:47