Examples, scripts (job submission files)

Example job submission files


'Job submission file' is the official SLURM name for the file you use to submit your program and request resources from the job scheduler. I will use it interchangeably with 'script'.

Just as commands to the batch scheduler are prefaced with #PBS in Torque/PBS job scripts, SLURM directives are prefaced with #SBATCH. You can also add normal shell commands to the script.

All SLURM directives can be given on the command line instead of in the script.
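
As a rough sketch of what that looks like, the snippet below builds an sbatch command line carrying the same requests that would otherwise be written as #SBATCH lines. The account name "myaccount" and the file name jobscript.sh are placeholders invented for this illustration; the snippet only prints the command, so it can be tried on a machine without SLURM.

```shell
#!/bin/bash
# Hypothetical names: "myaccount" stands in for <account>, and
# jobscript.sh for your own job submission file.
account="myaccount"

# The same requests as "#SBATCH -A", "-N", "-n" and "--time" lines,
# given on the command line instead.
cmd=(sbatch -A "$account" -N 2 -n 4 --time=00:30:00 jobscript.sh)

# Show the command; on a system with SLURM, run it with: "${cmd[@]}"
echo "${cmd[*]}"
```

If an option is given both on the command line and as a directive in the script, the command-line value is the one sbatch uses.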

Remember: the scripts, and all programs called by them, must be executable!

The examples below assume you are submitting the job from the same directory your program is located in - otherwise you need to give the full path.

A simple script

Asking for 1 node and 4 tasks, then running the program

#!/bin/bash
srun -A <account> -N 1 -n 4 ./my_program
Another way of asking for resources is to give the requirements as SLURM directives.

Asking for 2 nodes, with 4 tasks distributed across them, running for no longer than 30 minutes in the account <account>. Running the (non-MPI) program "my_program".

#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 2
#SBATCH -n 4
#SBATCH --time=00:30:00

srun ./my_program
Running two executables per node (two serial jobs).

The scripts job1.batch and job2.batch could be very simple scripts, only containing a line with "./my_program".
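
A minimal sketch of how those two wrapper scripts might be created and made executable is shown here. The stand-in my_program is invented for the illustration, and everything happens in a scratch directory; in practice you would write the files in your own job directory.

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is overwritten.
cd "$(mktemp -d)" || exit 1

# Hypothetical stand-in for your real program.
printf '#!/bin/bash\necho "my_program ran"\n' > my_program

# job1.batch and job2.batch: each just runs the program.
for script in job1.batch job2.batch; do
    printf '#!/bin/bash\n./my_program\n' > "$script"
done

# The wrappers and the program they call must be executable.
chmod u+x my_program job1.batch job2.batch

./job1.batch   # quick local check before submitting
```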

You then submit them both with a script like this.

#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 1
#SBATCH -n 2
#SBATCH --time=00:30:00 

# Use '&' to move the first job to the background
srun -n 1 ./job1.batch &
srun -n 1 ./job2.batch

# Use 'wait' as a barrier to collect both executables when they are done.
wait
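
The '&' and 'wait' pattern used above can be tried without SLURM. In the sketch below, the function run_task is a hypothetical stand-in for "srun -n 1 ./jobN.batch"; it sleeps for a second and then records that it finished.

```shell
#!/bin/bash
log=$(mktemp)

# Hypothetical stand-in for "srun -n 1 ./jobN.batch".
run_task() {
    sleep 1
    echo "$1 done" >> "$log"
}

run_task job1 &   # '&' sends the first task to the background
run_task job2     # the second task runs in the foreground, concurrently

wait              # returns only once the background task has also finished
cat "$log"
```

Without the final wait, the script could reach its end while the background task is still running, and a job ends when its script does.
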
Naming output/error files

Normally, SLURM produces one combined output file called slurm-<jobid>.out containing all output and errors from the run (any files the program itself writes are created as usual). If you want the output and errors in separate, renamed files, you can do something like this:

#!/bin/bash
#SBATCH -A <account> 
#SBATCH -N 2
#SBATCH -n 2
#SBATCH --time=00:05:00 
#SBATCH --error=job.%J.err 
#SBATCH --output=job.%J.out

srun ./my_program
Use a set of nodes exclusively for a job
#!/bin/bash
#SBATCH -A <account> 
#SBATCH -N 4
#SBATCH --exclusive
#SBATCH --time=00:05:00 

srun ./my_program
Running an MPI job

2 nodes, 48 processors, 1 hour, memory per cpu=4000 MB

#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 2
# use --exclusive to get the whole nodes exclusively for this job
#SBATCH --exclusive
#SBATCH --time=01:00:00
# This job needs 4 GB of memory per MPI task (= MPI rank = core).
# Since the nodes have 2500 MB of memory per core when all 48 cores
# are used, we have to use 2 nodes and only half the cores on each.
#SBATCH --mem-per-cpu=4000

module add openmpi/<compiler>

srun -n 48 ./mpi_largemem

Notes

  • Load any needed modules in the script, unless this was already done before the job was submitted (remember, SLURM exports the environment by default, unless you set --export=NONE)
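
The memory reasoning in the comments of the script above can be checked with a few lines of shell arithmetic. This is only a sketch: the numbers are the ones quoted in that example (48 cores per node, 2500 MB per core, 4000 MB per task, 48 tasks), and the variable names are made up for the illustration.

```shell
#!/bin/bash
cores_per_node=48        # cores on each node in the example
mem_per_core_mb=2500     # memory per core when all cores are used
mem_per_task_mb=4000     # value passed to --mem-per-cpu
tasks=48                 # value passed to srun -n

node_mem_mb=$(( cores_per_node * mem_per_core_mb ))          # 120000
max_tasks_per_node=$(( node_mem_mb / mem_per_task_mb ))      # 30
tasks_per_node=$(( cores_per_node / 2 ))                     # 24, "half the cores"
nodes=$(( (tasks + tasks_per_node - 1) / tasks_per_node ))   # 2

echo "Memory per node: ${node_mem_mb} MB"
echo "At most ${max_tasks_per_node} tasks of ${mem_per_task_mb} MB fit on a node"
echo "Using ${tasks_per_node} tasks per node needs ${nodes} nodes"
```

Running 24 tasks per node stays below the 30 that the memory would allow, so 48 tasks fit on 2 nodes.
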
An MPI job which reports various useful information as well

2 nodes, 24 processors spread over these two nodes.

#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 2
#SBATCH -n 24
# Spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH --time=35:15:00
# Want the nodes exclusively
#SBATCH --exclusive 

echo "Starting at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running on $SLURM_NPROCS processors."
echo "Current working directory is `pwd`"

srun ./mpi_bigmem
echo "Program finished with exit code $? at: `date`"
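
In the script above, $? is expanded directly after srun, so the message shows the program's exit code. The script itself then ends with echo, so its own exit status is that of echo. A possible variation, sketched below, saves the code in a variable so that it can be both reported and returned. The function run_program is a hypothetical stand-in for "srun ./mpi_bigmem" that fails with exit code 3.

```shell
#!/bin/bash
# Hypothetical stand-in for "srun ./mpi_bigmem" that fails with exit code 3.
run_program() { return 3; }

run_program
rc=$?    # save immediately; any later command overwrites $?
echo "Program finished with exit code $rc at: $(date)"

# In a real job script, finish with:  exit "$rc"
```
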
Running fewer MPI tasks than the cores you have available

 

#!/bin/bash 
# ask for 4 full nodes
#SBATCH -N 4
#SBATCH --exclusive       
# ask for 1 day and 3 hours of run time
#SBATCH -t 1-03:00:00   
# Account name to run under 
#SBATCH -A <account>
# a sensible name for the job
#SBATCH -J my_job_name  

# Run only 1 MPI task/process per node, even if the nodes have 48 cores.
# This assumes OpenMPI is being used.
srun -n 4 --ntasks-per-node=1 ./my_program
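
The run-time limits in these examples are written either as HH:MM:SS (for instance 00:30:00) or as D-HH:MM:SS (for instance 1-03:00:00). As a quick sanity check on such values, the sketch below converts them to seconds. The helper name to_seconds is invented for this illustration, and it handles only these two forms, not the other time formats SLURM accepts.

```shell
#!/bin/bash
# Convert "HH:MM:SS" or "D-HH:MM:SS" to seconds (hypothetical helper).
to_seconds() {
    local spec=$1 days=0 h m s
    if [[ $spec == *-* ]]; then
        days=${spec%%-*}    # part before the dash
        spec=${spec#*-}     # part after the dash
    fi
    IFS=: read -r h m s <<< "$spec"
    # 10# forces base 10, so values like "08" are not read as octal.
    echo $(( 10#$days * 86400 + 10#$h * 3600 + 10#$m * 60 + 10#$s ))
}

to_seconds 1-03:00:00   # 1 day and 3 hours
to_seconds 00:30:00     # 30 minutes
```
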
VASP serial
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -n 1
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you
# set --export=NONE)
module load vasp

# Change to the directory with your VASP input files (remember,
# SLURM jobs run in the directory the job is submitted from
# by default)
cd mydir

vasp
VASP MPI
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -N 4
#SBATCH -n 48
# spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you
# set --export=NONE)
module load vasp

# Change to the directory with your VASP input files (remember,
# SLURM jobs run in the directory the job is submitted from
# by default)
cd mydir

srun vasp.NGZhalf.mpi