To best use the resources with Slurm you need to have some basic information about the application you want to run.
Slurm will do its best to fit your job into the cluster, but you have to give it some hints of what you want it to do.
The parameters described below can be given directly as arguments to srun and sbatch.
If you don't give Slurm enough information, it will try to fit your job for best throughput (lowest possible queue time). This approach will not always give the best performance for your job.
To get the best performance, you will need to know the following:
Some extra parameters that might be useful:
For basic examples of different job types, see the following pages:
Some applications may have special needs, in order to get them running at full speed.
Look at the application specific pages for more information about any such special requirements.
Some commonly used programs are listed below.
The account is your project id. This option is mandatory.
#SBATCH -A SNIC000-00-000
You can find your project id by running:
For most use cases, the number of tasks is the number of processes you want to start. The default value is one (1).
An example could be the number of MPI tasks or the number of serial programs you want to start.
#SBATCH -n 48
If your application is multithreaded (OpenMP/...), this number indicates the number of cores each task can use.
The default value is one (1).
#SBATCH -c 6
If you are running on Abisko, this should preferably be a multiple of six (the number of cores in one socket): 6, 12, 18, 24, 30, 36, 42, or 48 (the number of cores in a node). The reason is that you will be accounted for this number of cores (per task), as the smallest accountable unit on Abisko is one socket.
On Kebnekaise there are 28 cores per node.
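As a minimal sketch of how -n and -c combine for a hybrid job (for example MPI with OpenMP threads): the script below asks for 8 tasks with 6 cores each, 48 cores in total. It assumes the application reads OMP_NUM_THREADS, and the program name in the last comment is hypothetical. Slurm sets SLURM_CPUS_PER_TASK when -c is given; the fallback of 1 mirrors the default.

```shell
#!/bin/bash
#SBATCH -A SNIC000-00-000
# 8 tasks (for example MPI ranks) ...
#SBATCH -n 8
# ... each with 6 cores (one Abisko socket) for its threads
#SBATCH -c 6

# Slurm exports SLURM_CPUS_PER_TASK when -c is set; fall back to the
# default of one core per task if the variable is missing.
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Threads per task: ${OMP_NUM_THREADS}"

# Hypothetical launch line, replace with your own program:
# srun ./my_hybrid_app
```

The #SBATCH lines are ordinary comments to the shell, so they only take effect when the script is submitted with sbatch.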
If your application requires more than the maximum number of available cores in one node (48 for Abisko, 28 for Kebnekaise), it might be wise to set the number of tasks per node, depending on your job. This is the (minimum) number of tasks allocated per node.
Remember that the number of cores used on each node is the number of tasks per node times the number of cores per task. On Abisko this should preferably be a multiple of six (the number of cores in one socket): 6, 12, 18, 24, 30, 36, 42, or 48, as you will be accounted for this number of cores (per node). The reason is that one socket (6 cores) is the smallest accountable unit on Abisko.
There are 28 cores per node on Kebnekaise, so this is the maximum number of tasks per node for that system.
If you don't set this option, Slurm will try to spread the task(s) over as few available nodes as possible. For a 48-task job on Abisko, this can result in 42 tasks on one node and 6 on another. If you let Slurm spread your job, it is likely to start sooner, but its performance might suffer. If you are using more than 48 cores (Abisko) / 28 cores (Kebnekaise) and are unsure how your application behaves, it is probably a good idea to spread the tasks evenly over the required number of nodes.
There is no need to tell Slurm how many nodes your job needs. It will do the math.
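The arithmetic behind an even spread can be sketched in plain shell. The job size (72 single-core tasks on Abisko) is a hypothetical example; the script only prints the directives you would then put in your job script, using the --ntasks-per-node option of sbatch.

```shell
#!/bin/bash
# Hypothetical job: 72 single-core tasks on Abisko (48 cores per node).
ntasks=72
cpus_per_task=1
cores_per_node=48

# Largest number of tasks that fits on one node.
max_tasks_per_node=$(( cores_per_node / cpus_per_task ))

# Fewest nodes that can hold the job (integer ceiling division).
nodes=$(( (ntasks + max_tasks_per_node - 1) / max_tasks_per_node ))

# Even spread of the tasks over those nodes (ceiling again).
tasks_per_node=$(( (ntasks + nodes - 1) / nodes ))

echo "#SBATCH -n ${ntasks}"
echo "#SBATCH -c ${cpus_per_task}"
echo "#SBATCH --ntasks-per-node=${tasks_per_node}"
```

For these numbers the job needs two nodes, and the even spread is 36 tasks per node (a multiple of six) rather than 48 on one node and 24 on the other.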
|Abisko bigmem||10750 MB|
|Kebnekaise largemem||41666 MB|
Each core has a limited amount of memory available. If your job requires more memory than the default, you can allocate more cores for your task with (-c).
If, for instance, you need 5000MB/task on Abisko, set "-c 2".
# I need 2 x 2500MB (5000MB) of memory for my job on Abisko.
#SBATCH -c 2
This will allocate two (2) cores with 2500MB each. If your code is not multi-threaded (using only one core per task) the other one will just add its memory to your job.
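The same calculation can be written as a small shell sketch for other memory requirements. The 2500 MB per core figure is the Abisko value from the example above; the required amount is a hypothetical input you would replace with your own.

```shell
#!/bin/bash
# Memory per core on a regular Abisko node, as in the example above.
mem_per_core_mb=2500
# Hypothetical memory requirement of the application, per task.
mem_needed_mb=5000

# Integer ceiling division: the smallest number of cores whose
# combined memory covers the requirement.
cores=$(( (mem_needed_mb + mem_per_core_mb - 1) / mem_per_core_mb ))

echo "#SBATCH -c ${cores}"
```

With 5000 MB needed this prints "#SBATCH -c 2". A requirement of 6000 MB would round up to three cores.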
If your job requires more than 120000MB / node on Abisko, there is a limited number of nodes with 512000MB memory, accessible by selecting the bigmem partition of the cluster. You do this by setting: -p bigmem.
#SBATCH -p bigmem
If your job requires more than 126000MB / node on Kebnekaise, there is a limited number of nodes with 3072000MB memory, which you may be allowed to use (contact firstname.lastname@example.org). They are accessed by selecting the largemem partition of the cluster. You do this by setting: -p largemem.
#SBATCH -p largemem
If you know the runtime (wall clock time) of your job, it is beneficial to set this value as accurately as possible.
Jobs with a shorter runtime limit are more likely to fit into slots of unused space and so tend to start sooner.
Note: Please add some extra time to account for variations in the system.
The maximum allowed runtime of any job is seven (7) days.
The format is:
D-HH:MM:SS (D=Day(s), HH=Hour(s), MM=Minute(s), SS=Second(s))
# Runtime limit 2 days, 12 hours
#SBATCH --time 2-12:00:00
You can also use the --time-min option to set a minimum time for your job.
If you use this, Slurm will try to find a slot with more than --time-min and less than --time. This is useful if your job writes periodic checkpoints and can restart from the latest one. This technique can be used to fill openings in the system that no big jobs can fill, and so allows for better throughput of your jobs.
# Runtime limit 2 days, 12 hours
#SBATCH --time 2-12:00:00
#
# Minimum runtime limit 1 day, 12 hours
#SBATCH --time-min 1-12:00:00
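If you think of runtimes in hours, the D-HH:MM:SS format is easy to get wrong by hand. The sketch below converts whole hours to that format; to_slurm_time is a hypothetical helper, not a Slurm command, and the 60 and 36 hour values are just the limits from the example above. Note that sbatch spells the minimum-time option --time-min.

```shell
#!/bin/bash
# Convert a runtime in whole hours to Slurm's D-HH:MM:SS format.
# to_slurm_time is a hypothetical helper, not part of Slurm.
to_slurm_time() {
    local hours=$1
    printf '%d-%02d:00:00\n' $(( hours / 24 )) $(( hours % 24 ))
}

limit=$(to_slurm_time 60)     # upper limit: 60 hours
minimum=$(to_slurm_time 36)   # shortest acceptable slot: 36 hours

echo "#SBATCH --time=${limit}"
echo "#SBATCH --time-min=${minimum}"
```

This prints a --time of 2-12:00:00 and a --time-min of 1-12:00:00.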
It is possible to set the number of nodes that Slurm should allocate for your job.
This should only be used together with --ntasks-per-node or with --exclusive.
In almost every case, however, it is better to let Slurm calculate the number of nodes required for your job from the number of tasks, the number of cores per task, and the number of tasks per node.
The output (stdout) and error (stderr) output from your program can be collected with the help of the --output and --error options to sbatch.
# Send stderr of my program into <jobid>.error
#SBATCH --error=%J.error
# Send stdout of my program into <jobid>.output
#SBATCH --output=%J.output
The files in the example will end up in the working directory of your job.
Slurm can send mail to you when certain event types occur. Valid type values are: BEGIN, END, FAIL, REQUEUE, and ALL (any state change).
# Send mail when job ends
#SBATCH --mail-type=END
Note: We recommend that you do NOT include a command for the batch system to send an email when the job has finished, particularly if you are running large numbers of jobs. The reason for this is that many mail servers have a limit and may temporarily block accounts (or domains) that send too many mails. Instead use
scontrol show job <jobid>
squeue -l -u <username>
to see the status of your job(s).
--exclusive can be used with -N (number of nodes) to get all the cores, and memory, on the node(s) exclusively for your job.
# Request complete nodes
#SBATCH --exclusive
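A whole-node request could look like the sketch below, which combines -N with --exclusive and the output options described earlier. The node count, the one-hour limit, and the program name in the last comment are hypothetical. SLURM_JOB_ID and SLURM_JOB_NUM_NODES are set by Slurm inside a job; the fallbacks only matter if the script is run outside the batch system.

```shell
#!/bin/bash
#SBATCH -A SNIC000-00-000
# Two complete nodes, not shared with other jobs
#SBATCH -N 2
#SBATCH --exclusive
# Runtime limit 1 hour
#SBATCH --time=0-01:00:00
#SBATCH --output=%J.output
#SBATCH --error=%J.error

# Set by Slurm inside a job; fallbacks are used outside the batch system.
job_id="${SLURM_JOB_ID:-none}"
num_nodes="${SLURM_JOB_NUM_NODES:-0}"
echo "Job ${job_id} has ${num_nodes} node(s)"

# Hypothetical launch line, replace with your own program:
# srun ./my_program
```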