'Job submission file' is the official SLURM name for the file you use to submit your program and ask for resources from the job scheduler. Here we will be using it interchangeably with 'script' or 'batch script'.
Commands to the batch scheduler are prefaced with #SBATCH; these are also called directives. You can also add normal shell commands to the script.
All SLURM directives can be given on the command line instead of in the script.
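For example, the same resources requested by #SBATCH lines in a script could instead be passed to sbatch on the command line (the script name and account here are placeholders):

```bash
# Options given on the command line override any corresponding
# #SBATCH directives inside the script
sbatch -A <account> -n 4 --time=00:30:00 my_jobscript
```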
Remember: the scripts, and all programs called by them, must be executable!
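Execute permission is set with chmod; a minimal sketch, where my_jobscript is a placeholder name:

```bash
# 'my_jobscript' is a placeholder name; touch creates an empty stand-in
# file here only so the example is self-contained
touch my_jobscript
# Give the script execute permission
chmod +x my_jobscript
```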
The examples below assume you are submitting the job from the same directory your program is located in - otherwise you need to give the full path.
Asking for 4 tasks, then running the program
```bash
#!/bin/bash
srun -A <account> -n 4 my_program
```
Asking for 4 tasks distributed across 2 nodes, running for no longer than 30 minutes in the account <account>. Running the program "my_program".
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 4
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:30:00

srun ./my_program
```
Submit the job with

sbatch my_jobscript

assuming the above script is named my_jobscript.
The scripts job1.batch and job2.batch could be very simple scripts, only containing a line with "./my_program".
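A sketch of what such a minimal wrapper script could look like (my_program is a placeholder name):

```bash
#!/bin/bash
# job1.batch - a minimal wrapper; my_program is a placeholder name
./my_program
```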
You then submit them both with a script like this.
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 2
#SBATCH --time=00:30:00

# Use '&' to start the first job in the background
srun -n 1 ./job1.batch &
srun -n 1 ./job2.batch
# Use 'wait' as a barrier to collect both executables when they are done.
# Without it, the batch job would finish when the job2.batch program
# finishes and kill job1.batch if it is still running.
wait
```
Normally, SLURM produces one output file called slurm-<jobid>.out containing the combined standard output and errors from the run (though files created by the program itself will of course also be created). If you wish to rename the output and error files, and get them in separate files, you can do something similar to this:
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 2
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out

srun ./my_program
```
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 4
#SBATCH --exclusive
#SBATCH --time=00:05:00

srun ./my_program
```
2 nodes, 48 cores, 1 hour, 5000 MB of memory per task (2 cores at 2500 MB each). The example below is for Abisko, which has 48 cores per node. Change the number of nodes accordingly for the cluster you are submitting jobs to.
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 2
# use --exclusive to get the whole nodes exclusively for this job
#SBATCH --exclusive
#SBATCH --time=01:00:00
# This job needs 5GB of memory per mpi-task (= mpi rank). Since the
# amount of memory on the nodes is 2500MB per core, each task needs
# 2 cores, so we have to use 2 nodes and only half the cores on each
#SBATCH -c 2

module add openmpi/<compiler>
mpirun -n 48 ./mpi_largemem
```
Abisko example: 2 nodes, 24 tasks spread evenly over the two nodes. Change the number of nodes accordingly for the cluster you are working on; you would need to spread over more nodes if your cluster has fewer cores per node.
```bash
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 24
# Spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH --time=35:15:00
# Want the nodes exclusively
#SBATCH --exclusive

echo "Starting at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS tasks."
echo "Current working directory is `pwd`"

mpirun ./mpi_bigmem

echo "Program finished with exit code $? at: `date`"
```
```bash
#!/bin/bash
# ask for 4 full nodes
#SBATCH -N 4
#SBATCH --exclusive
# ask for 1 day and 3 hours of run time
#SBATCH -t 1-03:00:00
# Account name to run under
#SBATCH -A <account>
# a sensible name for the job
#SBATCH -J my_job_name

# run only 1 MPI task/process on each node, even if the nodes have 48
# cores. This assumes OpenMPI is being used (--npernode is an OpenMPI
# option).
mpirun -n 4 --npernode 1 ./my_program
```
```bash
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -n 1
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you set --export=NONE)
module load vasp

vasp
```
The example is for Abisko. Change the number of tasks per node according to the number of cores per node on your cluster.
```bash
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -n 48
# spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you set --export=NONE)
module load vasp

mpirun vasp_std
```
Here we run several jobs after each other. The example is for Kebnekaise. On Abisko it is better to pick a number of cores that is divisible by 6. Of course, a similar example will work for serial jobs. Just remove the srun from the command.
```bash
#!/bin/bash
#SBATCH -A <account>
# Asking for one hour. Adjust accordingly. Remember, the time must be long
# enough for each of the jobs to complete
#SBATCH -t 01:00:00
# Ask for enough cores for the job that needs the most. It is better to
# pick jobs that run on about the same number of cores, so cores are not
# left idle during the jobs that need fewer.
# Here: 14 tasks with 2 cores per task, 28 cores in total
#SBATCH -n 14
#SBATCH -c 2

# You only need to give the number of tasks and cores per task to srun if
# they differ from the totals you asked for above.
# You can also do other things in between, like copying files and such.
# Here I run 14 tasks with 2 cores per task, and send the output (and
# errors) to the file myoutput*. If your job creates output in a file on
# its own, there is no reason to do this. I then copy my output somewhere
# else, run another executable, copy again ...
srun -n 14 -c 2 ./a.out > myoutput1 2>&1
cp myoutput1 /pfs/nobackup/home/u/username/mydatadir
srun -n 14 -c 2 ./b.out > myoutput2 2>&1
cp myoutput2 /pfs/nobackup/home/u/username/mydatadir
srun -n 14 -c 2 ./c.out > myoutput3 2>&1
cp myoutput3 /pfs/nobackup/home/u/username/mydatadir
...
```
Here we run several jobs at the same time. Make sure you ask for enough cores that all jobs can run at the same time, and have enough memory. Of course, this will also work for serial jobs - just remove the srun from the command line.
Notice the "&" at the end of each srun command, and the "wait" command at the end of the batch script. The "wait" is important: it makes the batch job wait until all the simultaneous sruns have completed.
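The effect of "&" and "wait" can be tried with plain shell commands on any machine; in this sketch, sleep and echo stand in for the real srun lines:

```bash
# Start two "jobs" in the background
(sleep 1; echo "job 1 done") &
(sleep 2; echo "job 2 done") &
# Without 'wait', the script would exit here, before the jobs finish
wait
echo "all jobs done"
```

The last line only prints after both background commands have completed, which is exactly the barrier behaviour the batch script relies on.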
```bash
#!/bin/bash
#SBATCH -A <account>
# Asking for two hours. Adjust accordingly. Remember, the time must be
# long enough for all of the jobs to complete, even the longest
#SBATCH -t 02:00:00
# Ask for the total number of cores the jobs need
#SBATCH -n 56

srun -n 14 --cpu_bind=cores ./a.out &
srun -n 28 --cpu_bind=cores ./b.out &
srun -n 14 --cpu_bind=cores ./c.out &
...
wait
```