'Job submission file' is the official SLURM name for the file you use to submit your program and ask for resources from the job scheduler. Here we will be using it interchangeably with 'script' or 'batch script'.
Commands to the batch scheduler are prefaced with #SBATCH; these are also called directives. You can also add normal shell commands to the script.
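Because #SBATCH lines start with '#', the shell treats them as comments; only the scheduler reads them. A minimal sketch of this mixing (the account name is a placeholder):

```shell
#!/bin/bash
# To the shell the #SBATCH lines below are ordinary comments;
# only SLURM interprets them as directives.
#SBATCH -A <account>
#SBATCH -n 1
#SBATCH --time=00:10:00

# A normal shell command mixed into the same script:
echo "Job started on $(hostname)"
```

This also means the script can be run directly for a quick syntax check, since bash simply skips the directives.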
All SLURM directives can be given on the command line instead of in the script.
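For instance, the directives from the 30-minute example further down could instead be given as flags to sbatch; a sketch, with <account> and my_jobscript as placeholder names:

```shell
# Same resource request as in the script, but on the command line
sbatch -A <account> -n 4 --ntasks-per-node=2 --time=00:30:00 my_jobscript
```

Flags given on the command line override the corresponding directives in the script.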
Remember - the scripts, and all programs called by them, must be executable!
The examples below assume you are submitting the job from the same directory your program is located in - otherwise you need to give the full path.
Asking for 4 tasks, then running the program
#!/bin/bash
srun -A <account> -n 4 my_program
Asking for 4 tasks distributed across 2 nodes, running for no longer than 30 minutes in the account <account>. Running the program "my_program".
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 4
#SBATCH --ntasks-per-node=2
#SBATCH --time=00:30:00

srun ./my_program
Submit the job with

sbatch my_jobscript

assuming the above script is named my_jobscript.
The scripts job1.batch and job2.batch could be very simple scripts, only containing a line with "./my_program".
You then run both of them from a single submission script like this:
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 2
#SBATCH --time=00:30:00

# Use '&' to start the first job in the background
srun -n 1 ./job1.batch &
srun -n 1 ./job2.batch

# Use 'wait' as a barrier to collect both executables when they are done.
# If not, the batch job will finish when the job2.batch program finishes
# and kill job1.batch if it is still running.
wait
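The '&' and 'wait' mechanics are plain shell and can be tried outside SLURM. In this sketch, sleep and echo stand in for the two srun job steps:

```shell
#!/bin/bash
# 'job1' runs in the background while 'job2' runs in the foreground;
# 'wait' then blocks until the background job has also finished.
{ sleep 1; echo "job1 done"; } &   # stands in for: srun -n 1 ./job1.batch &
echo "job2 done"                   # stands in for: srun -n 1 ./job2.batch
wait                               # without this, exiting could kill job1
echo "all done"
```

Here "job2 done" appears first, then "job1 done" after one second, and "all done" only once both have finished, which is exactly the barrier behaviour the batch script relies on.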
Normally, SLURM produces one output file called slurm-<jobid>.out containing the combined standard output and errors from the run (though files created by the program itself will of course also be created). If you wish to rename the output and error files, and get them in separate files, you can do something similar to this:
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 2
#SBATCH --time=00:05:00
#SBATCH --error=job.%J.err
#SBATCH --output=job.%J.out

srun ./my_program
Asking for 4 nodes exclusively, running for no longer than 5 minutes:

#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 4
#SBATCH --exclusive
#SBATCH --time=00:05:00

srun ./my_program
2 nodes, 48 MPI tasks, 1 hour, memory per task = 5000 MB. The example below is for Abisko, which has 48 cores per node. Change the number of nodes accordingly for the cluster you are submitting jobs to.
#!/bin/bash
#SBATCH -A <account>
#SBATCH -N 2
# Use --exclusive to get the whole nodes exclusively for this job
#SBATCH --exclusive
#SBATCH --time=01:00:00
# This job needs 5GB of memory per MPI task (= MPI rank, = core),
# and since the amount of memory on the nodes is 2500MB per core,
# we have to use 2 nodes for 48 tasks and run MPI ranks on only
# half the cores of each node
#SBATCH -c 2

module add openmpi/<compiler>
srun -n 48 ./mpi_largemem
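The arithmetic behind '-c 2' can be checked with plain shell; the numbers below are the Abisko figures quoted in the script's comments:

```shell
#!/bin/bash
mem_per_core=2500    # MB of memory per core on Abisko
mem_needed=5000      # MB needed by each MPI task
cores_per_node=48

# Cores each task must reserve to get enough memory (rounded up)
cores_per_task=$(( (mem_needed + mem_per_core - 1) / mem_per_core ))
echo "cores per task: $cores_per_task"      # matches '#SBATCH -c 2'

# With 2 cores per task, only half the cores run MPI ranks,
# so 48 tasks need 2 full nodes
tasks_per_node=$(( cores_per_node / cores_per_task ))
echo "tasks per node: $tasks_per_node"
echo "nodes for 48 tasks: $(( 48 / tasks_per_node ))"
```

This gives 2 cores per task, 24 tasks per node, and therefore 2 nodes for the 48 tasks, which is exactly what the directives request.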
Abisko example: 2 nodes, 24 tasks spread evenly over the two nodes. Change the number of nodes accordingly for the cluster you are working on; you would need to spread over more nodes if your cluster has fewer cores per node.
#!/bin/bash
#SBATCH -A <account>
#SBATCH -n 24
# Spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH --time=35:15:00
# We want the nodes exclusively
#SBATCH --exclusive

echo "Starting at `date`"
echo "Running on hosts: $SLURM_NODELIST"
echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS tasks."
echo "Current working directory is `pwd`"

srun ./mpi_bigmem

echo "Program finished with exit code $? at: `date`"
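The echo lines are ordinary shell. Outside a job the SLURM_* variables are simply unset, so this sketch fakes them with invented values to show what the output looks like:

```shell
#!/bin/bash
# Fake values for illustration; inside a real job SLURM sets these itself
SLURM_NNODES=2
SLURM_NTASKS=24

echo "Running on $SLURM_NNODES nodes."
echo "Running $SLURM_NTASKS tasks."
echo "Current working directory is `pwd`"

true    # stands in for: srun ./mpi_bigmem

# $? holds the exit code of the last command, i.e. the srun above
echo "Program finished with exit code $? at: `date`"
```

Capturing $? immediately after srun is what makes the final log line report the program's exit status rather than that of some later command.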
#!/bin/bash
# Ask for 4 full nodes
#SBATCH -N 4
#SBATCH --exclusive
# Ask for 1 day and 3 hours of run time
#SBATCH -t 1-03:00:00
# Account name to run under
#SBATCH -A <account>
# A sensible name for the job
#SBATCH -J my_job_name

# Run only 1 MPI task/process per node, even if the nodes have 48 cores.
# This assumes OpenMPI is being used.
srun -n 4 --ntasks-per-node=1 ./my_program
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -n 1
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you set --export=NONE)
module load vasp

vasp
This example is for Abisko. Change the number of tasks per node according to the number of cores per node on your cluster.
#!/bin/bash
#SBATCH -o vasp.%j.out
#SBATCH -J <job_name>
#SBATCH -A <account>
#SBATCH -n 48
# Spread the tasks evenly among the nodes
#SBATCH --ntasks-per-node=12
#SBATCH -t 24:00:00

# Load modules, unless already done before the job is submitted (remember,
# SLURM exports the environment by default, unless you set --export=NONE)
module load vasp

srun vasp_std