Quick-start guide: Running your first job
1. Log in to akka (or another cluster)
ssh -X username@akka.hpc2n.umu.se
Note: If you stay logged in for more than 24 hours, your Kerberos ticket will expire. Create a new one with the command kinit.
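The ticket check can also be scripted. The following is a minimal sketch, assuming the MIT Kerberos client tools, where klist -s prints nothing and returns a non-zero exit status when there is no valid ticket. The function name renew_ticket_if_needed is made up for this example.

```shell
#!/bin/bash
# Hypothetical helper: request a new Kerberos ticket only when the current
# one is missing or expired. Assumes "klist -s" (silent mode) is available.
renew_ticket_if_needed() {
    if klist -s; then
        echo "ticket still valid"
    else
        echo "no valid ticket, running kinit"
        kinit
    fi
}

# Usage, after logging in:
#   renew_ticket_if_needed
```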
2. You need to submit jobs from the parallel filesystem
Read more about the parallel filesystem here.
- Check your home directory. There should be an entry for /pfs
- If there isn't, you need to make a symbolic link (change u/username/ to relevant):
$ ln -s /pfs/nobackup$HOME $HOME/pfs
- Change to directory pfs
$ cd ~/pfs
- Make a directory here for your files and copy everything you need to run your job there: executable, script, datafiles, etc.
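The steps above can be combined into one small script. This is only a sketch: the function name setup_job_dir and the directory name my_job are made up for this example, and the link target simply follows the ln -s command shown above.

```shell
#!/bin/bash
# Sketch: make sure a "pfs" link exists in the home directory and create a
# working directory for the job inside it.
setup_job_dir() {
    local home_dir="$1"      # normally $HOME
    local pfs_root="$2"      # normally /pfs/nobackup
    local job_name="$3"      # hypothetical directory name

    # Create the symbolic link only if there is no entry called pfs yet.
    if [ ! -e "$home_dir/pfs" ] && [ ! -L "$home_dir/pfs" ]; then
        ln -s "$pfs_root$home_dir" "$home_dir/pfs"
    fi

    # -p makes mkdir succeed even if the directory already exists.
    mkdir -p "$home_dir/pfs/$job_name"
    echo "$home_dir/pfs/$job_name"
}

# Usage on the cluster:
#   cd "$(setup_job_dir "$HOME" /pfs/nobackup my_job)"
```

The function prints the path of the working directory, so it can be used directly with cd before copying the executable, script, and data files there.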
3. Compiling and linking your program
You need to load the compiler module as well as some library modules, like MPI, if you are going to use them. The same goes for any application programs (you can read more about those here). All the examples below load the default version, unless otherwise specified.
| Compiler | Akka | Loading | Compiler command (C, C++, Fortran) |
|---|---|---|---|
| PathScale | ✓ | module load psc | pathcc, pathCC, pathf90 |
| GCC | ✓ | already in path | gcc,g++,gfortran |
| PGI | ✓ | module load pgi | pgcc,pgCC,pgf77,pgf90,pgf95 |
| Intel | ✓ | module load intel-compiler | icc, ifort |
| MPI library | Akka | Loading | Compiler command (C, C++, Fortran) |
|---|---|---|---|
| OpenMPI | ✓ | module load openmpi/compiler, where compiler is psc, pgi, gcc, or intel | mpicc, mpiCC, mpif77, mpif90 |
| Mvapich | ✓ | module load mvapich/compiler, where compiler is psc, pgi, or intel | mpicc, mpicxx, mpif77, mpif90 |
Read more about linking with MPI here and here.
The examples below are mostly for a Fortran 90 program, using the PathScale compilers and OpenMPI. Aside from the modules loaded below, you also need to load the compiler you wish to compile with. See here for further examples.
- BLAS
  - Installed on: Akka
  - Loading: module load gotoblas2/psc
  - Compile/link command:
    pathf90 program.f90 -o program $GOTO2_LDFLAGS -lgoto2p (threaded)
    pathf90 program.f90 -o program $GOTO2_LDFLAGS -lgoto2 (non-threaded)
  - Local Blas info
- CERNLIB
  - Installed on: Akka
  - Loading: module load cernlib
  - Linking info
  - Local Cernlib info
- LAPACK
  - Installed on: Akka
  - Loading:
    module load lapack/psc
    module load gotoblas2/psc
  - Compile/link command:
    Fortran 77: pathf90 program.f -o program $LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2
    Fortran 90: pathf90 program.f90 -o program $LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2
  - Local Lapack info
- BLACS
  - Installed on: Akka
  - Loading: a module is needed for both Blacs and MPI:
    module load openmpi/psc
    module load blacs/psc
  - Compile/link command:
    C: mpicc program.c -o program $BLACS_LDFLAGS -lblacsCinit_OMPI -lblacs_OMPI
    F 77: mpif77 program.f -o program $BLACS_LDFLAGS -lblacsF77init_OMPI -lblacs_OMPI
  - Local Blacs info
- SCALAPACK
  - Installed on: Akka
  - Loading: Modules are needed for Scalapack, MPI, compiler, Blacs, Blas, and Lapack:
    module load openmpi/psc
    module load gotoblas2/psc
    module load lapack/psc
    module load blacs/psc
    module load scalapack/psc
  - Compile/link command:
    Fortran: $SCALAPACK_LDFLAGS -lscalapack -lblacsF77init_OMPI -lblacs_OMPI $GOTO2_LDFLAGS $LAPACK_LDFLAGS -llapack -lgoto2
    C: $SCALAPACK_LDFLAGS -lscalapack -lblacsCinit_OMPI -lblacs_OMPI $GOTO2_LDFLAGS $LAPACK_LDFLAGS -llapack -lgoto2
  - Local Scalapack info here
- FFTW
  - Installed on: Akka
  - Loading: module load openmpi/psc, module load libfftw (or module load libfftw/2.1.5 or module load libfftw/3.3 for one of the versions supporting MPI, in which case you must also module load openmpi/psc or other compiler)
  - Compile/link command:
    FFTW 3.2.2: pathcc program.c -lfftw3 -lm $FFTW_INCLUDE $FFTW_LDFLAGS
    FFTW 2.1.5: mpicc program.c -lfftw3 (or -lfftw for 2.1.5) -lm $FFTW_INCLUDE $FFTW_LDFLAGS
  - Local FFTW info
- RECSY
  - Installed on: Akka
  - Loading:
    module load gotoblas2/psc
    module load recsy/psc
  - Compile/link command:
    pathf90 program.f -o program $RECSY_LDFLAGS $GOTO2_LDFLAGS -lrecsy -lgoto2
  - Local Recsy info
- SLICOT
  - Installed on: Akka
  - Loading:
    module load gotoblas2/psc
    module load lapack/psc
    module load psc
    module load libslicot/psc
  - Compile/link command:
    gfortran: gfortran program.f -o program $SLICOT_LDFLAGS $LAPACK_LDFLAGS $GOTO2_LDFLAGS -lslicot_gfortran -llapack -lgoto2
    PGI or PSC: pgf90/pathf90 program.f -o program $SLICOT_LDFLAGS $LAPACK_LDFLAGS $GOTO2_LDFLAGS -lslicot -llapack -lgoto2
  - Local Slicot info
- PARMETIS
  - Installed on: Akka
  - Loading: module load parmetis
  - Compile/link command (\ means the command is continued on the next line since it was too long to fit here):
    pathf90 program.f -o program -lparmetis $PARMETIS_INCLUDE \
        $PARMETIS_LDFLAG $PARMETIS_LIBS
    Using the bundled Metis:
    pathf90 program.f -o program -lparmetis -lmetis $METIS_INCLUDE \
        $METIS_LDFLAG $METIS_LIBS
  - Local Parmetis info
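The compile/link commands above all follow the same pattern: compiler, source file, output name, the LDFLAGS variables set by the modules, and the -l options. As a sketch, a small wrapper can print the command for the LAPACK entry so that it can be inspected before it is run. The helper name lapack_link_cmd is made up for this example, and it assumes lapack/psc and gotoblas2/psc have already been loaded so that the two LDFLAGS variables are set.

```shell
#!/bin/bash
# Sketch: build the LAPACK compile/link line from the LAPACK entry above.
# lapack_link_cmd is a hypothetical helper name. $LAPACK_LDFLAGS and
# $GOTO2_LDFLAGS are expected to be set by "module load lapack/psc" and
# "module load gotoblas2/psc"; if they are empty, the printed command will
# presumably lack the library search paths.
lapack_link_cmd() {
    local source_file="$1"
    local program="${source_file%.*}"   # program.f90 -> program
    echo "pathf90 $source_file -o $program" \
         "$LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2"
}

# Usage: print the command first, then run it once it looks right:
#   lapack_link_cmd program.f90
#   eval "$(lapack_link_cmd program.f90)"
```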
4. Example MPI job script
Here is an example of a job script for submitting an MPI program. I have added a great many comments, which are of course not needed. Remember that it needs to be submitted from the parallel filesystem, from the same directory that your executable file is located in.
Note that ## is used for comments. Only one # must be used before commands to the batch system.
#!/bin/bash
## Name of the script - unnecessary, but useful to help find the job in a long list
#PBS -N Parallel
## Names of output and error files. You can call them what you will. If you
## don't give them a name, the output, respectively the error file, will be called
## <job script name>.o<jobid> and <job script name>.e<jobid>.
##
#PBS -o mpi_hello.out
#PBS -e mpi_hello.err
#PBS -m ae
## asking for 2 nodes and 2 processes. Note: Using nodes=1:ppn=8 reserves
## a full node on akka. Runtime (walltime) in this example is 4 minutes
## max.
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:04:00
## Change to the directory you submitted from and load MPI module (use
## same as compiled with)
cd $PBS_O_WORKDIR
module add openmpi/psc
## the command pernode makes sure only one process runs per node. Can be
## good if each process wants to run several threads on the node - then a
## thread could run on each processor on the node
mpiexec -pernode ./mpi_hello
5. Submitting the job script
qsub <script_name>
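qsub prints the identifier of the new job, for example 617652.p-mn01.hpc2n.umu.se, while commands such as checkjob are shown in this guide with only the numeric part. A minimal sketch for keeping that numeric part in a variable follows; the function name short_jobid is made up for this example.

```shell
#!/bin/bash
# Sketch: keep the numeric part of the identifier that qsub prints, e.g.
# 617652.p-mn01.hpc2n.umu.se -> 617652. short_jobid is a made-up name.
short_jobid() {
    local full_id="$1"
    echo "${full_id%%.*}"    # remove everything from the first dot onwards
}

# Usage on the cluster (mpi_submit is the name of a job script):
#   jobid=$(short_jobid "$(qsub mpi_submit)")
#   checkjob "$jobid"
```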
6. Checking the job's progress
There are several ways to do this.
- qstat -a: This gives a very long list of all jobs for all users. Bad if there are many jobs.
- qstat -a -u <username>: This only shows your own jobs.
- checkjob <jobid>: This gives more info. You get the jobid when you submit your job.
- showq -u <username>: Slightly different info.
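Instead of running these commands by hand, a script can poll the batch system until the job is gone. The sketch below is not a site-provided tool: wait_for_job is a made-up name, and it assumes that qstat followed by a job id exits with a non-zero status once the batch system no longer knows the job. On systems that keep completed jobs listed for a while, the loop ends correspondingly later.

```shell
#!/bin/bash
# Sketch: poll the batch system until a job is no longer listed.
# wait_for_job is a hypothetical helper. It assumes that "qstat <jobid>"
# exits with a non-zero status once the job is unknown to the batch system.
wait_for_job() {
    local jobid="$1"
    local interval="${2:-60}"   # seconds between checks, 60 by default
    while qstat "$jobid" > /dev/null 2>&1; do
        sleep "$interval"
    done
    echo "job $jobid is no longer in the queue"
}

# Usage:
#   wait_for_job 617652 30
```

A fairly long interval is a sensible choice, since frequent polling puts unnecessary load on the batch server.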
7. Running interactively
You should only use this while testing, not for running actual jobs. Actual jobs should be run through a batch job script.
An interactive job means you ask for a specific number of resources, just like in a batch job, but instead of sending the job away, you will be logged in to the desired resources through an interactive session. Just as for batch jobs, it may take a while before you get the resources you ask for, especially if you want many processors on the same node. The advantage is that you can test several runs immediately instead of having to wait a long time in each case, only to learn you had made an error and needed to resubmit.
To run interactively, you must set the flag -I. In the example below we ask for an interactive job with 2 processors on each of 2 nodes (a total of 4 processors) and a walltime of 30 minutes:
qsub -I -l nodes=2:ppn=2,walltime=00:30:00
Remember to change back to the directory you have your data and executable in. If you submitted the interactive job from that directory, you can return to it with this command:
cd $PBS_O_WORKDIR
You must reload any modules you loaded before starting the interactive job, such as compilers and MPI libraries.
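The resource list given to -l has the same form for interactive and batch jobs. As a sketch, it can be assembled from a node count, the number of processors per node, and a walltime in minutes. The helper name pbs_resources is made up for this example and is not part of the batch system.

```shell
#!/bin/bash
# Sketch: assemble the resource list for "qsub -l" from a node count, the
# processors per node and a walltime in minutes. pbs_resources is a made-up
# helper name, not part of the batch system.
pbs_resources() {
    local nodes="$1" ppn="$2" minutes="$3"
    # Walltime is written as HH:MM:SS.
    printf 'nodes=%d:ppn=%d,walltime=%02d:%02d:00\n' \
        "$nodes" "$ppn" "$((minutes / 60))" "$((minutes % 60))"
}

# Usage:
#   qsub -I -l "$(pbs_resources 2 2 30)"
```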
8. Examples
8.1 Running a Fortran 90 MPI program as an interactive job
In this example we are going to run on akka. We are going to ask for 4 processors on each of 2 nodes, for 30 minutes. Furthermore, we are going to use OpenMPI and the PathScale compilers.
$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .
$ qsub -I -l nodes=2:ppn=4,walltime=00:30:00
qsub: waiting for job 617425.p-mn01.hpc2n.umu.se to start
qsub: job 617425.p-mn01.hpc2n.umu.se ready
$ cd $PBS_O_WORKDIR
$ module load openmpi/psc
$ mpif90 mpi_hello.f90 -o mpi_hello
$ mpiexec mpi_hello
hello_parallel.f: Number of tasks= 8 My rank= 7 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 3 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 1 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 2 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 5 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 0 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 4 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks= 8 My rank= 6 My name=p-bc3301.hpc2n.umu.se
$
You can use
cat $PBS_NODEFILE
to see which processors you have got. As you can see, you got 4 on each of two nodes.
$ cat $PBS_NODEFILE
p-bc4011
p-bc4011
p-bc4011
p-bc4011
p-bc3301
p-bc3301
p-bc3301
p-bc3301
$
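For larger jobs the node file gets long, and a per-node count is easier to read. A minimal sketch using standard tools follows; the function name count_procs_per_node is made up for this example.

```shell
#!/bin/bash
# Sketch: summarise a PBS node file as "<hostname> <count>" lines.
# count_procs_per_node is a made-up helper name.
count_procs_per_node() {
    local nodefile="$1"      # normally $PBS_NODEFILE
    # uniq -c prints "<count> <hostname>"; awk swaps the two columns.
    sort "$nodefile" | uniq -c | awk '{ print $2, $1 }'
}

# Usage inside an interactive or batch job:
#   count_procs_per_node "$PBS_NODEFILE"
```

For the node file shown above, this would list p-bc3301 and p-bc4011 with a count of 4 each.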
The interactive job can be ended at any time with the command
exit
8.2 Running a Fortran 90 MPI program as a batch job
In this example we are going to run on akka. We are going to ask for 2 processors on each of 2 nodes. We are going to ask for a walltime of 4 minutes, which should be more than enough. Furthermore, we are going to use OpenMPI and PathScale compilers.
$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .
You can either compile the program now - if you have access to the same architecture (which you do when you are submitting to say, akka, and are already logged in to akka) - or you can add the compilation as part of the job script.
There is an example job script here where compilation is done on the compute nodes instead of on the submit node.
Write a job script (I have called it mpi_submit). Here is an example. As mentioned above, we ask for 2 processors on each of 2 nodes, and a walltime of 4 minutes. We also use -pernode, which ensures that only one process is run per node, so we expect to get two MPI processes. Further, we ask for 2200 MB of physical memory and 2900 MB of virtual + physical memory.
#!/bin/bash
## Name of the script - unnecessary, but useful
## to help find the job in a long list
#PBS -N Parallel
## Names of output and error files. You can call them what you will. If you
## don't give them a name, the output, respectively the error file, will be called
## <job script name>.o<jobid> and <job script name>.e<jobid>.
##
#PBS -o mpi_hello.out
#PBS -e mpi_hello.err
#PBS -m ae
## asking for 2 nodes and 2 processes. Note: Using nodes=1:ppn=8 reserves
## a full node on akka. Runtime (walltime) in this example is 4 minutes
## max.
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:04:00
# memory requirements, physical memory (pmem) at least 2200 mb and
# virtual + physical memory (pvmem) at least 2900 mb
#PBS -l pmem=2200mb
#PBS -l pvmem=2900mb
## Change to the directory you submitted from and load MPI module (use
## same as compiled with)
cd $PBS_O_WORKDIR
module add openmpi/psc
## the command pernode makes sure only one process runs per node. Can be
## good if each process wants to run several threads on the node - then a
## thread could run on each processor on the node
mpiexec -pernode ./mpi_hello
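It can help to work out what such a request adds up to before submitting. The sketch below does the arithmetic for the script above, under the assumption that pmem is counted per process (task). The helper name job_totals is made up for this example. Note that the number of reserved processors (nodes times ppn) is independent of -pernode, which only limits how many MPI processes are started.

```shell
#!/bin/bash
# Sketch: work out what a request such as nodes=2:ppn=2 with pmem=2200mb
# adds up to, assuming pmem is counted per process (task).
# job_totals is a made-up helper name.
job_totals() {
    local nodes="$1" ppn="$2" pmem_mb="$3"
    local tasks=$((nodes * ppn))
    echo "tasks=$tasks total_pmem_mb=$((tasks * pmem_mb))"
}

job_totals 2 2 2200    # prints: tasks=4 total_pmem_mb=8800
```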
Submit the job script (mpi_submit):
$ qsub mpi_submit
617652.p-mn01.hpc2n.umu.se
$
You can now check the status of the job:
$ checkjob 617652
checking job 617652
State: Running
Creds: user:<username> group:folk account:DEFAULT class:batch qos:DEFAULT
WallTime: 00:00:00 of 00:04:00
SubmitTime: Tue Mar 16 11:29:55
(Time Queued Total: 00:02:09 Eligible: 00:02:09)
StartTime: Tue Mar 16 11:32:04
Total Tasks: 4
Req[0] TaskCount: 4 Partition: DEFAULT
Network: [NONE] Memory >= 2200M Disk >= 0 Swap >= 2900M
Opsys: [NONE] Arch: [NONE] Features: [NONE]
Dedicated Resources Per Task: PROCS: 1 MEM: 2200M SWAP: 2900M
Allocated Nodes:
[p-bc1110:2][p-bc1105:2]
IWD: [NONE] Executable: [NONE]
Bypass: 0 StartCount: 1
PartitionMask: [ALL]
Flags: RESTARTABLE PREEMPTOR
Reservation '617652' (00:00:00 -> 00:04:00 Duration: 00:04:00)
PE: 4.41 StartPriority: 224434
$
or
$ showq -u <username>
ACTIVE JOBS--------------------
JOBNAME USERNAME STATE PROC REMAINING STARTTIME
617652 <username> Running 4 00:04:00 Tue Mar 16 11:32:04
1 Active Job 4498 of 5336 Processors Active (84.30%)
667 of 667 Nodes Active (100.00%)
IDLE JOBS----------------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
0 Idle Jobs
BLOCKED JOBS----------------
JOBNAME USERNAME STATE PROC WCLIMIT QUEUETIME
Total Jobs: 1 Active Jobs: 1 Idle Jobs: 0 Blocked Jobs: 0
$
The job is running. We asked for 4 minutes, and all 4 minutes of walltime remain since it has only just started. Since it really was a very short program, we could have asked for just 30 seconds and that would have been enough.
The job may sit in the queue for a while before it starts running, so the time from submission to completion may well be a good deal longer than the walltime you asked for.
When the job has completed, you have new files in your directory:
$ ls
mpi_hello.err mpi_hello.out mpi_hello* mpi_hello.f90 mpi_submit
$
mpi_hello.err contains any errors that were generated by the job, and mpi_hello.out contains the output of your job:
$ cat mpi_hello.err
$ cat mpi_hello.out
hello_parallel.f: Number of tasks= 2 My rank= 0 My name=p-bc1110.hpc2n.umu.se
hello_parallel.f: Number of tasks= 2 My rank= 1 My name=p-bc1105.hpc2n.umu.se
$
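Checking the two files can also be scripted, for instance at the end of a longer workflow. The following is a minimal sketch: check_job_output is a made-up name, and the file names passed to it are the ones given with #PBS -o and #PBS -e in the job script.

```shell
#!/bin/bash
# Sketch: report whether a job's error file contains anything.
# check_job_output is a made-up helper name.
check_job_output() {
    local out_file="$1" err_file="$2"
    if [ -s "$err_file" ]; then      # -s: file exists and is not empty
        echo "errors reported in $err_file:"
        cat "$err_file"
        return 1
    fi
    echo "no errors; output is in $out_file"
}

# Usage:
#   check_job_output mpi_hello.out mpi_hello.err
```

An empty error file, as in the example above, does not prove that the results are correct. It only shows that nothing was written to standard error.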
8.3 Compiling and running a Fortran 90 program that uses Lapack
In this example we are going to compile a Fortran 90 program that uses Lapack. We are going to run on akka and we are going to use the PathScale compilers. We will submit the program as a batch job.
$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .
Then we load the PathScale compilers and compile the Fortran 90 program. Note that -lgoto is the Goto BLAS library, which is needed for Lapack. It is linked twice to get the maximum speed out of it by also using the Lapack functions that are built into gotoblas. Remember to also load the modules lapack/psc and gotoblas2/psc.
$ module load psc
$ module load lapack/psc
$ module load gotoblas2/psc
$ pathf90 -lgoto -llapack -lgoto lapack_test.f90 -o lapack_test
We are now ready to run the program. First we need a job script. This will work nicely (our Fortran 90 program that uses Lapack is called lapack_test):
#!/bin/bash
## Name of the script - unnecessary, but useful
## to help find the job in a long list
#PBS -N Lapack_Test_Run
## Names of output and error files. You can call them what you will. If you
## don't give them a name, the output, respectively the error file, will be called
## <job script name>.o<jobid> and <job script name>.e<jobid>.
##
#PBS -o Lapack_test.out
#PBS -e Lapack_test.err
#PBS -m ae
## asking for 1 node and 1 processor. Note: Using nodes=1:ppn=8 reserves
## a full node on akka. Runtime (walltime) in this example is 4 minutes
## max. I only ask for 1 node and processor since this is a serial job.
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:04:00
# memory requirements, physical memory (pmem) at least 2200 mb and
# virtual + physical memory (pvmem) at least 2900 mb
#PBS -l pmem=2200mb
#PBS -l pvmem=2900mb
## Change to the directory you submitted from and load the modules (use
## the same as you compiled with)
cd $PBS_O_WORKDIR
module add psc
module add lapack/psc
module add gotoblas2/psc
./lapack_test
Then I submit it with (I called my script lapack.submit):
$ qsub lapack.submit
9. More local information
- The Filesystem
- Using modules
- Installed compilers
- Compiler Usage
- Common compiler flags
- Libraries and linking
- makefiles
- Using the batchsystem
- Further linking examples
