Running your first job on Akka

Quick-start guide: Running your first job

1. Log in to akka (or another cluster)

ssh -X username@akka.hpc2n.umu.se

Note: If you stay logged in for more than 24 hours, your Kerberos ticket will expire. Create a new one with the command kinit.

2. You need to submit jobs from the parallel filesystem

Read more about the parallel filesystem here.

  • Check your home directory. There should be an entry for /pfs
  • If there isn't, you need to make a symbolic link ($HOME expands to your own home directory, so the command can be used as is):
    $ ln -s /pfs/nobackup$HOME $HOME/pfs
    
  • Change to directory pfs
    $ cd ~/pfs
    
  • Make a directory here for your files and copy everything you need to run your job there: executable, script, data files, etc.
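The steps above can be sketched as a small shell helper. This is only a sketch: ensure_pfs_link is a hypothetical name, not something installed on the cluster, and the default paths are the ones used in this guide.

```shell
# ensure_pfs_link: hypothetical helper, not part of the cluster software.
# Creates the symbolic link only if nothing exists at the link path yet.
ensure_pfs_link() {
    target="${1:-/pfs/nobackup$HOME}"   # default target as given in this guide
    link="${2:-$HOME/pfs}"              # default link name as given in this guide
    if [ -e "$link" ] || [ -L "$link" ]; then
        echo "exists: $link"
    else
        ln -s "$target" "$link" && echo "created: $link"
    fi
}

# Usage on the cluster (commented out so the sketch has no side effects):
# ensure_pfs_link
# mkdir -p ~/pfs/my_testdir
```

Checking first means the helper can be run again later without disturbing an existing link.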

3. Compiling and linking your program

You need to load the compiler module as well as any library modules, like MPI, that you are going to use. The same goes for any application programs (you can read more about those here). All the examples below load the default version, unless otherwise specified.

Compiler    Loading on Akka              Compiler command (C, C++, Fortran)
PathScale   module load psc              pathcc, pathCC, pathf90
GCC         already in path              gcc, g++, gfortran
PGI         module load pgi              pgcc, pgCC, pgf77, pgf90, pgf95
Intel       module load intel-compiler   icc, ifort

 

MPI library   Loading on Akka                     Compiler command (C, C++, Fortran)
OpenMPI       module load openmpi/compiler,       mpicc, mpiCC, mpif77, mpif90
              where compiler=psc|pgi|gcc|intel
Mvapich       module load mvapich/compiler,       mpicc, mpicxx, mpif77, mpif90
              where compiler=psc|pgi|intel

 

Read more about linking with MPI here and here.
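The MPI module name is built from the library and the compiler. A minimal sketch of that rule follows, using only the combinations listed in this guide; mpi_module_name is a hypothetical helper, not a command on the cluster.

```shell
# mpi_module_name: hypothetical helper that builds the module name from an
# MPI library and a compiler, using the combinations listed in this guide.
mpi_module_name() {
    mpi="$1"
    compiler="$2"
    case "$mpi/$compiler" in
        openmpi/psc|openmpi/pgi|openmpi/gcc|openmpi/intel) echo "$mpi/$compiler" ;;
        mvapich/psc|mvapich/pgi|mvapich/intel)             echo "$mpi/$compiler" ;;
        *) echo "unsupported combination: $mpi with $compiler" >&2; return 1 ;;
    esac
}

# Usage on the cluster (commented out; module is only available there):
# module load "$(mpi_module_name openmpi psc)"
```

Rejecting unlisted combinations, such as Mvapich with gcc, gives an early error instead of a failed module load.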

Examples below will mostly be for a Fortran 90 program, using the PathScale compilers and OpenMPI. Aside from the modules loaded below, you also need to load the module for the compiler you wish to compile with. See here for further examples.

  • BLAS
    • Installed on: Akka
    • Loading: module load gotoblas2/psc
    • Compile/link command:
      pathf90 program.f90 -o program $GOTO2_LDFLAGS -lgoto2p  (threaded)
      pathf90 program.f90 -o program $GOTO2_LDFLAGS -lgoto2 (non-threaded)
      
    • Local Blas info
  • CERNLIB
  • LAPACK
    • Installed on: Akka
    • Loading:
      module load lapack/psc
      module load gotoblas2/psc
      
    • Compile/link command:
      Fortran 77: 
      pathf90 program.f -o program $LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2
      Fortran 90: 
      pathf90 program.f90 -o program $LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2
      
    • Local Lapack info
  • BLACS
    • Installed on: Akka
    • Loading: a module is needed for both Blacs and MPI:
      module load openmpi/psc
      module load blacs/psc 
      
    • Compile/link command:
      C: mpicc program.c -o program $BLACS_LDFLAGS -lblacsCinit_OMPI -lblacs_OMPI
      F 77: mpif77 program.f -o program $BLACS_LDFLAGS -lblacsF77init_OMPI -lblacs_OMPI
      
    • Local Blacs info
  • SCALAPACK
    • Installed on: Akka
    • Loading: Modules are needed for Scalapack, MPI, compiler, Blacs, Blas, and Lapack:
      module load openmpi/psc
      module load gotoblas2/psc
      module load lapack/psc
      module load blacs/psc
      module load scalapack/psc
      
    • Compile/link command:
      Fortran: 
      $SCALAPACK_LDFLAGS -lscalapack -lblacsF77init_OMPI -lblacs_OMPI $GOTO2_LDFLAGS $LAPACK_LDFLAGS -llapack -lgoto2
      
      C: 
      $SCALAPACK_LDFLAGS -lscalapack -lblacsCinit_OMPI -lblacs_OMPI $GOTO2_LDFLAGS $LAPACK_LDFLAGS -llapack -lgoto2
      
    • Local Scalapack info
  • FFTW
    • Installed on: Akka
    • Loading: module load openmpi/psc, then module load libfftw. To get one of the versions supporting MPI, use module load libfftw/2.1.5 or module load libfftw/3.3 instead; in that case the MPI module (openmpi/psc, or the one for another compiler) must be loaded as well.
    • Compile/link command:
      FFTW 3.2.2: pathcc program.c -lfftw3 -lm $FFTW_INCLUDE $FFTW_LDFLAGS
      FFTW 2.1.5: mpicc program.c -lfftw3 (or -lfftw for 2.1.5) -lm $FFTW_INCLUDE $FFTW_LDFLAGS
      
    • Local FFTW info
  • RECSY
    • Installed on: Akka
    • Loading:
      module load gotoblas2/psc
      module load recsy/psc
      
    • Compile/link command:
      pathf90 program.f -o program $RECSY_LDFLAGS $GOTO2_LDFLAGS -lrecsy -lgoto2 
      
    • Local Recsy info
  • SLICOT
    • Installed on: Akka
    • Loading:
      module load gotoblas2/psc
      module load lapack/psc
      module load psc
      module load libslicot/psc
      
    • Compile/link command:
      gfortran: gfortran program.f -o program $SLICOT_LDFLAGS $LAPACK_LDFLAGS $GOTO2_LDFLAGS  -lslicot_gfortran -llapack -lgoto2
      PGI or PSC: pgf90/pathf90 program.f -o program $SLICOT_LDFLAGS $LAPACK_LDFLAGS $GOTO2_LDFLAGS -lslicot -llapack -lgoto2
      
    • Local Slicot info
  • PARMETIS
    • Installed on: Akka
    • Loading: module load parmetis
    • Compile/link command (\ means the command is continued on next line since it was too long to fit here):
      pathf90 program.f -o program -lparmetis $PARMETIS_INCLUDE \
      $PARMETIS_LDFLAG $PARMETIS_LIBS 
      Using the bundled Metis: 
      pathf90 program.f -o program -lparmetis -lmetis $METIS_INCLUDE \
      $METIS_LDFLAG $METIS_LIBS 
    • Local Parmetis info
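The link commands above rely on environment variables such as $GOTO2_LDFLAGS and $LAPACK_LDFLAGS. Assuming these are set by the corresponding modules, as the commands suggest, a quick check before compiling can catch a forgotten module load. The helper name check_ldflags is hypothetical.

```shell
# check_ldflags: hypothetical helper that reports which of the named
# environment variables are unset or empty. Returns 1 if any is missing.
check_ldflags() {
    missing=0
    for name in "$@"; do
        # Look up the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "not set: $name (is the module loaded?)" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Usage on the cluster (commented out; the variables are assumed to come
# from the modules):
# check_ldflags LAPACK_LDFLAGS GOTO2_LDFLAGS && \
#     pathf90 program.f90 -o program $LAPACK_LDFLAGS $GOTO2_LDFLAGS -llapack -lgoto2
```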

4. Example MPI job script

Here is an example of a job script for submitting an MPI program. I have added a great many comments, which are of course not needed. Remember that it needs to be submitted from the parallel filesystem, and from the same directory that your executable file is located in.

Note that ## is used for comments. Lines that start with a single # followed by PBS are commands to the batch system.

#!/bin/bash
## Name of the job - unnecessary, but useful to help find the job in a long list 
#PBS -N Parallel
 
## Names of output and error files. You can call them what you will. If you 
## don't give them a name, the output, respectively the error file, will be called 
## <job script name>.o<jobid> and <job script name>.e<jobid>. 
## 
#PBS -o mpi_hello.out
#PBS -e mpi_hello.err
#PBS -m ae

## Asking for 2 nodes and 2 processors per node. Note: Using nodes=1:ppn=8
## reserves a full node on akka. Runtime (walltime) in this example is
## 4 minutes max.
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:04:00
 
## Change to the directory you submitted from and load MPI module (use 
## same as compiled with) 
cd $PBS_O_WORKDIR
module add openmpi/psc
 
## The option -pernode makes sure only one process runs per node. Can be
## good if each process wants to run several threads on the node - then a
## thread could run on each processor on the node
mpiexec -pernode ./mpi_hello
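If you submit the same program with different resource requests, the script above can be generated instead of edited by hand. The following is a sketch only: write_job_script is a hypothetical helper, and it writes a shortened version of the script above.

```shell
# write_job_script: hypothetical helper that writes a shortened version of
# the job script above, with the resource request taken from its arguments.
write_job_script() {
    file="$1"; nodes="$2"; ppn="$3"; walltime="$4"; program="$5"
    # \$ keeps PBS_O_WORKDIR unexpanded so it is evaluated when the job runs
    cat > "$file" <<EOF
#!/bin/bash
#PBS -N Parallel
#PBS -l nodes=${nodes}:ppn=${ppn}
#PBS -l walltime=${walltime}

cd \$PBS_O_WORKDIR
module add openmpi/psc
mpiexec -pernode ./${program}
EOF
}

# Usage (commented out so the sketch writes no files on its own):
# write_job_script mpi_submit 2 2 00:04:00 mpi_hello
```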

5. Submitting the job script

qsub <script_name>

6. Checking the job's progress

There are several ways to do this.

  • qstat -a: This gives a very long list of all jobs for all users. Bad if there are many jobs.
  • qstat -a -u <username>: This only shows your own jobs.
  • checkjob <jobid>: This gives more info. You get the jobid when you submit your job.
  • showq -u <username>: Slightly different info.
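checkjob needs the job ID that qsub prints when you submit. A minimal sketch of keeping the numeric part in a shell variable follows; the sample value uses the ID format that qsub prints in this guide.

```shell
# Sample value in the format qsub prints in this guide. In a real session
# you would instead use: full_id=$(qsub mpi_submit)
full_id="617652.p-mn01.hpc2n.umu.se"

# Strip everything from the first dot onwards to keep the numeric part
jobid="${full_id%%.*}"
echo "$jobid"

# On the cluster (commented out; checkjob is only available there):
# checkjob "$jobid"
```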

7. Running interactively

You should only use this while testing, and not for running actual jobs. Actual jobs should be run through a batch job script.

An interactive job means you ask for a specific number of resources, just like in a batch job, but instead of sending the job away, you will be logged in to the desired resources through an interactive session. Just as for batch jobs, it may take a while before you get the resources you ask for, especially if you want many processors on the same node. The advantage is that you can test several runs immediately instead of having to wait a long time in each case, only to learn you had made an error and needed to resubmit.

To run interactively, you must set the flag -I. In the example below we ask for an interactive job with 2 processors on each of 2 nodes (a total of 4 processors), and a walltime of 30 min:

qsub -I -l nodes=2:ppn=2,walltime=00:30:00

Remember to change back to the directory you have your data and executable in. If you submitted the interactive job from that directory, you can return to it with this command:

cd $PBS_O_WORKDIR

You must reload any modules you loaded before starting the interactive job, such as compilers and MPI libraries.
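The walltime in the request above is written as HH:MM:SS. If you think in minutes, a small helper can do the conversion; walltime_from_minutes is a hypothetical name used only for this sketch, and it expects a plain whole number of minutes.

```shell
# walltime_from_minutes: hypothetical helper that formats a whole number
# of minutes as the HH:MM:SS string used in the walltime request.
walltime_from_minutes() {
    minutes="$1"
    printf '%02d:%02d:00\n' "$((minutes / 60))" "$((minutes % 60))"
}

walltime_from_minutes 30    # prints 00:30:00

# Usage on the cluster (commented out; qsub is only available there):
# qsub -I -l nodes=2:ppn=2,walltime="$(walltime_from_minutes 30)"
```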

8. Examples

8.1 Running a Fortran 90 MPI program as an interactive job

In this example we are going to run on akka. We are going to ask for 4 processors on each of 2 nodes, for 30 min. Furthermore, we are going to use OpenMPI and the PathScale compilers.

$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .
$ qsub -I -l nodes=2:ppn=4,walltime=00:30:00
qsub: waiting for job 617425.p-mn01.hpc2n.umu.se to start
qsub: job 617425.p-mn01.hpc2n.umu.se ready

$ cd $PBS_O_WORKDIR
$ module load openmpi/psc
$ mpif90 mpi_hello.f90 -o mpi_hello
$ mpiexec mpi_hello
hello_parallel.f: Number of tasks=  8 My rank=  7 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  3 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  1 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  2 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  5 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  0 My name=p-bc4011.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  4 My name=p-bc3301.hpc2n.umu.se
hello_parallel.f: Number of tasks=  8 My rank=  6 My name=p-bc3301.hpc2n.umu.se
$ 

You can use

cat $PBS_NODEFILE

to see which processors you have got. As you can see, you got 4 on each of two nodes.

$ cat $PBS_NODEFILE
p-bc4011
p-bc4011
p-bc4011
p-bc4011
p-bc3301
p-bc3301
p-bc3301
p-bc3301
$ 
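The node file has one line per allocated processor, so standard tools can summarise it. The sketch below counts processors and nodes. Outside a job $PBS_NODEFILE is not set, so as an assumption for illustration it falls back to a sample file with the host names shown above.

```shell
# Inside a job, PBS_NODEFILE points to the node list. Outside a job it is
# not set, so this sketch falls back to a sample file with the hosts above.
nodefile="${PBS_NODEFILE:-}"
if [ -z "$nodefile" ]; then
    nodefile=$(mktemp)
    printf '%s\n' p-bc4011 p-bc4011 p-bc4011 p-bc4011 \
                  p-bc3301 p-bc3301 p-bc3301 p-bc3301 > "$nodefile"
fi

# One line per processor, so the line count is the number of processors
total=$(wc -l < "$nodefile" | tr -d ' ')
# Distinct host names give the number of nodes
nodes=$(sort -u "$nodefile" | wc -l | tr -d ' ')
echo "$total processors on $nodes nodes"

# Number of processors on each node
sort "$nodefile" | uniq -c
```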

The interactive job can be ended at any time with the command

exit

8.2 Running a Fortran 90 MPI program as a batch job

In this example we are going to run on akka. We are going to ask for 2 processors on each of 2 nodes. We are going to ask for a walltime of 4 minutes, which should be more than enough. Furthermore, we are going to use OpenMPI and PathScale compilers.

$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .

You can either compile the program now - if you have access to the same architecture (which you do when you are submitting to, say, akka, and are already logged in to akka) - or you can add the compilation as part of the job script.

There is an example job script here where compilation is done on the compute nodes instead of on the submit node.

Write a job script (I have called it mpi_submit). Here is an example. As mentioned above, we ask for 2 processors on each of 2 nodes, and a walltime of 4 minutes. Also, we are going to use -pernode, which ensures that only one process is run per node. This means we expect to get two MPI processes. Further, we are going to ask for 2200 MB physical memory and 2900 MB virtual + physical memory.

#!/bin/bash
## Name of the job - unnecessary, but useful 
## to help find the job in a long list 
#PBS -N Parallel
 
## Names of output and error files. You can call them what you will. If you 
## don't give them a name, the output, respectively the error file, will be called 
## <job script name>.o<jobid> and <job script name>.e<jobid>. 
## 
#PBS -o mpi_hello.out
#PBS -e mpi_hello.err
#PBS -m ae

## Asking for 2 nodes and 2 processors per node. Note: Using nodes=1:ppn=8
## reserves a full node on akka. Runtime (walltime) in this example is
## 4 minutes max.
#PBS -l nodes=2:ppn=2
#PBS -l walltime=00:04:00
 
## Memory requirements: physical memory (pmem) at least 2200 MB and
## virtual + physical memory (pvmem) at least 2900 MB
#PBS -l pmem=2200mb
#PBS -l pvmem=2900mb

## Change to the directory you submitted from and load MPI module (use 
## same as compiled with) 
cd $PBS_O_WORKDIR
module add openmpi/psc
 
## The option -pernode makes sure only one process runs per node. Can be
## good if each process wants to run several threads on the node - then a
## thread could run on each processor on the node
mpiexec -pernode ./mpi_hello

Submit the job script (mpi_submit):

$ qsub mpi_submit
617652.p-mn01.hpc2n.umu.se
$ 

You can now check the status of the job:

$ checkjob 617652


checking job 617652

State: Running
Creds:  user:<username>  group:folk  account:DEFAULT  class:batch  qos:DEFAULT
WallTime: 00:00:00 of 00:04:00
SubmitTime: Tue Mar 16 11:29:55
  (Time Queued  Total: 00:02:09  Eligible: 00:02:09)

StartTime: Tue Mar 16 11:32:04
Total Tasks: 4

Req[0]  TaskCount: 4  Partition: DEFAULT
Network: [NONE]  Memory >= 2200M  Disk >= 0  Swap >= 2900M
Opsys: [NONE]  Arch: [NONE]  Features: [NONE]
Dedicated Resources Per Task: PROCS: 1  MEM: 2200M  SWAP: 2900M
Allocated Nodes:
[p-bc1110:2][p-bc1105:2]


IWD: [NONE]  Executable:  [NONE]
Bypass: 0  StartCount: 1
PartitionMask: [ALL]
Flags:       RESTARTABLE PREEMPTOR

Reservation '617652' (00:00:00 -> 00:04:00  Duration: 00:04:00)
PE:  4.41  StartPriority:  224434

$ 
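If you only want the state of the job, the State: line can be picked out of the checkjob output. A minimal sketch, assuming the output format shown above; job_state is a hypothetical helper name.

```shell
# job_state: hypothetical helper that reads checkjob output on standard
# input and prints the value of the "State:" line.
job_state() {
    awk '/^State:/ { print $2 }'
}

# Sample input taken from the output above. On the cluster you would use:
# checkjob 617652 | job_state
printf 'checking job 617652\n\nState: Running\nCreds:  user:<username>\n' | job_state
```

The same helper could be used in a loop that waits until the state is no longer Running, with a sleep between checks so the batch system is not queried too often.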

or

$ showq -u <username>
ACTIVE JOBS--------------------
JOBNAME            USERNAME      STATE  PROC   REMAINING            STARTTIME

617652             <username>    Running     4    00:04:00  Tue Mar 16 11:32:04

     1 Active Job     4498 of 5336 Processors Active (84.30%)
                       667 of  667 Nodes Active      (100.00%)

IDLE JOBS----------------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


0 Idle Jobs

BLOCKED JOBS----------------
JOBNAME            USERNAME      STATE  PROC     WCLIMIT            QUEUETIME


Total Jobs: 1   Active Jobs: 1   Idle Jobs: 0   Blocked Jobs: 0
$ 

It is running. We asked for 4 minutes, and 4 minutes of walltime remain. Since it really was a very short program, we could have asked for just 30 seconds and that would have been enough.

The job may sometimes sit in the queue for a while before it starts running, so the time from submission to completion may well be a good deal longer than the walltime you asked for.

When the job has completed, you have new files in your directory:

$ ls
mpi_hello.err  mpi_hello.out  mpi_hello*  mpi_hello.f90  mpi_submit
$ 

mpi_hello.err contains any errors that were generated by the job and mpi_hello.out contains the output of your job:

$ cat mpi_hello.err
$ cat mpi_hello.out
hello_parallel.f: Number of tasks=  2 My rank=  0 My name=p-bc1110.hpc2n.umu.se
hello_parallel.f: Number of tasks=  2 My rank=  1 My name=p-bc1105.hpc2n.umu.se
$ 
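The check above, looking at the error file first and then at the output, can be wrapped in a small helper. This is only a sketch with a hypothetical name, check_job_files. Note that an empty error file does not by itself prove the job did what you wanted.

```shell
# check_job_files: hypothetical helper. If the error file is non-empty it
# prints it and returns 1; otherwise it prints the output file.
check_job_files() {
    err="$1"; out="$2"
    if [ -s "$err" ]; then
        echo "errors reported in $err:" >&2
        cat "$err" >&2
        return 1
    fi
    cat "$out"
}

# Usage in the job directory (commented out; the files exist only after a run):
# check_job_files mpi_hello.err mpi_hello.out
```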

8.3 Compiling and running a Fortran 90 program that uses Lapack

In this example we are going to compile a Fortran 90 program that uses Lapack. We are going to run on akka and we are going to use the PathScale compilers. We will submit the program as a batch job.

$ ssh akka
$ cd pfs/
$ mkdir my_testdir
$ cd my_testdir
$ cp /home/u/user123/*.f90 .

Then we load the PathScale compilers and compile the Fortran 90 program. Note that -lgoto is the Goto BLAS library, which is needed for Lapack. It is linked twice to get the maximum speed out of it by using the Lapack functions that are built into gotoblas too. Remember to also load the modules lapack/psc and gotoblas2/psc.

$ module load psc
$ module load lapack/psc
$ module load gotoblas2/psc
$ pathf90 -lgoto -llapack -lgoto lapack_test.f90 -o lapack_test

We are now ready to run the program. First we need a job script. This will work nicely (our Fortran 90 program that uses Lapack is called lapack_test):

#!/bin/bash
## Name of the job - unnecessary, but useful 
## to help find the job in a long list 
#PBS -N Lapack_Test_Run
 
## Names of output and error files. You can call them what you will. If you 
## don't give them a name, the output, respectively the error file, will be called 
## <job script name>.o<jobid> and <job script name>.e<jobid>. 
## 
#PBS -o Lapack_test.out
#PBS -e Lapack_test.err
#PBS -m ae

## asking for 1 node and 1 processor. Note: Using nodes=1:ppn=8 reserves
## a full node on akka. Runtime (walltime) in this example is 4 minutes 
## max. I only ask for 1 node and processor since this is a serial job. 
#PBS -l nodes=1:ppn=1
#PBS -l walltime=00:04:00
 
## Memory requirements: physical memory (pmem) at least 2200 MB and
## virtual + physical memory (pvmem) at least 2900 MB
#PBS -l pmem=2200mb
#PBS -l pvmem=2900mb

## Change to the directory you submitted from and load the modules that
## were used when compiling
cd $PBS_O_WORKDIR
module add psc
module add lapack/psc
module add gotoblas2/psc 
 
./lapack_test

Then I submit it with (I called my script lapack.submit):

$ qsub lapack.submit 

9. More local information