GROMACS

Software name: 
GROMACS
Policy 

GROMACS is available to users at HPC2N under the condition that published work includes citation of the program. GROMACS is Free Software, available under the GNU General Public License.

General 

GROMACS (GROningen MAchine for Chemical Simulations) is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

Description 

GROMACS was first developed in Herman Berendsen's group at the Department of Biophysical Chemistry of the University of Groningen. It is a team effort, with contributions from several current and former developers all over the world.

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

It is primarily designed for biochemical molecules like proteins, lipids and nucleic acids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations) many groups are also using it for research on non-biological systems, e.g. polymers.

GROMACS supports all the usual algorithms you expect from a modern molecular dynamics implementation.

Availability 

On HPC2N we have GROMACS available as a module on Abisko and Kebnekaise.

On Kebnekaise, some of the versions of GROMACS have suffixes:

  • -hybrid means it is built with OpenMP/MPI
  • -mt means it is built with OpenMP, but not MPI

Newer versions (2016.2 and forward) are built with both by default.

Usage at HPC2N 

To use the gromacs module, add it to your environment. Use:

module spider gromacs

or

ml spider gromacs

to see which versions are available, as well as how to load the module and the needed prerequisites. There are several versions. 

Loading the module should set all the needed environmental variables as well as the path. 

Note that while case does not matter when you use "ml spider", you must match the case exactly when loading the modules.

You can read more about loading modules on our Accessing software with Lmod page and our Using modules (Lmod) page.

Gromacs on GPUs (Kebnekaise)

In order to access the GPU-aware version of Gromacs, you need to load one of the versions compiled with CUDA, together with the set of listed prerequisites that includes CUDA.

Example

$ ml spider GROMACS/2016-hybrid

---------------------------------------------------------------------------------------
  GROMACS: GROMACS/2016-hybrid
---------------------------------------------------------------------------------------
    Description:
      GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the
      Newtonian equations of motion for systems with hundreds to millions of
      particles. - Homepage: http://www.gromacs.org

    You will need to load all module(s) on any one of the lines below before the "GROMACS/2016-hybrid" module is available to load.

      GCC/5.4.0-2.26  CUDA/8.0.44  OpenMPI/2.0.1
      GCC/5.4.0-2.26  OpenMPI/1.10.3
      GCC/6.2.0-2.27  OpenMPI/2.0.1
 
    Help:
      GROMACS is a versatile package to perform molecular dynamics,
       i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles. - Homepage: http://www.gromacs.org

You then need to load

ml GCC/5.4.0-2.26
ml CUDA/8.0.44
ml OpenMPI/2.0.1
ml GROMACS/2016-hybrid

Setup and running

There are some differences between how you run Gromacs versions 4.x and newer versions. The focus in this documentation is on newer versions (mainly version 2016.x). You can find manuals for older versions here: ftp://ftp.gromacs.org/pub/manual/

When you have loaded Gromacs and its prerequisites, you can find the executables etc. under the directory pointed to by the environment variable $EBROOTGROMACS.
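For example, you can inspect what the module provides like this (a sketch; the exact contents depend on the version loaded):

```shell
# $EBROOTGROMACS is set by the module system (EasyBuild) when the
# GROMACS module is loaded. List the installation and its executables:
ls $EBROOTGROMACS
ls $EBROOTGROMACS/bin
```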

To get some information about the particular version of Gromacs, run

gmx -version

To run Gromacs, you first need to prepare various input files (.gro and .pdb for molecular structure, .top for topology and the main parameters file, .mdp).

The following steps are needed to do your setup (adapted from the Gromacs homepage's Getting started guide):

1) The molecular topology file is generated by the program

gmx pdb2gmx

gmx pdb2gmx translates a pdb structure file of any peptide or protein to a molecular topology file. This topology file contains a complete description of all the interactions in your peptide or protein.

2) When gmx pdb2gmx is executed to generate a molecular topology, it also translates the structure file (pdb file) to a GROMOS structure file (gro file). The main difference between a pdb file and a gro file is their format, and that a gro file can also hold velocities. However, if you do not need the velocities, you can also use a pdb file in all programs.
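As a sketch, a typical invocation of this step might look like the following; the input file name and the choice of water model are illustrative placeholders, not prescriptions:

```shell
# Generate a topology file (topol.top) and a gro structure file from a
# pdb file. "protein.pdb" is a placeholder for your own structure file.
gmx pdb2gmx -f protein.pdb -o protein.gro -p topol.top -water spce
```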

3) To generate a box of solvent molecules around the peptide, the program

gmx solvate

is used. First the program

gmx editconf

should be used to define a box of appropriate size around the molecule. gmx solvate solvates a solute molecule (the peptide) into any solvent. The output of gmx solvate is a gromos structure file of the peptide solvated. gmx solvate also changes the molecular topology file (generated by gmx pdb2gmx) to add solvent to the topology.
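Continuing the sketch above (file names are illustrative placeholders):

```shell
# Define a cubic box with at least 1.0 nm between the solute and the box edge
gmx editconf -f protein.gro -o boxed.gro -c -d 1.0 -bt cubic

# Fill the box with water (spc216.gro is a pre-equilibrated water box
# shipped with GROMACS) and update the topology file accordingly
gmx solvate -cp boxed.gro -cs spc216.gro -o solvated.gro -p topol.top
```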

4) The Molecular Dynamics Parameter (mdp) file contains all information about the Molecular Dynamics simulation itself e.g. time-step, number of steps, temperature, pressure etc. The easiest way of handling such a file is by adapting a sample mdp file. A sample mdp file is available from the Gromacs homepage.
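As an illustration, a minimal mdp fragment might contain entries like these (the values are placeholders to adapt to your system, not recommendations):

```
integrator    = md        ; leap-frog integrator
dt            = 0.001     ; time step in ps (1 fs)
nsteps        = 50000     ; number of MD steps
cutoff-scheme = Verlet    ; required for OpenMP/GPU runs
coulombtype   = PME       ; particle mesh Ewald for long-range electrostatics
rcoulomb      = 1.2       ; real-space Coulomb cutoff (nm)
rvdw          = 1.2       ; van der Waals cutoff (nm)
tcoupl        = V-rescale ; temperature coupling algorithm
tc-grps       = System    ; temperature coupling group(s)
tau-t         = 0.1       ; coupling time constant (ps)
ref-t         = 300       ; reference temperature (K)
```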

5) The next step is to combine the molecular structure (gro file), topology (top file), MD parameters (mdp file) and (optionally) the index file (ndx file) to generate a run input file (tpr extension). This file contains all information needed to start a simulation with GROMACS. The

gmx grompp

program processes all input files and generates the run input tpr file.
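Continuing the sketch (file names are illustrative placeholders):

```shell
# Combine structure, topology and MD parameters into a run input file
gmx grompp -f md.mdp -c solvated.gro -p topol.top -o md_0.tpr
```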

6) Once the run input file is available, we can start the simulation. The program which starts the simulation is called

gmx mdrun

or

gmx_mpi mdrun

The only input file of gmx mdrun that you usually need in order to start a run is the run input file (tpr file). The typical output files of gmx mdrun are the trajectory file (trr file), a logfile (log file), and perhaps a checkpoint file (cpt file).
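A minimal interactive run, for testing only (production runs should go through the batch system, as in the submit file examples below), could look like:

```shell
# -deffnm md_0 makes mdrun use md_0 as the default prefix for all
# input/output files, i.e. it reads md_0.tpr and writes md_0.trr,
# md_0.log and so on
gmx mdrun -deffnm md_0
```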

Submit file examples

Below are some examples of how to run Gromacs (gmx mdrun) jobs on Abisko or Kebnekaise. The main differences between the clusters are the number of cores per node and the amount of memory per node.

Note that in either case, you must run from the parallel file system.

When using gmx mdrun (and gmx_mpi mdrun) it is important to specify the -ntomp option. If you do not, gmx(_mpi) mdrun will try to use all the cores on the node by adding multiple OpenMP threads to each (MPI) task. If the batch job does not have the whole node allocated (using --exclusive, --ntasks-per-node=48 on Abisko (28 on standard Kebnekaise nodes), or other means) this will result in overallocation of the cores and severely reduced performance.

Single node using all cores

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# Ask for a full node
#SBATCH -N 1
#SBATCH -n 28
#SBATCH --exclusive

# Load the module for gromacs and its prerequisites.
# This is for GROMACS/2016-hybrid on Kebnekaise, with no GPUs
ml GCC/6.2.0-2.27
ml OpenMPI/2.0.1
ml GROMACS/2016-hybrid

srun -n $SLURM_CPUS_ON_NODE gmx mdrun -ntomp 1 -deffnm md_0

Single node using 6 cores on Abisko

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# Ask for 6 cores
#SBATCH -c 6

# Load the modules for gromacs and its prerequisites.
ml GCC/6.3.0-2.27
ml OpenMPI/2.0.2
ml GROMACS/2016.3

srun gmx mdrun -ntmpi $SLURM_CPUS_ON_NODE -ntomp 1 -deffnm md_0

Using more than one node, MPI only based mdrun, on Abisko

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# Ask for 192 tasks for use by MPI
#SBATCH -n 192

# Load the modules for gromacs and its prerequisites.
ml GCC/6.3.0-2.27
ml OpenMPI/2.0.2
ml GROMACS/2016.3 

# Automatic selection of options to gmx_mpi mdrun depending on parameters given to SBATCH
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    md="gmx_mpi mdrun -ntomp $SLURM_CPUS_PER_TASK"
else
    md="gmx_mpi mdrun -ntomp 1"
fi

srun $md -deffnm md_0

Using more than one node, MPI and OpenMP, on Abisko

NOTE: This can only be used with the Verlet cut-off scheme.
Here is an example where -ntomp will be set to something other than "1" by using the "-c" parameter to SBATCH, requesting multiple cores per MPI task (-n).
In this case the environment variable "SLURM_CPUS_PER_TASK" will be set to 6 (#SBATCH -c 6).

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Asking for 30 hours walltime
#SBATCH -t 30:00:00
# Ask for 32 tasks for use by MPI
#SBATCH -n 32
# Ask for 6 cores per MPI-task
#SBATCH -c 6

# Load the modules for gromacs and its prerequisites.
ml GCC/6.3.0-2.27
ml OpenMPI/2.0.2
ml GROMACS/2016.3 

# Automatic selection of options to gmx_mpi mdrun depending on parameters given to SBATCH
if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    md="gmx_mpi mdrun -ntomp $SLURM_CPUS_PER_TASK"
else
    md="gmx_mpi mdrun -ntomp 1"
fi

srun $md -deffnm md_0

Using MPI and GPUs, on Kebnekaise

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Name of the job
#SBATCH -J Gromacs-gpu-job
# Asking for one hour of walltime
#SBATCH -t 01:00:00
# Ask for 4 mpi processes
#SBATCH -n 4
# Ask for 7 threads
#SBATCH -c 7
# Remember the total number of cores = 28 (kebnekaise) so that
# n x c = 28 (running on a single node)
# Asking for 2 GPUs
#SBATCH --gres=gpu:k80:2
#SBATCH -p batch

# It is always best to do a ml purge before loading other modules
ml purge

ml GCC/5.4.0-2.26  
ml CUDA/8.0.44
ml OpenMPI/2.0.1
ml GROMACS/2016-hybrid

if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    mdargs="-ntomp $SLURM_CPUS_PER_TASK"
else
    mdargs="-ntomp 1"
fi

#The number of OpenMP threads should be equal to the number of cores/process (c)
export OMP_NUM_THREADS=$SLURM_CPUS_PER_TASK
srun -n $SLURM_NTASKS gmx_mpi mdrun $mdargs -dlb yes  -v -deffnm <scriptname>

 

Using MPI and KNLs, on Kebnekaise

#!/bin/bash
# Change to your actual SNIC project number
#SBATCH -A SNICXXXX-YY-ZZ
# Name of the job
#SBATCH -J Gromacs-knl-job
# Asking for one hour of walltime
#SBATCH -t 01:00:00
# Ask for 68 mpi processes
#SBATCH -n 68
# Ask for 4 threads per core
#SBATCH --threads-per-core=4
# Ask for memory
#SBATCH --constraint=hemi
# Ask for knl queue
#SBATCH -p knl

# It is always best to do a ml purge before loading other modules
ml purge

ml GCC/6.3.0-2.27  OpenMPI/2.0.2
ml GROMACS/2016.3

if [ -n "$SLURM_CPUS_PER_TASK" ]; then
    mdargs="-ntomp $SLURM_CPUS_PER_TASK"
else
    mdargs="-ntomp 1"
fi

# The number of MPI processes is equal to n x threads per core =
# 68 x 4 = 272 in this example
gmx mdrun -ntmpi 272 $mdargs -v -deffnm <scriptname>

Submit the script with

sbatch <scriptname>

In the case of KNL nodes we obtained the best performance by using Hemisphere cluster mode (hemi keyword above).

More information about batch scripts is available on our batch system pages.

A comparison of runs on the various types of nodes on Kebnekaise is displayed below. We evaluated the performance of different GROMACS implementations including MPI-only (with 28 cores), KNL (with 272 MPI processes), and GPU (with 4 MPI processes and 7 OpenMP threads). The figure below shows the best performance of GROMACS on a single node obtained by varying the input parameters (number of MPI processes and number of OpenMP threads). The benchmark case consisted of 158944 particles, using a 1 fs time step and a cutoff of 1.2 nm for real-space electrostatics calculations. Particle mesh Ewald was used to solve long-range electrostatic interactions.

[Figure: profiling_gromacs.png — single-node GROMACS performance on Kebnekaise MPI-only, KNL and GPU nodes]

Additional info 

Documentation is available on the Gromacs documentation page and the Gromacs Online Reference page.

Citations (Principal Papers)

  1. Berendsen, et al. (1995) Comp. Phys. Comm. 91: 43-56. (DOI, Citations of this paper)
  2. Lindahl, et al. (2001) J. Mol. Model. 7: 306-317. (DOI, Citations of this paper)
  3. van der Spoel, et al. (2005) J. Comput. Chem. 26: 1701-1718. (DOI, Citations of this paper)
  4. Hess, et al. (2008) J. Chem. Theory Comput. 4: 435-447. (DOI, Citations of this paper)
  5. Pronk, et al. (2013) Bioinformatics 29: 845-854. (DOI, Citations of this paper)
  6. Páll, et al. (2015) Proc. of EASC 2015, LNCS 8759: 3-27. (DOI, arXiv, Citations of this paper)
  7. Abraham, et al. (2015) SoftwareX 1-2: 19-25. (DOI, Citations of this paper)
Updated: 2017-12-14, 12:27