Compiler Flags


[Intel Compilers | GNU Compiler Collection (GCC) | PathScale Compilers | Portland Group (PGI) Compilers]

This page describes some of the most important and commonly used flags for the compilers available at HPC2N. Unless noted otherwise, the flags below are valid for the Fortran and C/C++ compilers alike, and also when compiling with the MPI libraries included. Remember to load the proper modules first; see the page Abisko: Installed compilers or Kebnekaise: Installed compilers for more information.

Intel Compilers

The Intel compilers are installed on Abisko and Kebnekaise. Note that we only have a limited number of licenses, so you may sometimes find that none is available. If that happens, wait a little while and try again.

  • -fast This option maximizes speed across the entire program.  
  • -g Produce symbolic debug information in the object file. The compiler does not support the generation of debugging information in assemblable files: if you specify -g, the resulting object file will contain debugging information, but the assemblable file will not. The -g option changes the default optimization level from -O2 to -O0. It is often a good idea to also add -traceback, so that the compiler generates extra information in the object file to provide source file traceback information.
  • -debug all Enable generation of enhanced debugging information. You must also specify -g.
  • -O0 Disable optimizations. Use if you want to be certain of getting correct code. Otherwise use -O2 for speed.
  • -O Same as -O2
  • -O1 Optimize to favor code size and code locality. Disables loop unrolling. -O1 may improve performance for applications with very large code size, many branches, and execution time not dominated by code within loops. In most cases, -O2 is recommended over -O1.
  • -O2 (default) Optimize for code speed. This is the generally recommended optimization level.
  • -O3 Enable -O2 optimizations and in addition, enable more aggressive optimizations such as loop and memory access transformation, and prefetching. The -O3 option optimizes for maximum speed, but may not improve performance for some programs. The -O3 optimizations may slow down code in some cases compared to -O2 optimizations. Recommended for applications that have loops with heavy use of floating point calculations and process large data sets.
  • -Os Enable speed optimizations, but disable some optimizations that increase code size for small speed benefit.
  • -fpe{0,1,3} Allows some control over floating-point exception (divide by zero, overflow, invalid operation, underflow, denormalized number, positive infinity, negative infinity or a NaN) handling for the main program at runtime. Fortran only. Default is -fpe3 meaning all floating-point exceptions are disabled and floating-point underflow is gradual, unless you explicitly specify a compiler option that enables flush-to-zero. The default value may slow runtime performance.
  • -qopenmp Enable the parallelizer to generate multi-threaded code based on the OpenMP directives. The code can be executed in parallel on both uniprocessor and multiprocessor systems.
  • -parallel Enable the auto-parallelizer to generate multi-threaded code for loops that can be safely executed in parallel. The auto-parallelizer is only active if -O2 or -O3 is also in effect (-O2 is the default). To use this option you might need to set the KMP_STACKSIZE environment variable to an appropriately large size, such as 16m.
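
As an illustration of how the flags above combine, here is a minimal sketch. The helper name intel_cmd and the source file prog.f90 are hypothetical; the helper only prints an ifort command line rather than running it, so it can be tried on a machine without the Intel compilers or a free license. The flag combinations are one reasonable reading of the list above, not a site recommendation.

```shell
# Sketch only: intel_cmd and prog.f90 are hypothetical names.
# Prints an ifort command line for a given build mode; nothing is compiled.
intel_cmd() {
    mode="$1"   # one of: debug, optimize, openmp, parallel
    src="$2"    # Fortran source file
    case "$mode" in
        debug)    flags="-g -traceback -O0" ;;  # debug info plus source traceback
        optimize) flags="-O2" ;;                # generally recommended level
        openmp)   flags="-O2 -qopenmp" ;;       # honour OpenMP directives
        parallel) flags="-O2 -parallel" ;;      # auto-parallelize safe loops
        *) echo "unknown mode: $mode" >&2; return 1 ;;
    esac
    # ${src%.*} strips the file extension to form the executable name.
    echo "ifort $flags -o ${src%.*} $src"
}

# -parallel may need a larger thread stack; 16m is the size mentioned above.
export KMP_STACKSIZE=16m

intel_cmd debug prog.f90
intel_cmd parallel prog.f90
```

Remove the echo (or copy the printed line) to actually compile, after loading the intel module.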

To read about other flags, and for further information, see the man pages. They can be accessed like this:

$ module load intel
$ man ifort
$ man icc 


GNU Compiler Collection (GCC)

  • -o file Place output in file 'file'.
  • -c Compile or assemble the source files, but do not link.
  • -mfma Use FMA4 instructions. Only valid on Abisko. It is enabled by default and can be disabled with -mno-fma.
  • -fopenmp Enable handling of the OpenMP directives.
  • -g Produce debugging information in the operating system's native format.
  • -O or -O1 Optimize. The compiler tries to reduce code size and execution time.
  • -O2 Optimize even more. GCC performs nearly all supported optimizations that do not involve a space-speed tradeoff. 
  • -O3 Optimize even more. The compiler will also do loop unrolling and function inlining. RECOMMENDED
  • -O0 Do not optimize. This is the default.
  • -Os Optimize for size.
  • -Ofast Disregard strict standards compliance. -Ofast enables all -O3 optimizations. It also enables optimizations that are not valid for all standard-compliant programs. It turns on -ffast-math and the Fortran-specific -fno-protect-parens and -fstack-arrays.
  • -ffast-math Sets the options -fno-math-errno, -funsafe-math-optimizations, -ffinite-math-only, -fno-rounding-math, -fno-signaling-nans and -fcx-limited-range.
  • -l library Search the library named 'library' when linking.
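
The -c, -o and -l flags are typically used together when compiling and linking in separate steps. The following is a minimal sketch of that workflow; the helper name gcc_two_step and the file names main.c and prog are hypothetical, and the helper prints the two gcc command lines instead of running them.

```shell
# Sketch only: gcc_two_step, main.c and prog are hypothetical names.
# Prints the gcc commands for a separate compile step and link step.
gcc_two_step() {
    src="$1"            # C source file, for example main.c
    exe="$2"            # name of the executable to produce
    lib="$3"            # library to link against, for example m for libm
    obj="${src%.c}.o"   # object file name that gcc -c writes by default
    echo "gcc -O3 -c $src"           # compile only, do not link
    echo "gcc -o $exe $obj -l$lib"   # link the object file with the library
}

gcc_two_step main.c prog m
```

The same pattern applies to gfortran and g++; only the driver name and the source file extension change.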

To read about other flags, and for further information, see the man pages. They can be accessed by first loading the module (see Abisko: Installed Compilers or Kebnekaise: Installed Compilers) and then running one of:

$ man gfortran
$ man gcc
$ man g++


PathScale Compilers (only on Abisko)

  • -apo Invoke autoparallelization.
  • -g Specify debugging support and indicate the level of information produced by the compiler.
  • -mfma Use FMA4 instructions. Only valid on Abisko. It is enabled by default and can be disabled with -mno-fma.
  • -mp Include OpenMP directives. To autoparallelize parts of the code that do not contain OpenMP directives, also specify -apo. Note that -apo should come before -mp when you specify them. IMPORTANT: When mixing OpenMP and MPI, you have to set the environment variable PSC_OMP_AFFINITY=FALSE in the submit file to get the expected behaviour.
  • -O[n] Specify the level of optimization desired. n can be one of the following:
    • 0 Turn off all optimizations.
    • 1 Turn on local optimizations that can be done quickly.
    • 2 Turn on extensive optimization. This is the default. The optimizations done are generally conservative, in the sense that they are virtually always beneficial, provide improvements commensurate to the compile time spent to achieve them, and avoid changes which affect such things as floating point accuracy. RECOMMENDED
    • 3 Turn on aggressive optimization. The optimizations at this level differ from those at -O2 in that they generally seek the highest-quality generated code even if it requires extensive compile time. They may include optimizations that are generally beneficial, but may hurt performance. This includes, but is not limited to, turning on the Loop Nest Optimizer.
    • s Specify that code size is to be given priority in tradeoffs with execution time.
    • If no value is selected, 2 is assumed.
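
The ordering rule for -apo and -mp, and the environment variable needed for hybrid OpenMP and MPI jobs, can be sketched as follows. The helper name pathscale_cmd and the file name prog.c are hypothetical; the helper prints the pathcc command line rather than running it.

```shell
# Sketch only: pathscale_cmd and prog.c are hypothetical names.
# Prints a pathcc command line that keeps -apo ahead of -mp.
pathscale_cmd() {
    src="$1"   # C source file
    echo "pathcc -apo -mp -O2 -o ${src%.*} $src"
}

# For jobs that mix OpenMP and MPI, set this in the submit file
# before the program is launched.
export PSC_OMP_AFFINITY=FALSE

pathscale_cmd prog.c
```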

To read about other flags, and for further information, see the man pages. They can be accessed like this (remember to load the module first):

$ module load psc 
$ man pathf95
$ man pathcc
$ man pathCC

A complete list of available options can be found in the eko man page:

$ module load psc 
$ man eko


Portland Group (PGI) Compilers

  • -fast Chooses generally optimal flags for the target platform. This sets the optimization level to a minimum of 2.
  • -fastsse Enables a set of optimizations that vectorizes loops and uses Streaming SIMD Extensions (SSE/SSE2), which utilize the Opteron's eight 128-bit registers. This usually produces faster code.
  • -g Generate symbolic debug information. This also sets the optimization level to zero, unless a -O switch is present on the command line. Symbolic debugging may give confusing results if an optimization other than zero is selected.
  • -Mfma (default -Mnofma) Generate (don't generate) fused multiply-add (FMA) instructions for targets that support it. FMA instructions are generally faster than separate multiply-add instructions, and can generate higher precision results since the multiply result is not rounded before the addition. However, because of this, the result may be different than the unfused multiply and add instructions. FMA instructions are enabled with higher optimization levels.
  • -mp Include OpenMP directives.
  • -O[level] Set the optimization level. If -O is not specified, then the default level is 1 if -g is not specified, and 0 if -g is specified. If a number is not specified with -O, then the optimization level is set to 2. The optimization levels are:
    • 0 A basic block is generated for each statement. No scheduling is done between statements. No global optimizations are done.
    • 1 Scheduling within extended basic blocks is performed. Some register allocation is performed. No global optimizations are performed.
    • 2 All level 1 optimizations are performed. In addition, traditional scalar optimizations, such as induction recognition and loop invariant motion are performed by the global optimizer. RECOMMENDED
    • 3 All level 1 and 2 optimizations are performed. In addition, this level enables more aggressive code hoisting and scalar replacement optimizations that may or may not be profitable.
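
The rules for the default optimization level are easy to misread, so here is a small sketch that encodes them as stated above. The function name pgi_opt_level is hypothetical and this is not how the compiler itself is implemented; it only models -g and -O[level], and deliberately ignores -fast (which raises the level to at least 2).

```shell
# Sketch only: pgi_opt_level is a hypothetical helper, not a PGI tool.
# Prints the optimization level implied by a list of -g / -O[level] flags.
pgi_opt_level() {
    level=""   # explicit level from the last -O switch seen, if any
    debug=0    # becomes 1 when -g is present
    for flag in "$@"; do
        case "$flag" in
            -g)      debug=1 ;;
            -O)      level=2 ;;              # bare -O means level 2
            -O[0-3]) level="${flag#-O}" ;;   # explicit level 0..3
        esac
    done
    if [ -n "$level" ]; then
        echo "$level"    # an explicit -O switch always wins
    elif [ "$debug" -eq 1 ]; then
        echo 0           # -g without -O drops to level 0
    else
        echo 1           # neither -g nor -O: default level 1
    fi
}

pgi_opt_level            # prints 1
pgi_opt_level -g         # prints 0
pgi_opt_level -O         # prints 2
pgi_opt_level -g -O3     # prints 3
```

In practice this means that adding -g to a build line silently disables optimization unless an -O switch is also given.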

To read about other flags, and for further information, see the man pages. Remember to load the module first (see Abisko: Installed Compilers or Kebnekaise: Installed Compilers for how to do that). The man pages can then be accessed like this:

$ man pgf77
$ man pgf90
$ man pgf95
$ man pgcc
$ man pgCC


Updated: 2017-03-22, 15:55