HDF5

Policy

HDF5 is freely available to users at HPC2N.

General

HDF5 is a data model, library, and file format for storing and managing data.

Description

HDF5 supports an unlimited variety of datatypes and is designed for flexible, efficient I/O and for high-volume, complex data. HDF5 is portable and extensible, allowing applications to evolve in their use of HDF5. The HDF5 technology suite includes tools and applications for managing, manipulating, viewing, and analyzing data in the HDF5 format.

The HDF5 technology suite includes:

  • A versatile data model that can represent very complex data objects and a wide variety of metadata.
  • A completely portable file format with no limit on the number or size of data objects in the collection.
  • A software library that runs on a range of computational platforms, from laptops to massively parallel systems, and implements a high-level API with C, C++, Fortran 90, and Java interfaces.
  • A rich set of integrated performance features that allow for access time and storage space optimizations.
  • Tools and applications for managing, manipulating, viewing, and analyzing the data in the collection.

Availability

At HPC2N, HDF5 is available as a module on Akka and Abisko.

Usage at HPC2N

To use the HDF5 module, add it to your environment:

module add hdf5/<compiler>

Here <compiler> is one of intel | pgi | psc | gcc. You must specify the compiler; loading the module also loads that compiler. Several versions of HDF5 are installed, and you can list them with the command

module avail hdf5

Loading the module should set any needed environment variables as well as the path.

There are example programs in the examples directory of the HDF5 installation. They can be found here: /afs/hpc2n.umu.se/lap/hdf5/1.8.7/src/hdf5-1.8.7/dist/examples (or similar for other versions).

HDF5 is used by adding calls to your program, depending on whether you want to create a new data file, read from an existing data file, or write to an existing data file.

Remember to include the HDF5 header file (or, in Fortran, use the HDF5 module) at the top of your program:

#include "hdf5.h"  | C
#include "H5Cpp.h" | C++
USE HDF5           | Fortran

File Access Modes

  • H5Fcreate accepts H5F_ACC_EXCL or H5F_ACC_TRUNC
  • H5Fopen accepts H5F_ACC_RDONLY or H5F_ACC_RDWR
  • H5F_ACC_EXCL If the file already exists, H5Fcreate fails. If the file does not exist, it is created and opened with read-write access. (Default)
  • H5F_ACC_TRUNC If the file already exists, the file is opened with read-write access, and new data will overwrite any existing data. If the file does not exist, it is created and opened with read-write access.
  • H5F_ACC_RDONLY An existing file is opened with read-only access. If the file does not exist, H5Fopen fails. (Default)
  • H5F_ACC_RDWR An existing file is opened with read-write access. If the file does not exist, H5Fopen fails.

Creating a file

  • Define the file creation property list
  • Define the file access property list
  • Create the file
  • More information here about creating a file, opening an existing file, and closing a file.

Examples from HDF homepage

Creating an HDF5 file using property list defaults

file_id = H5Fcreate ("SampleFile.h5", H5F_ACC_EXCL,
    H5P_DEFAULT, H5P_DEFAULT);

Creating an HDF5 file using property lists

fcplist_id = H5Pcreate (H5P_FILE_CREATE);
  /* ...set desired file creation properties... */
faplist_id = H5Pcreate (H5P_FILE_ACCESS);
  /* ...set desired file access properties... */
file_id = H5Fcreate ("SampleFile.h5", H5F_ACC_EXCL, fcplist_id, faplist_id);

Opening an HDF5 file (read-only access)

faplist_id = H5Pcreate (H5P_FILE_ACCESS);
status = H5Pset_fapl_stdio (faplist_id);
file_id = H5Fopen ("SampleFile.h5", H5F_ACC_RDONLY, faplist_id);

Closing an HDF5 file

status = H5Fclose (file_id);

Viewing a file with h5dump

Included with the HDF5 distribution is a command-line utility called h5dump, a program for inspecting the contents of an HDF5 file. It displays ASCII output formatted according to the HDF5 DDL grammar.

Displaying the content of SampleFile.h5:

h5dump SampleFile.h5

This is how a newly created file looks before any groups or datasets have been created and before any data has been written:

    HDF5 "SampleFile.h5" {
    GROUP "/" {
    }
    }

You can read more about h5dump, and about the DDL grammar it uses for its output, in the HDF5 documentation.

File Function Summaries

File Property Lists

Additional information regarding file structure and access is passed to H5Fcreate and H5Fopen through property list objects. Property lists provide a portable and extensible method of modifying file properties via simple API functions. There are two kinds of file-related property lists:

  • File creation property lists
  • File access property lists

You can read more about file property lists in the HDF5 User Guide.

Code Examples

The following examples (for version 1.8.X) are taken from the HDF5 User Guide - Code Examples.

Compiling

First, you must load the HDF5 module. This is done with

module load hdf5/<compiler>/<version>

You must specify compiler and version. You can see a list with

module avail hdf5

Then, you compile with

  • h5cc hdf5_program.c (for C programs)
  • h5c++ hdf5_program.cpp (for C++ programs)
  • h5fc program.f90 (for Fortran 90 programs)

You can get more help with this by typing

module help hdf5/<compiler>/<version>

Examples, compiling, running

HDF5, version 1.8, Pathscale compiler.

C (h5ex_d_chunk.c - get from download link above)

p-bc9901 [~]$ module load hdf5/psc/1.8
p-bc9901 [~]$ h5cc h5ex_d_chunk.c -o h5ex_d_chunk
p-bc9901 [~]$ ./h5ex_d_chunk
Original Data:
 [   1   1   1   1   1   1   1   1]
 [   1   1   1   1   1   1   1   1]
 [   1   1   1   1   1   1   1   1]
 [   1   1   1   1   1   1   1   1]
 [   1   1   1   1   1   1   1   1]
 [   1   1   1   1   1   1   1   1]

Storage layout for DS1 is: H5D_CHUNKED

Data as written to disk by hyberslabs:
 [   0   1   0   0   1   0   0   1]
 [   1   1   0   1   1   0   1   1]
 [   0   0   0   0   0   0   0   0]
 [   0   1   0   0   1   0   0   1]
 [   1   1   0   1   1   0   1   1]
 [   0   0   0   0   0   0   0   0]

Data as read from disk by hyperslab:
 [   0   1   0   0   0   0   0   1]
 [   0   1   0   1   0   0   1   1]
 [   0   0   0   0   0   0   0   0]
 [   0   0   0   0   0   0   0   0]
 [   0   1   0   1   0   0   1   1]
 [   0   0   0   0   0   0   0   0]
p-bc9901 [~]$

Fortran 90 (h5ex_d_chunk.f90 - get from download link above)

p-bc9901 [~]$ module load hdf5/psc/1.8
p-bc9901 [~]$ h5fc h5ex_d_chunk.f90 -o h5ex_d_chunk
p-bc9901 [~]$ ./h5ex_d_chunk 

Original Data:
 [  1  1  1  1  1  1  1  1 ]
 [  1  1  1  1  1  1  1  1 ]
 [  1  1  1  1  1  1  1  1 ]
 [  1  1  1  1  1  1  1  1 ]
 [  1  1  1  1  1  1  1  1 ]
 [  1  1  1  1  1  1  1  1 ]

Storage layout for DS1 is: H5D_CHUNKED

Data as written to disk by hyberslabs:
 [  0  1  0  0  1  0  0  1 ]
 [  1  1  0  1  1  0  1  1 ]
 [  0  0  0  0  0  0  0  0 ]
 [  0  1  0  0  1  0  0  1 ]
 [  1  1  0  1  1  0  1  1 ]
 [  0  0  0  0  0  0  0  0 ]

Data as read from disk by hyperslab:
 [  0  1  0  0  0  0  0  1 ]
 [  0  1  0  1  0  0  1  1 ]
 [  0  0  0  0  0  0  0  0 ]
 [  0  0  0  0  0  0  0  0 ]
 [  0  1  0  1  0  0  1  1 ]
 [  0  0  0  0  0  0  0  0 ]
p-bc9901 [~]$ 

Additional info

You can find more information at the following locations: