The Decl'HDF5 plugin
WARNING This documentation is a work in progress and does not reflect the full Decl'HDF5 potential.
The Decl'HDF5 plugin reads data from and writes data to HDF5 files in a declarative way. Decl'HDF5 does not support the full HDF5 feature set, but it offers a simple declarative interface to a large subset of it.
Configuration grammar
At its root, the Decl'HDF5 configuration is made of either a single `FILE_DESC` or a list of `FILE_DESC`s.
<tt>FILE_DESC</tt>
A `FILE_DESC` specifies a list of actions to execute in one file. It is a key-value map that contains at least the `file` key; all other keys are optional. The possible keys and their values are as follows:
- `file`: a string that can contain $-expressions and specifies the name of the file this `FILE_DESC` refers to.
- `write`: a `DATA_SECTION` that defaults to an empty one. This `DATA_SECTION` describes the writes to execute.
- `read`: a `DATA_SECTION` that defaults to an empty one. This `DATA_SECTION` describes the reads to execute.
- `on_event`: a string identifying the event on which the whole file is accessed. If not specified, each data is written when it is exposed, and the file is opened and closed every time.
- `when`: a $-expression specifying a default condition to test before executing the reads and writes of this `FILE_DESC`. This can be replaced by a more specific condition inside the `DATA_SECTION`.
- `communicator`: a $-expression referencing an MPI communicator to use for HDF5 parallel synchronized reads and writes. It defaults to `MPI_COMM_SELF`, which stands for sequential writes. In case of data-triggered (vs. event-triggered) reads and writes, this can be replaced inside the `DATA_SECTION`.
- `datasets`: a key-value map associating string keys with PDI types. Each key is the name of a dataset to create in the file on first access, with the type described in the value.
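As an illustration of the grammar above, the following sketch shows a list of two `FILE_DESC`s, one data-triggered and one event-triggered. The data names (`temperature`, `iter`, `rank`), the event name (`checkpoint`), and the file names are hypothetical and only serve the example; they are assumed to be declared elsewhere in the PDI specification tree.

```yaml
plugins:
  decl_hdf5:
    # First FILE_DESC: no on_event, so `temperature` is written each time it
    # is exposed, and the file is opened and closed every time.
    - file: output_${rank}.h5      # hypothetical file name built from a $-expression
      write: [ temperature ]       # hypothetical PDI data name
    # Second FILE_DESC: the whole file is accessed when the (hypothetical)
    # `checkpoint` event is triggered, and only if the condition holds.
    - file: checkpoint.h5
      on_event: checkpoint
      when: "$iter>0"
      write: [ temperature, iter ]
```

Since `communicator` is omitted in both descriptions, the documented default (`MPI_COMM_SELF`, sequential access) applies.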
<tt>DATA_SECTION</tt>
The `DATA_SECTION` describes a set of I/O operations (read or write) to execute. A data section can take multiple forms:
- a list of strings, each being either:
  - the name of a PDI data to write in a dataset with the same name (e.g. `data`, where `data` is both a PDI data and the HDF5 dataset name),
  - the name of a PDI data to write in an attribute of the same name, preceded by a `#` and the name of the dataset or datagroup to which the attribute applies (e.g. `dset#attr`, where `attr` is both a PDI data and the HDF5 attribute name applied to the `dset` dataset);
- a key-value map where each key is the name of a PDI data to write and the value is either a single `DATA_IO_DESC` or a list of `DATA_IO_DESC`s describing the I/O operations (read or write) to execute.
The first case behaves as if each data had its `DATA_IO_DESC` specified with all default values.
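The two forms of a `DATA_SECTION` can be compared side by side in the sketch below. All data, dataset, and file names (`temperature`, `unit`, `pressure`, `velocity`, `fields/...`, `backup/...`) are hypothetical, and the fragment only illustrates the grammar, not a complete working setup.

```yaml
plugins:
  decl_hdf5:
    # List form: every entry uses the default DATA_IO_DESC.
    - file: list_form.h5
      write:
        - temperature        # PDI data `temperature` -> dataset `temperature`
        - temperature#unit   # PDI data `unit` -> attribute `unit` of dataset `temperature`
    # Map form: each key is a PDI data name, each value is one DATA_IO_DESC
    # or a list of DATA_IO_DESCs.
    - file: map_form.h5
      write:
        pressure:
          dataset: fields/pressure     # single DATA_IO_DESC
        velocity:
          - dataset: fields/velocity   # list of DATA_IO_DESCs: the same data
          - dataset: backup/velocity   # is written to two datasets
```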
<tt>DATA_IO_DESC</tt>
A `DATA_IO_DESC` is a key-value map describing one I/O operation (read or write) to execute. All keys are optional and have default values. The possible keys and their values are as follows:
- `dataset`: a $-expression identifying the name of the dataset to access in the file. If not specified, this defaults to the name of the data. On writing, if the dataset does not exist in the file and is not specified in the `FILE_DESC`, then a dataset with the same size as the memory selection is automatically created.
- `attribute`: a $-expression identifying the path of the attribute to access in the file. The path is specified as a `/`-separated path to the object on which the attribute is set (a dataset or a datagroup), followed by a `#` sign and the name of the attribute itself (e.g. `group/dset#attr`). If the `attribute` key is specified, then only the `when` key can also be specified; no other key can be present.
- `size_of`: a $-expression identifying the path of the dataset whose size should be written to the data descriptor.
- `when`: a $-expression specifying a condition to test before executing the I/O operation (read or write). This defaults to the value specified in the `FILE_DESC` if present, or to unconditional I/O otherwise.
- `communicator`: a $-expression referencing an MPI communicator to use for HDF5 parallel synchronized I/O operations (read or write). Specifying a communicator at this level is incompatible with event-triggered (vs. data-triggered) I/O. This defaults to the value specified in the `FILE_DESC` if present, or to sequential (`MPI_COMM_SELF`) I/O otherwise.
- `memory_selection`: a `SELECTION_DESC` specifying the selection of data in memory to read or write. It defaults to selecting the whole data.
- `dataset_selection`: a `SELECTION_DESC` specifying the selection of data in the file data to write or read. This is only valid if the
- `attributes`: a key-value map specifying the set of attributes to read from (respectively, write to) the file when the associated dataset is read (respectively, written). Each key is the name of an attribute of the dataset. Each value is a $-expression (evaluated when the dataset is accessed) specifying the value of the attribute.
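The keys above can be combined as in the following sketch, which is an illustration under assumptions rather than a reference configuration. The data names (`temperature`, `unit`, `iter`, `temperature_size`), the dataset path `fields/temperature`, and the file names are hypothetical; in particular, `temperature_size` is assumed to be declared with a type able to hold the dataset size.

```yaml
plugins:
  decl_hdf5:
    - file: results.h5
      write:
        temperature:
          dataset: fields/temperature   # overrides the default dataset name (`temperature`)
          when: "$iter>0"               # overrides any `when` set in the FILE_DESC
          attributes:
            iteration: $iter            # evaluated when the dataset is accessed
        unit:
          # With `attribute`, only `when` may be specified alongside it.
          attribute: fields/temperature#unit
    - file: input.h5
      read:
        temperature_size:
          size_of: fields/temperature   # the dataset size is written to `temperature_size`
```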
<tt>SELECTION_DESC</tt>
A `SELECTION_DESC` is a key-value map that describes the selection of a subset of data from a larger set. All keys are optional and have default values. The possible keys and their values are as follows:
- `size`: either a single $-expression or a list of $-expressions. It describes the size of the selection in each dimension.
- `start`: either a single $-expression or a list of $-expressions. It describes the number of points to skip at the beginning in each dimension.
Memory selection default values:
- If the `size` is not specified, it defaults to the size of the whole data in each dimension.
- If the `start` is not specified, it defaults to 0 in all dimensions.
Dataset selection default values:
- If the `size` is not specified:
  - if the number of dimensions matches that of the memory, the size defaults to that of the memory,
  - otherwise, the size defaults to the whole dataset.
- If the `start` is not specified, it defaults to 0 in all dimensions.
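To make the default rules concrete, here is a small sketch that writes a 2D field without its one-cell ghost border and relies on the dataset selection defaults. The names `field`, `width`, `height`, and `ghost_free.h5` are hypothetical, and the numbers in the comments assume `width` is 10 and `height` is 8.

```yaml
plugins:
  decl_hdf5:
    file: ghost_free.h5
    write:
      field:                           # assumed declared with size [$width, $height]
        memory_selection:
          size: [$width-2, $height-2]  # 8 x 6 points with the assumed values
          start: [1, 1]                # skip one ghost point in each dimension
        # No `dataset_selection` is given. Because `field` is not listed under
        # `datasets`, the dataset is created with the size of the memory
        # selection (8 x 6). Its number of dimensions then matches that of the
        # memory, so the dataset selection size defaults to 8 x 6 and its
        # start defaults to [0, 0].
```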
Full configuration example
metadata: # small values for which PDI keeps a copy
  width: int # per proc. width including ghost
  height: int # per proc. height including ghost
  pwidth: int # nb proc. in the x dim
  pheight: int # nb proc. in the y dim
  iter: int # current iteration id
  coord: { type: array, subtype: int, size: 2 } # coordinate of the process as [x, y]
data: # values that are not copied by PDI
  main_field:
    type: array
    subtype: double
    size: [$width, $height]
plugins:
  mpi: # loading MPI_Comm predefines (e.g. $MPI_COMM_WORLD)
  decl_hdf5: # a list of files to write to (can be a single element)
    file: data${coord[0]}x${coord[1]}.h5 # the file in which to write the data (required)
    on_event: newiter # the event that triggers these actions (default: trigger on data expose)
    when: "$iter>0 & $iter<11" # a condition when to actually trigger the actions (default: always true)
    communicator: $MPI_COMM_SELF # the MPI communicator used for HDF5 parallel synchronized write (default: $MPI_COMM_SELF, sequential write)
    datasets: # a list of datasets inside the file created on first access
      data/array: # a dataset name, datasets referenced but not defined are created just big enough to fit the data
        type: array
        subtype: double # type of the data in the dataset
        size: [10, $width-2, $height-2] # size of the dataset
    write: # a list or map of data to write (default: empty)
      main_field: # name of the data, it contains either a list or a single write to execute
        - dataset: data/array # a dataset name (default: the data name)
          when: "$iter>0 & $iter<11" # an additional condition when to actually trigger the actions (default: always true)
          communicator: $MPI_COMM_SELF # the MPI communicator used for HDF5 parallel synchronized write (default: that of the file)
          memory_selection:
            size: [$width-2, $height-2] # number of elements to transfer in each dimension (default: size of the full data)
            start: [1, 1] # coordinate of the start point in memory relative to the shared data (default: 0 in each dimension)
          dataset_selection:
            size: [1, $width-2, $height-2] # number of elements to transfer in each dimension, must amount to the same number as the memory selection (default: size of memory slab)
            start: [$iter, 0, 0] # coordinate of the start point in the file relative to the dataset (default: 0 in each dimension)
          attributes:
            size: ($width-2)*($height-2)
            width: $width-2
            height: $height-2
    read: # a list or map of data to read, similar to write (default: empty)
      - another_value