HDF5 2.0.0.2ad0391
API Reference
HDF5 Optimizations APIs (H5DO)

Detailed Description

Bypassing default HDF5 behavior in order to optimize for specific use cases (H5DO)

The HDF5 functions described in this section are implemented in the HDF5 High-level library as optimized functions. They generally require careful setup and testing, because they enable an application to bypass portions of the HDF5 library's I/O pipeline for performance purposes.

These functions are distributed in the standard HDF5 distribution and are available any time the HDF5 High-level library is available.

Functions

H5_HLDLL herr_t H5DOappend (hid_t dset_id, hid_t dxpl_id, unsigned axis, size_t extension, hid_t memtype, const void *buf)
 Appends data to a dataset along a specified dimension.
 
H5_HLDLL herr_t H5DOwrite_chunk (hid_t dset_id, hid_t dxpl_id, uint32_t filters, const hsize_t *offset, size_t data_size, const void *buf)
 Writes a raw data chunk from a buffer directly to a dataset in a file.
 
H5_HLDLL herr_t H5DOread_chunk (hid_t dset_id, hid_t dxpl_id, const hsize_t *offset, uint32_t *filters, void *buf)
 Reads a raw data chunk directly from a dataset in a file into a buffer.
 

Function Documentation

◆ H5DOappend()

H5_HLDLL herr_t H5DOappend ( hid_t  dset_id,
hid_t  dxpl_id,
unsigned  axis,
size_t  extension,
hid_t  memtype,
const void *  buf 
)

Appends data to a dataset along a specified dimension.


Parameters
[in]  dset_id    Dataset identifier
[in]  dxpl_id    Dataset transfer property list identifier
[in]  axis       Dataset dimension (0-based) for the append
[in]  extension  Number of elements to append for the axis-th dimension
[in]  memtype    The memory datatype identifier
[in]  buf        Buffer with data for the append
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.

The H5DOappend() routine extends a dataset by extension elements along the dimension specified by axis and writes the elements in buf to the dataset. The axis value is 0-based. The datatype of the elements in buf is described by memtype.

This routine combines calling H5Dset_extent(), H5Sselect_hyperslab(), and H5Dwrite() into a single routine that simplifies application development for the common case of appending elements to an existing dataset.
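To make the three-call equivalence concrete, the sketch below computes the values an application would otherwise prepare by hand: the new extent for H5Dset_extent() and the start and count arrays for H5Sselect_hyperslab(). It is plain C with no HDF5 calls. The names append_plan, plan_append and MAX_RANK are hypothetical, and uint64_t stands in for hsize_t; none of them are part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RANK 32 /* illustrative upper bound on rank (assumption) */

/* Hypothetical helper type: the region that a single append would write. */
typedef struct {
    uint64_t new_dims[MAX_RANK]; /* extent to pass to H5Dset_extent()       */
    uint64_t start[MAX_RANK];    /* start array for H5Sselect_hyperslab()   */
    uint64_t count[MAX_RANK];    /* count array for H5Sselect_hyperslab()   */
} append_plan;

/* Hypothetical helper: derive the extent and hyperslab for appending
 * `extension` elements along dimension `axis` of a dataset whose current
 * dimension sizes are dims[0..rank-1]. */
append_plan plan_append(const uint64_t *dims, int rank, unsigned axis, size_t extension)
{
    append_plan plan = {{0}, {0}, {0}};

    assert(rank > 0 && rank <= MAX_RANK && axis < (unsigned)rank);

    for (int i = 0; i < rank; i++) {
        plan.new_dims[i] = dims[i]; /* other dimensions keep their size */
        plan.start[i]    = 0;       /* and are covered in full          */
        plan.count[i]    = dims[i];
    }

    /* Along the append axis, grow the extent and select only the new region. */
    plan.new_dims[axis] = dims[axis] + extension;
    plan.start[axis]    = dims[axis];
    plan.count[axis]    = extension;

    return plan;
}
```

For a dataset of size (3, 5, 8) extended by 3 along dimension 0, this gives a new extent of (6, 5, 8), a start of (3, 0, 0) and a count of (3, 5, 8).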

For a multi-dimensional dataset, appending to one dimension will write a contiguous hyperslab over the other dimensions. For example, if a 3-D dataset has dimension sizes (3, 5, 8), extending the 0th dimension (currently of size 3) by 3 will append 3*5*8 = 120 elements (which must be pointed to by the buffer parameter) to the dataset, making its final dimension sizes (6, 5, 8).
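The buffer size implied by this rule can be checked with a few lines of arithmetic. The following sketch computes how many elements buf must hold for one append. The function name append_element_count is hypothetical, and uint64_t stands in for hsize_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: number of elements that buf must supply when
 * appending `extension` elements along dimension `axis`. The result is
 * `extension` multiplied by the sizes of all other dimensions. */
uint64_t append_element_count(const uint64_t *dims, int rank, unsigned axis, size_t extension)
{
    uint64_t nelmts = extension;

    assert(rank > 0 && axis < (unsigned)rank);

    for (int i = 0; i < rank; i++)
        if ((unsigned)i != axis)
            nelmts *= dims[i];

    return nelmts;
}
```

With dims = (3, 5, 8), axis = 0 and extension = 3, the function returns 120, matching the example above. Multiplying the result by the size of memtype gives the buffer size in bytes.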

If a dataset has more than one unlimited dimension, any of those dimensions may be appended to, although only along one dimension per call to H5DOappend().

Since
1.10.0

◆ H5DOread_chunk()

H5_HLDLL herr_t H5DOread_chunk ( hid_t  dset_id,
hid_t  dxpl_id,
const hsize_t *  offset,
uint32_t *  filters,
void *  buf 
)

Reads a raw data chunk directly from a dataset in a file into a buffer.


Parameters
[in]      dset_id  Identifier for the dataset to be read
[in]      dxpl_id  Transfer property list identifier for this I/O operation
[in]      offset   Logical position of the chunk's first element in the dataspace
[in,out]  filters  Mask for identifying the filters used with the chunk
[out]     buf      Buffer that receives the chunk read from the dataset
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Deprecated:

This function was deprecated in favor of H5Dread_chunk() as of HDF5 1.10.3, when the functionality of H5DOread_chunk() was moved to H5Dread_chunk().

For compatibility, this API call has been left as a stub which simply calls H5Dread_chunk(). New code should use H5Dread_chunk().

H5DOread_chunk() reads a raw data chunk, specified by its logical offset in the chunked dataset dset_id, from the dataset in the file into the application memory buffer buf. The data in buf is read directly from the file, bypassing the library's internal data transfer pipeline, including filters.

dxpl_id is a data transfer property list identifier.

The filters mask indicates which filters were used with the chunk when it was written. A zero value indicates that all enabled filters were applied to the chunk. A filter was skipped if the bit corresponding to the filter's position in the pipeline (0 ≤ position < 32) is turned on.
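As an illustration of this bit convention, the sketch below tests one pipeline position in a mask returned through filters. The function name filter_was_applied is hypothetical and not part of the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the filter at pipeline position
 * `position` was applied to the chunk, or 0 if it was skipped.
 * A set bit means "skipped", so a mask of zero means every filter ran. */
int filter_was_applied(uint32_t filters, unsigned position)
{
    assert(position < 32);
    return ((filters >> position) & 1u) == 0;
}
```

An application that reads a chunk directly can use such a check to decide which filters it must reverse by hand, for example whether the data still needs to be decompressed.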

offset is an array specifying the logical position of the first element of the chunk in the dataset's dataspace. The length of the offset array must equal the number of dimensions, or rank, of the dataspace. The values in offset must not exceed the dimension limits and must specify a point that falls on a dataset chunk boundary.
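These constraints can be checked before calling the function. The sketch below is one way to do so, assuming that "must not exceed the dimension limits" means each offset value is smaller than the current size of its dimension. The function name is_chunk_aligned_offset is hypothetical, and uint64_t stands in for hsize_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `offset` names the first element of a
 * chunk inside a dataset with sizes dims[] and chunk sizes chunk_dims[],
 * otherwise 0. All three arrays must have `rank` entries. */
int is_chunk_aligned_offset(const uint64_t *offset, const uint64_t *dims,
                            const uint64_t *chunk_dims, int rank)
{
    assert(rank > 0);

    for (int i = 0; i < rank; i++) {
        if (chunk_dims[i] == 0)
            return 0; /* invalid chunk size */
        if (offset[i] >= dims[i])
            return 0; /* outside the dataset (assumed interpretation) */
        if (offset[i] % chunk_dims[i] != 0)
            return 0; /* not on a chunk boundary */
    }
    return 1;
}
```

For an 8 x 8 dataset with 4 x 4 chunks, the offset (4, 4) is accepted, while (2, 4) is rejected for not lying on a chunk boundary and (8, 0) for lying outside the dataset.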

buf is the memory buffer containing the chunk read from the dataset in the file.

Example
The following code illustrates the use of H5DOread_chunk() to read a chunk from a dataset:
#include <zlib.h>
#include <math.h>
#define DEFLATE_SIZE_ADJUST(s) (ceil(((double)(s)) * 1.001) + 12)
    :
    :
size_t buf_size = CHUNK_NX * CHUNK_NY * sizeof(int);
const Bytef *z_src = (const Bytef *)(direct_buf);
Bytef *z_dst; /* Destination buffer */
uLongf z_dst_nbytes = (uLongf)DEFLATE_SIZE_ADJUST(buf_size);
uLong z_src_nbytes = (uLong)buf_size;
int aggression = 9; /* Compression aggression setting */
uint32_t filter_mask = 0;
/* For H5DOread_chunk() */
uint32_t read_filter_mask = 0; /* Filter mask returned with the chunk */
void *readbuf = NULL; /* Buffer for reading data */
const Bytef *pt_readbuf; /* Point to the buffer for data read */
hsize_t read_chunk_nbytes; /* Size of chunk on disk */
uLongf read_dst_nbytes = (uLongf)buf_size; /* Size of the uncompressed chunk */
int read_dst_buf[CHUNK_NX][CHUNK_NY]; /* Buffer to hold un-compressed data */

/* Create the data space */
if ((dataspace = H5Screate_simple(RANK, dims, maxdims)) < 0)
    goto error;

/* Create a new file */
if ((file = H5Fcreate(FILE_NAME5, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)) < 0)
    goto error;

/* Modify dataset creation properties, i.e. enable chunking and compression */
if ((cparms = H5Pcreate(H5P_DATASET_CREATE)) < 0)
    goto error;
if ((status = H5Pset_chunk(cparms, RANK, chunk_dims)) < 0)
    goto error;
if ((status = H5Pset_deflate(cparms, aggression)) < 0)
    goto error;

/* Create a new dataset within the file using cparms creation properties */
if ((dset_id = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT, dataspace, H5P_DEFAULT, cparms,
                          H5P_DEFAULT)) < 0)
    goto error;

/* Initialize data for one chunk */
for (i = n = 0; i < CHUNK_NX; i++)
    for (j = 0; j < CHUNK_NY; j++)
        direct_buf[i][j] = n++;

/* Allocate output (compressed) buffer */
outbuf = malloc(z_dst_nbytes);
z_dst = (Bytef *)outbuf;

/* Perform compression from the source to the destination buffer */
ret = compress2(z_dst, &z_dst_nbytes, z_src, z_src_nbytes, aggression);

/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "overflow");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "deflate memory error");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "other deflate error");
    goto error;
}

/* Write the compressed chunk data repeatedly to cover all the
 * chunks in the dataset, using the direct write function. */
for (i = 0; i < NX / CHUNK_NX; i++) {
    for (j = 0; j < NY / CHUNK_NY; j++) {
        status = H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, z_dst_nbytes, outbuf);
        offset[1] += CHUNK_NY;
    }
    offset[0] += CHUNK_NX;
    offset[1] = 0;
}

/* Flush the file, then close and reopen the dataset before reading */
if (H5Fflush(dset_id, H5F_SCOPE_LOCAL) < 0)
    goto error;
if (H5Dclose(dset_id) < 0)
    goto error;
if ((dset_id = H5Dopen2(file, DATASETNAME, H5P_DEFAULT)) < 0)
    goto error;

offset[0] = CHUNK_NX;
offset[1] = CHUNK_NY;

/* Get the size of the compressed chunk */
if (H5Dget_chunk_storage_size(dset_id, offset, &read_chunk_nbytes) < 0)
    goto error;
readbuf = malloc((size_t)read_chunk_nbytes);
pt_readbuf = (const Bytef *)readbuf;

/* Use H5DOread_chunk() to read the chunk back */
if ((status = H5DOread_chunk(dset_id, H5P_DEFAULT, offset, &read_filter_mask, readbuf)) < 0)
    goto error;

/* Decompress the raw chunk; read_dst_nbytes holds the output buffer size */
ret = uncompress((Bytef *)read_dst_buf, &read_dst_nbytes, pt_readbuf, (uLong)read_chunk_nbytes);

/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "error: not enough room in output buffer");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "error: not enough memory");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "error: corrupted input data");
    goto error;
}

/* Data verification here */
    :
    :
Version
1.10.3 Function deprecated in favor of H5Dread_chunk.
Since
1.10.2, 1.8.19

◆ H5DOwrite_chunk()

H5_HLDLL herr_t H5DOwrite_chunk ( hid_t  dset_id,
hid_t  dxpl_id,
uint32_t  filters,
const hsize_t *  offset,
size_t  data_size,
const void *  buf 
)

Writes a raw data chunk from a buffer directly to a dataset in a file.


Parameters
[in]  dset_id    Identifier for the dataset to write to
[in]  dxpl_id    Transfer property list identifier for this I/O operation
[in]  filters    Mask for identifying the filters in use
[in]  offset     Logical position of the chunk's first element in the dataspace
[in]  data_size  Size of the actual data to be written in bytes
[in]  buf        Buffer containing data to be written to the chunk
Returns
Returns a non-negative value if successful; otherwise, returns a negative value.
Deprecated:

This function was deprecated in favor of H5Dwrite_chunk() as of HDF5 1.10.3, when the functionality of H5DOwrite_chunk() was moved to H5Dwrite_chunk().

For compatibility, this API call has been left as a stub which simply calls H5Dwrite_chunk(). New code should use H5Dwrite_chunk().

H5DOwrite_chunk() writes a raw data chunk, specified by its logical offset in the chunked dataset dset_id, from the application memory buffer buf to the dataset in the file. Typically, the data in buf is preprocessed in memory by a custom transformation, such as compression. The chunk bypasses the library's internal data transfer pipeline, including filters, and is written directly to the file.

dxpl_id is a data transfer property list identifier.

filters is a mask providing a record of which filters are used with the chunk. The default value of the mask is zero (0), indicating that all enabled filters are applied. A filter is skipped if the bit corresponding to the filter's position in the pipeline (0 ≤ position < 32) is turned on. This mask is saved with the chunk in the file.
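The mask passed to the function can be built one bit at a time. The sketch below sets the bit for a filter the application did not apply to the chunk. The function name mark_filter_skipped is hypothetical and not part of the HDF5 API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns `filters` with the bit for pipeline position
 * `position` turned on, recording that this filter was NOT applied to the
 * chunk being written. */
uint32_t mark_filter_skipped(uint32_t filters, unsigned position)
{
    assert(position < 32);
    return filters | ((uint32_t)1 << position);
}
```

Starting from the default mask of zero, marking position 0 yields 0x00000001. For a dataset whose only filter is deflate, that value records that the chunk was written uncompressed.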

offset is an array specifying the logical position of the first element of the chunk in the dataset's dataspace. The length of the offset array must equal the number of dimensions, or rank, of the dataspace. The values in offset must not exceed the dimension limits and must specify a point that falls on a dataset chunk boundary.
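When an application iterates over chunks by index, the logical offset of each chunk follows from the chunk dimensions. The sketch below shows that conversion. The function name chunk_offset_from_index is hypothetical, and uint64_t stands in for hsize_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: converts a per-dimension chunk index into the logical
 * offset of that chunk's first element. The result always falls on a chunk
 * boundary because each entry is a multiple of the chunk size. */
void chunk_offset_from_index(const uint64_t *chunk_index, const uint64_t *chunk_dims,
                             int rank, uint64_t *offset)
{
    assert(rank > 0);

    for (int i = 0; i < rank; i++)
        offset[i] = chunk_index[i] * chunk_dims[i];
}
```

With chunk dimensions (4, 8), the chunk at index (1, 2) starts at offset (4, 16). The caller still has to make sure that the offset lies within the dataset's dimensions.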

data_size is the size in bytes of the chunk, representing the number of bytes to be read from the buffer buf. If the data chunk has been precompressed, data_size should be the size of the compressed data.

buf is the memory buffer containing data to be written to the chunk in the file.

Attention
Exercise caution when using H5DOread_chunk() and H5DOwrite_chunk(), as they read and write data chunks directly in a file. H5DOwrite_chunk() bypasses hyperslab selection, the conversion of data from one datatype to another, and the filter pipeline to write the chunk. Developers should have experience with these processes before using this function. Please see HDF5 High Level Optimizations for more information.
Note
H5DOread_chunk() and H5DOwrite_chunk() are not supported under parallel HDF5 and do not support variable-length types.
Example
The following code illustrates the use of H5DOwrite_chunk to write an entire dataset, chunk by chunk:
#include <zlib.h>
#include <math.h>
#define DEFLATE_SIZE_ADJUST(s) (ceil(((double)(s)) * 1.001) + 12)
    :
    :
size_t buf_size = CHUNK_NX * CHUNK_NY * sizeof(int);
const Bytef *z_src = (const Bytef *)(direct_buf);
Bytef *z_dst; /* Destination buffer */
uLongf z_dst_nbytes = (uLongf)DEFLATE_SIZE_ADJUST(buf_size);
uLong z_src_nbytes = (uLong)buf_size;
int aggression = 9; /* Compression aggression setting */
uint32_t filter_mask = 0;

/* Create the data space */
if ((dataspace = H5Screate_simple(RANK, dims, maxdims)) < 0)
    goto error;

/* Create a new file */
if ((file = H5Fcreate(FILE_NAME5, H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT)) < 0)
    goto error;

/* Modify dataset creation properties, i.e. enable chunking and compression */
if ((cparms = H5Pcreate(H5P_DATASET_CREATE)) < 0)
    goto error;
if ((status = H5Pset_chunk(cparms, RANK, chunk_dims)) < 0)
    goto error;
if ((status = H5Pset_deflate(cparms, aggression)) < 0)
    goto error;

/* Create a new dataset within the file using cparms creation properties */
if ((dset_id = H5Dcreate2(file, DATASETNAME, H5T_NATIVE_INT, dataspace, H5P_DEFAULT, cparms,
                          H5P_DEFAULT)) < 0)
    goto error;

/* Initialize data for one chunk */
for (i = n = 0; i < CHUNK_NX; i++)
    for (j = 0; j < CHUNK_NY; j++)
        direct_buf[i][j] = n++;

/* Allocate output (compressed) buffer */
outbuf = malloc(z_dst_nbytes);
z_dst = (Bytef *)outbuf;

/* Perform compression from the source to the destination buffer */
ret = compress2(z_dst, &z_dst_nbytes, z_src, z_src_nbytes, aggression);

/* Check for various zlib errors */
if (Z_BUF_ERROR == ret) {
    fprintf(stderr, "overflow");
    goto error;
}
else if (Z_MEM_ERROR == ret) {
    fprintf(stderr, "deflate memory error");
    goto error;
}
else if (Z_OK != ret) {
    fprintf(stderr, "other deflate error");
    goto error;
}

/* Write the compressed chunk data repeatedly to cover all the
 * chunks in the dataset, using the direct write function. */
for (i = 0; i < NX / CHUNK_NX; i++) {
    for (j = 0; j < NY / CHUNK_NY; j++) {
        status =
            H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, z_dst_nbytes, outbuf);
        offset[1] += CHUNK_NY;
    }
    offset[0] += CHUNK_NX;
    offset[1] = 0;
}

/* Overwrite the first chunk with uncompressed data. Set the filter mask to
 * indicate the compression filter is skipped */
filter_mask = 0x00000001;
offset[0] = offset[1] = 0;
if (H5DOwrite_chunk(dset_id, H5P_DEFAULT, filter_mask, offset, buf_size, direct_buf) < 0)
    goto error;

/* Read the entire dataset back for data verification converting ints to longs */
if (H5Dread(dset_id, H5T_NATIVE_LONG, H5S_ALL, H5S_ALL, H5P_DEFAULT, outbuf_long) < 0)
    goto error;

/* Data verification here */
    :
    :
Version
1.10.3 Function deprecated in favor of H5Dwrite_chunk.
Since
1.8.11