10211
Example of how to write large data to an HDF5 file piece by piece
Topic:
This article shows how to write a large set of data to an HDF5 file by breaking it into smaller pieces and writing each piece to the file in turn.
Discussion:
The program "test_write_h5d" (shown below) generates a data set whose dimensions are random*. It then creates an HDF5 file and opens it. The data is broken into 20 x 40 segments, and each segment is written to the file within a FOR loop.
NOTE*: The dimensions are not entirely random; each is rounded down to the nearest multiple of its step size (20 for the first dimension, 40 for the second).
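The rounding in the note is plain integer arithmetic: integer (floor) division of a dimension by the step size gives the number of whole segments that fit, and multiplying back gives the rounded dimension. A quick sketch of that arithmetic, written here in Python purely for illustration (the round_to_step helper is not part of the IDL program):

```python
# Floor a randomly chosen dimension to the nearest multiple of the step
# size, mirroring the IDL statements nstep1 = dim1/step1 and
# dim1 = nstep1*step1 (IDL integer division also floors).
def round_to_step(dim, step):
    nstep = dim // step         # number of whole segments that fit
    return nstep * step, nstep  # rounded dimension, segment count

print(round_to_step(997, 20))   # -> (980, 49)
print(round_to_step(1503, 40))  # -> (1480, 37)
```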
pro test_write_h5d
compile_opt idl2
;Create a new HDF5 file
file = 'mytest_h5_file.h5'
fid = H5F_CREATE(file)
;randomly choose the initial dimensions
;(SORT returns a random permutation of indices,
;so the first element is a random integer)
random_number1 = sort(randomu(seed, 1000))
random_number2 = sort(randomu(seed, 2000))
;guard against dimensions smaller than one step
dim1 = random_number1[0] > 20
dim2 = random_number2[0] > 40
;define the step size for
;each dimension
step1 = 20
step2 = 40
;determine the number of
;steps needed to write the entire
;data set (integer division floors)
nstep1 = dim1/step1
nstep2 = dim2/step2
;redefine the dimensions to be "nice"
;with the step size
dim1 = nstep1*step1
dim2 = nstep2*step2
;create a big data
data = hanning(dim1,dim2)
; extract a small segment of the
; data from the larger array
data_segment = data[0:(step1-1),0:(step2-1)]
; create a datatype
datatype_id = H5T_IDL_CREATE(data)
; create a dataspace, allow the dataspace to be extendable
dataspace_id = H5S_CREATE_SIMPLE([step1,step2],max_dimensions=[-1,-1])
; create the dataset
dataset_id = H5D_CREATE(fid,'Hanning', datatype_id,dataspace_id, chunk_dimensions=[step1,step2])
; extend the size of the dataset to fit the data
H5D_EXTEND,dataset_id,size(data_segment,/dimensions)
; write the data to the dataset
H5D_WRITE,dataset_id,data_segment
;Now write the rest of the data in a
;piece-by-piece fashion (the first segment
;is simply rewritten on the first pass)
for ind1 = 0L, nstep1-1 do begin
for ind2 = 0L, nstep2-1 do begin
;if the file data space from the
;previous iteration exists, close it
if (isa(iter_data_space_id)) then begin
H5S_CLOSE, iter_data_space_id
endif
;if the memory data space from the
;previous iteration exists, close it
if (isa(iter_data_space_id2)) then begin
H5S_CLOSE, iter_data_space_id2
endif
;Determine the starting indices of
;the next segment in the data array
start1 = ind1 * step1
start2 = ind2 * step2
;Define the data segment to be written
;to the HDF file
data_segment = data[start1:(start1+step1-1),start2:(start2+step2-1)]
;Extend the data set by the step size. The
;dimensions passed are the new TOTAL number of
;elements in each dimension (not the increment).
h5d_extend, dataset_id, [start1+step1, start2+step2]
;Get the dataset's updated file dataspace
iter_data_space_id = h5d_get_space(dataset_id)
;Select the file hyperslab that will receive the new data segment
H5S_SELECT_HYPERSLAB, iter_data_space_id, [start1,start2], $
[step1,step2], /RESET
;Create the memory data space to describe the data segment
iter_data_space_id2 = h5s_create_simple([step1,step2])
;Write the data to the file using file data space
;and memory data space generated in this loop
h5d_write, dataset_id, data_segment, $
FILE_SPACE_ID=iter_data_space_id,$
MEMORY_SPACE_ID=iter_data_space_id2
endfor
endfor
; close some identifiers
H5S_CLOSE, iter_data_space_id
H5S_CLOSE, iter_data_space_id2
H5S_CLOSE,dataspace_id
H5D_CLOSE,dataset_id
H5T_CLOSE,datatype_id
H5F_CLOSE,fid
help, data
;quickly list and read the file back to verify
h5_list, file
in_dat = H5_GETDATA(file, '/Hanning')
s=surface(in_dat)
end
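The loop's bookkeeping can be checked independently of HDF5: each (ind1, ind2) pair maps to a start offset of (ind1*step1, ind2*step2), the total extent passed to H5D_EXTEND is (start1+step1, start2+step2), and together the segments cover the full rounded array. A small Python sketch of that arithmetic (illustrative only; the segment_offsets helper is hypothetical and does not call the HDF5 library):

```python
# Check that the segment offsets used in the loop above cover the
# rounded array, and that the running "total extent" handed to
# H5D_EXTEND never exceeds the final dimensions.
def segment_offsets(dim1, dim2, step1, step2):
    """Yield (start1, start2) for every segment, as the IDL loop does."""
    for ind1 in range(dim1 // step1):
        for ind2 in range(dim2 // step2):
            yield ind1 * step1, ind2 * step2

dim1, dim2, step1, step2 = 100, 80, 20, 40
covered = set()
for s1, s2 in segment_offsets(dim1, dim2, step1, step2):
    # H5D_EXTEND receives the new TOTAL size, not the increment
    total = (s1 + step1, s2 + step2)
    assert total[0] <= dim1 and total[1] <= dim2
    for i in range(s1, s1 + step1):
        for j in range(s2, s2 + step2):
            covered.add((i, j))
# every element of the rounded array is covered
print(len(covered) == dim1 * dim2)  # True
```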