How to reduce the time to read data - NV5 Geospatial - Help

Last Post 11 Sep 2006 09:08 AM by anon

How to reduce the time to read data

1 Replies

anon

New Member

11 Sep 2006 09:08 AM
Hello all,

I have an HDF5 file containing 18306 groups. Each group holds one two-dimensional array of size [5,500], so I have 18306*500 (about 9 million) particles, each with 5 values. In my IDL program (http://163.23.210.1/~slchen/plot_ptcls.pro) I read the data group by group and then combine it into a single two-dimensional array. Reading the values from the HDF5 file takes two hours, which is far too slow.

Is there any way to reduce the processing time? Do I need to change the way I program this in IDL, or is there a solution using clusterDL/mpiDL? Any pointers to related information would be appreciated.

Thank you very much!

Sincerely,

Deleted User

New Member

11 Sep 2006 09:08 AM
In your program you have a loop that cycles 18,306 times, executing this line:

    ptcls = [[ptcls],[ptcl]]

That looks like a very inefficient approach to array-building. Every time it executes, it must allocate a new memory block for the array, it must copy the current contents to the new memory location, and the operating system must eventually free the old memory location. Because the array grows with every pass, each copy is larger than the last. The efficient way to provide memory for a large array is to allocate its memory space in the fewest possible calls, preferably just one.

Another IDL trick is to reference the array's memory space via a pointer. So, what happens if you try code like this?

    nColumns = 5
    nRowsPerGroup = 500L   ; long, so i * nRowsPerGroup cannot overflow a 16-bit integer
    nRows = nRowsPerGroup * n_grps
    ; lonarr assumes long-integer data; use fltarr or dblarr for floating-point values
    ptcls_ptr = ptr_new(lonarr(nColumns, nRows))
    for i = 0L, n_grps-1 do begin
        ptcl = h5d_read(dataset_id)
        ; copy the latest 'ptcl' into the block starting at [0, i * nRowsPerGroup]
        (*ptcls_ptr)[0, i * nRowsPerGroup] = ptcl
    endfor

This might be a little faster than the pointer-free algorithm:

    ptcls = lonarr(nColumns, nRows)
    for ...
        ptcls[*, (i * 500):((i+1) * 500 - 1)] = ptcl

because of some subscript expansion work that IDL does whenever it sees array subscripts addressed like this: [*, startIndex:endIndex].

Anyhow, that is one thing. The other is: shouldn't you be adding a lot of HDF5 CLOSE-type calls to your code? It seems that every H5D_OPEN at the beginning of your 18,306-cycle FOR loop should have a matching H5D_CLOSE call at the end of the FOR loop, shouldn't it? And, eventually, shouldn't you have an H5F_CLOSE after you are done importing all the data?

James Jones
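Putting the two suggestions together (allocate once, and close every HDF5 identifier you open), a minimal sketch of the whole read loop might look like the following. Several things here are assumptions, not taken from the original program: the procedure name read_ptcls_sketch, the dataset name 'ptcl' inside each group, the use of fltarr (floating-point data), and the idea that every member of the file's root is one of the particle groups. Adjust these to match the actual file layout.

```idl
; Sketch only: read one [5,500] dataset from every group into a single
; preallocated array, closing each HDF5 identifier once it has been used.
; ASSUMPTIONS (hypothetical): dataset name 'ptcl', floating-point data,
; and every root-level member of the file being a particle group.
pro read_ptcls_sketch, filename, ptcls
  compile_opt idl2            ; 32-bit default integers, so index math cannot overflow

  nColumns = 5
  nRowsPerGroup = 500

  file_id = h5f_open(filename)
  n_grps = h5g_get_nmembers(file_id, '/')

  ; One allocation for all the data, instead of one per loop pass
  ptcls = fltarr(nColumns, nRowsPerGroup * n_grps)

  for i = 0, n_grps - 1 do begin
    grp_name = h5g_get_member_name(file_id, '/', i)
    group_id = h5g_open(file_id, grp_name)
    dataset_id = h5d_open(group_id, 'ptcl')   ; 'ptcl' is an assumed name

    ; Copy this group's block into place, starting at [0, i * nRowsPerGroup]
    ptcls[0, i * nRowsPerGroup] = h5d_read(dataset_id)

    ; Release the identifiers opened in this pass
    h5d_close, dataset_id
    h5g_close, group_id
  endfor

  h5f_close, file_id
end
```

As a rough count of why the original concatenation is costly: if pass k copies the k groups gathered so far (2,500 elements each), then 18,306 passes copy about 2,500 * 18,306 * 18,307 / 2, or roughly 4 x 10^11 elements in total, compared with about 9 million when each block is written into a preallocated array once.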
