Tips & Tricks for Efficient IDL Programming


Programmers can significantly decrease execution time and memory usage by following these tips and tricks for efficient IDL programming:

1. IDL supplies a number of built-in functions and procedures for common operations. These system-supplied routines have been optimized and are faster than equivalent IDL code written with loops and subscripting. In particular, operations applied to an entire array are faster than the same operations performed in a loop on single elements. A common example is summing the elements of an array or subarray: using the built-in TOTAL function can be about 10 times faster than summing the elements with a FOR loop.
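
As a rough illustration of the TOTAL comparison, the following sketch sums the same array both ways. The variable names and the array size are hypothetical, and the actual speedup will depend on the IDL version and hardware.

    ; Sketch only: 'data' and its size are made up for illustration
    data = findgen(1000000)

    ; Slow: accumulate one element at a time in a FOR loop
    sumLoop = 0.0d
    for i = 0L, n_elements(data)-1L do sumLoop = sumLoop + data[i]

    ; Faster: let the built-in TOTAL function sum the whole array
    sumTotal = total(data, /double)

    ; A subarray is summed the same way
    sumPart = total(data[0:99], /double)

The /DOUBLE keyword is used here only so that both results are accumulated in double precision and can be compared directly.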

2. The order in which an expression is evaluated can have a significant effect on program speed. Consider the following expression:
    B = A * 16. / max(A)

This statement multiplies every element in A by 16 and then divides each element by the value of the maximum element of A. The number of operations required is twice the number of elements in A.

A much faster way of computing the same result is:

    B = 16./max(A) * A

In this case, the statement first divides 16 by the maximum value of A and then multiplies every element of A by that scalar. The number of operations required is about the same as the number of elements in A, half of what the previous statement requires.

3. Try to avoid IF statements within loops. When a loop tests each element of an array in an IF statement, the loop can usually be eliminated by using logical array expressions.

Slow:

    for i = 0L, n_elements(data)-1L do $
       if (data[i] le 30) then $
          data[i] = 30

Faster:


    dex = where(data le 30, count)
    if (count gt 0) then data[dex] = 30

Even faster:


    data = ((data gt 30) * data) + ((data le 30) * 30)

Fastest:


    data = data > 30

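4. Move invariant expressions out of loops. Anything whose value does not change from one iteration to the next (for example, the assignment of a constant such as "mythreshold = 200") should be evaluated once, before the loop starts.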
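
A minimal sketch of hoisting an invariant out of a loop is shown below. The recurrence, the names 'data', 'slow', 'fast', and 'decay', and the constant are all hypothetical; a recurrence is used only because it cannot be replaced by a simple array expression.

    ; Sketch only: names and values are made up for illustration
    n = 1000L
    data = findgen(n)
    slow = fltarr(n)
    fast = fltarr(n)

    ; Slow: exp(-0.5) is re-evaluated on every pass through the loop
    for i = 1L, n-1L do slow[i] = slow[i-1] * exp(-0.5) + data[i]

    ; Faster: evaluate the invariant once, before the loop
    decay = exp(-0.5)
    for i = 1L, n-1L do fast[i] = fast[i-1] * decay + data[i]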

5. Access large arrays in memory order. Arrays in IDL are stored in column-major order, which here means that the first (leftmost) subscript varies fastest in memory. The important thing to remember is that IDL indexes arrays as [column, row], and the upper left-hand element of a matrix is [0,0]. This is the layout used by FORTRAN, and it is traditionally associated with image processing because it keeps all the elements of a single image scanline together. In contrast, C and Visual Basic use row-major order and index their data as [row, column].

When an array is larger than or close to the working set size (the amount of physical memory available to the process), it should be accessed in memory order: the first subscript should vary fastest, so that each row is traversed completely before moving on to the next. The loop

    for x = 0, 511 do for y = 0, 511 do arr[x, y] = ....

is very inefficient because consecutive accesses are 512 elements apart. Reading the first column touches every row, so essentially the whole array (512 x 512 = 262,144 elements) must be read into physical memory. This process must be repeated for each column, requiring the entire array to be read and written almost 512 times. By exchanging the two FOR loops, the computing time can be reduced by a factor of 50:

    for y = 0, 511 do for x = 0, 511 do arr[x, y] = ....

6. Setting the NOZERO keyword in BYTARR, COMPLEXARR, DCOMPLEXARR, FLTARR, INTARR, LONARR, MAKE_ARRAY, OBJARR, and PTRARR prevents the elements of the resulting array from being set to zero. This saves time, but the array elements then contain arbitrary, uninitialized values. It is therefore a sound strategy only if you know that every element of the array will be explicitly assigned before the program reads it.
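
The following sketch shows one situation where NOZERO is safe, because every element is overwritten before it is read. The array name 'img', its size, and the fill values are hypothetical.

    ; Sketch only: 'img' and its dimensions are made up for illustration
    ; Skip the zero fill; the array contents are undefined at this point
    img = fltarr(512, 512, /nozero)

    ; Assign every row explicitly before any element is read
    for y = 0, 511 do img[0, y] = findgen(512) * y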

7. The REPLICATE_INPLACE procedure updates an existing array by replacing all or selected parts of it with a specified value. This procedure is faster and uses less memory than either the REPLICATE function or direct assignment.
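
A minimal sketch of the whole-array case follows. The array name and the fill value are hypothetical; the optional arguments that restrict the update to part of the array are not shown here, so consult the REPLICATE_INPLACE documentation for them.

    ; Sketch only: 'arr' and the value 4.25 are made up for illustration
    arr = fltarr(512, 512)

    ; Creates a second 512 x 512 array and then assigns it to arr
    arr = replicate(4.25, 512, 512)

    ; Overwrites the existing array without building a new one
    replicate_inplace, arr, 4.25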

8. When the /NO_COPY keyword is set in PTR_NEW, VERT_T3D, WIDGET_BASE, WIDGET_BUTTON, WIDGET_CONTROL, WIDGET_DRAW, WIDGET_DROPLIST, WIDGET_LABEL, WIDGET_LIST, WIDGET_SLIDER, WIDGET_TABLE, or WIDGET_TEXT, IDL takes the data from the source and attaches it directly to the destination, thus avoiding a copy operation. The side effect is that the source variable becomes undefined.
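
As an illustration with PTR_NEW, the sketch below moves a large array into a heap variable without copying it. The variable names and the array size are hypothetical.

    ; Sketch only: 'big' and 'p' are made up for illustration
    big = fltarr(1024, 1024)

    ; Attach the data to a heap variable without copying it
    p = ptr_new(big, /no_copy)

    ; The source variable is now undefined
    print, n_elements(big)    ; prints 0

    ; The data is still reachable through the pointer
    nMoved = n_elements(*p)
    (*p)[0, 0] = 1.0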

9. Use of the TEMPORARY function minimizes memory use when performing operations on large arrays. TEMPORARY reassigns to a new variable the memory referred to by its argument, deleting the old variable reference in the process (though the reassignment may be to a variable that reuses the original name). Assume that A is a large array. To add 1 to each element of A this is one coding option:

    A = A + 1

This statement creates a new array for the result of the addition and assigns the result to A before freeing the old allocation of A. Therefore, this operation needs 2*sizeof(A) of memory to perform the operation. The statement

    A = temporary(A) + 1

needs no additional space.

10. Try not to use "*" for array indexing on the left-hand side of a statement. Instead, use "0" as the starting subscript. For instance, for the array

    B = intarr(200, 200, 3)

it is much slower to use the following notation

    B[*,*,1] = insertData

than to use

    B[0,0,1] = insertData

The first notation is inefficient because the IDL interpreter will allocate/build a 200 x 200 long integer array of subscript indexes to substitute for the wildcard token. The latter notation does not need an array of subscripts; it performs direct copy to memory. (Note that 'insertData' must fit in the 200 x 200 x 2 memory space that defines the borders of the subarray that starts at 'B[0,0,1]' in order to avoid an array-out-of-bounds error.)

11. Each call to a function or procedure is relatively expensive, so try to avoid excessive calls and one-line functions.

12. Free objects and pointers when they are no longer needed.
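
A minimal sketch of explicit cleanup is shown below, assuming the standard PTR_FREE and OBJ_DESTROY routines are used; neither is named in the tip above, and the variable names and the choice of the IDL_Container class are only for illustration.

    ; Sketch only: 'p' and 'o' are made up for illustration
    p = ptr_new(fltarr(1000))
    o = obj_new('IDL_Container')

    ; ... work with *p and o here ...

    ; Release the heap memory once it is no longer needed
    ptr_free, p
    obj_destroy, o

    ; Both references are now invalid (PTR_VALID and OBJ_VALID return 0)
    print, ptr_valid(p), obj_valid(o)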