Subtle Mistakes in IDL Programming

Anonym

Through my years of IDL programming, I've made my fair share of typos, boo-boos, and seemingly minuscule mistakes that end up causing significant problems. Some, such as misspelled variable names, are easy to spot. Others, however, are trickier to track down. Sometimes it takes hours of debugging to uncover a problem that was created several hundred lines earlier. 

I've compiled a list of a few of these mistakes that are easy for even an experienced programmer to make. Some of these may not be obvious at the time of writing the code, but they can cause unexpected results downstream. I have had personal experience with all of these throughout my career.

1. Scalars and one-element arrays are not always the same.

One of the conveniences that IDL offers is that it usually doesn't care whether a variable is a scalar or a one-element array. In other words, IDL assumes these two variables are the same:

a = 1
b = [1]
IF a EQ b THEN PRINT, 'They are the same.' ELSE PRINT, 'They are different.'

IDL will print:

They are the same.

Much of IDL's functionality can accept either of these, and the end result is the same. Therefore, when writing code, I usually don't bother checking whether a variable is a scalar or an array before passing it along to another IDL call. However, the two are not always interchangeable.

Try running the following code for example. This code creates a hash with three entries, and each entry is a hash with three numeric entries. Then it looks for keys containing the string '2' and creates a new variable which is set to the value of the hash's field for the resulting key.

h = hash('h1', hash('a', 1, 'b', 2, 'c', 3), $
  'h2', hash('d', 4, 'e', 5, 'f', 6), $
  'h3', hash('g', 7, 'h', 8, 'i', 9))
keys = (h.Keys()).ToArray()
result = Where(StrMatch(keys, '*2*'))

h2 = h[keys[result]]

So the new variable, h2, should be a hash containing three key/value pairs, 'd'=4, 'e'=5, 'f'=6, right?

help, h2
H2              HASH  <ID=48 NELEMENTS=1>

Wait, what?

As it turns out, the WHERE function always returns an array, even if there is only one result. Consequently, indexing keys by that one-element array produced another one-element array, and when I passed it into the hash's brackets, the hash interpreted the call differently than I, the programmer, intended. Because I provided an array of keys, the hash assumed I wanted a new hash holding the subset of entries for those keys, when really I was looking for the contents of a single field (see Hash Access for details on how to "retrieve a single element" vs. "retrieve multiple elements and create a new hash").

The result from the WHERE function is a good example of when to be careful with treating variables as scalars when they are actually arrays. The easy fix to this example problem is to turn the result into a scalar by indexing by the first element.

h2 = h[keys[result[0]]]
help, h2
H2              HASH  <ID=11 NELEMENTS=3>

Now I have the result that I was expecting.
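A slightly more defensive variant is sketched below. It assumes you expect exactly one matching key and uses the optional count argument of WHERE to confirm that before indexing; the printed message text is just an illustration.

```idl
; Rebuild the example hash and key list from above.
h = hash('h1', hash('a', 1, 'b', 2, 'c', 3), $
  'h2', hash('d', 4, 'e', 5, 'f', 6), $
  'h3', hash('g', 7, 'h', 8, 'i', 9))
keys = (h.Keys()).ToArray()

; COUNT receives the number of matches. WHERE returns -1 when there are none,
; so checking COUNT also guards against indexing with -1.
result = Where(StrMatch(keys, '*2*'), count)

; Index with a scalar (result[0]) so the hash returns the stored value
; rather than a one-entry subset hash.
if count eq 1 then h2 = h[keys[result[0]]] else print, 'Expected exactly one match, found ', count
help, h2
```

With the example data only 'h2' matches, so h2 should again be the three-element inner hash.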

2. Catch, /CANCEL does not clear !ERROR_STATE.

Here is some hypothetical code, which is placed at the top of a routine, that catches an error and calls an error recovery routine. After recovering from the error, the code returns from the routine back to the calling routine.

Catch, err
if err ne 0 then begin
  Catch, /CANCEL
  my_error_recovery_procedure
  return
endif

The call to Catch, /CANCEL in this block serves the purpose of turning off the catch that was set just above it. This way, if an error (which may not be related to the original error) occurs in my_error_recovery_procedure, the catch block will not enter an infinite loop of throwing and catching. 

If the error recovery in this example was intended to clean up and put my program back into a clean state so that it can continue functioning, then I've made the mistake of not clearing IDL's error state. I once wrote a routine that was part of a larger program, which had a catch block just like this one. At the time, I had forgotten that cancelling the catch does not reset !ERROR_STATE. At the very end of the larger program (thousands of lines later), someone else wrote code that checked !ERROR_STATE and threw a dialog message if an error was reported in the program. This led to lengthy debugging to see where the error was happening.

The easy fix for this is to call MESSAGE, /RESET just before returning, and !ERROR_STATE will be reset back to its clean state.
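As a rough sketch, the corrected catch block could look like the following. The routine name catch_reset_demo and the stub body of my_error_recovery_procedure are placeholders I made up so the example compiles on its own; the deliberate MESSAGE call is only there to trigger the handler.

```idl
; Hypothetical stand-in for the recovery routine from the example above.
pro my_error_recovery_procedure
  compile_opt idl2
  print, 'Recovering...'
end

pro catch_reset_demo
  compile_opt idl2

  Catch, err
  if err ne 0 then begin
    Catch, /CANCEL                ; stop catching, so recovery errors cannot loop
    my_error_recovery_procedure
    Message, /RESET               ; set !ERROR_STATE back to its no-error state
    return
  endif

  ; Deliberately trigger an error to exercise the catch block.
  Message, 'Something went wrong.'
end
```

After calling catch_reset_demo, inspecting !ERROR_STATE (for example with help, !ERROR_STATE) should show no leftover error, so code that checks it later is not misled.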

3. LUNs or file handles must be freed when an error occurs.

Ah, more mistakes arising with error handling...

It's usually a good idea for any subroutine that opens a file to have its own catch block, so that the file can be closed within it. Otherwise, if an error occurs after the file is opened but before it is closed, and that error is caught somewhere outside the subroutine (e.g. by a catch block in the caller), the file stays open, possibly locked, and its LUN remains in use indefinitely. Eventually IDL will run out of LUNs.

Given a routine that has this call somewhere in it:

OpenR, unit, file, /GET_LUN

here is an example catch block that can be placed at the top of the routine. If you do not wish to reset the error state, you can instead reissue the error after the LUN has been freed.

Catch, err
if err ne 0 then begin
  Catch, /CANCEL
  if N_Elements(unit) gt 0 then Free_Lun, unit
  ; Reissue the error.
  Message, /REISSUE_LAST
endif
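To show where the pieces go, here is a minimal sketch of a complete routine built around that catch block. The function name read_first_line and its behavior (returning the first line of a text file) are assumptions made purely for illustration.

```idl
; Hypothetical helper: returns the first line of a text file.
function read_first_line, file
  compile_opt idl2

  Catch, err
  if err ne 0 then begin
    Catch, /CANCEL
    ; unit is only defined if OpenR succeeded before the error occurred.
    if N_Elements(unit) gt 0 then Free_Lun, unit
    ; Reissue the error so the caller still sees it.
    Message, /REISSUE_LAST
  endif

  OpenR, unit, file, /GET_LUN
  line = ''
  ReadF, unit, line        ; raises an error if, say, the file is empty
  Free_Lun, unit           ; normal path: close the file and release the LUN
  return, line
end
```

Because the handler checks N_Elements(unit), it is safe whether the failure happens in OpenR itself or later in ReadF.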

 

Similarly, for an HDF5 file:

file_id = H5F_Open(file)

here is an example catch block that ensures it gets closed:

Catch, err
if err ne 0 then begin
  Catch, /CANCEL
  if N_Elements(file_id) gt 0 then H5F_Close, file_id
  ; Reissue the error.
  Message, /REISSUE_LAST
endif

4. IDL passes named variables by reference

Some languages let the programmer choose whether an argument is passed by value or by reference. IDL does not offer that choice: named variables are always passed by reference, while expressions, constants, subscripted array elements, and structure fields are passed by value. When you pass a variable, assume that any modification made to it inside the routine will reach the caller, whether the caller expects it or not. This is true for both positional arguments and keywords.
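A small illustration may help here. Both routine names below, add_one and pass_by_reference_demo, are hypothetical. The sketch shows a named variable being changed by the callee, and contrasts it with a subscripted array element, which IDL passes by value.

```idl
; Hypothetical procedure that modifies its argument in place.
pro add_one, x
  compile_opt idl2
  x = x + 1
end

pro pass_by_reference_demo
  compile_opt idl2

  a = 5
  add_one, a
  print, a        ; a is now 6: a named variable is passed by reference

  arr = [1, 2, 3]
  add_one, arr[0]
  print, arr      ; still 1 2 3: a subscripted element is passed by value
end
```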

If a variable will be modified in a routine, but you do not wish to have the modified value make its way out of the routine, then you will need to make a copy of the variable, like this:

PRO myroutine, MY_KEYWORD=myKeywordIn

if Isa(myKeywordIn) then myKeyword = myKeywordIn

; From here on, use the variable "myKeyword," and myKeywordIn will not be modified.

Note: Depending on the variable type, sometimes you may want to use N_ELEMENTS or another form of existence checking rather than the plain ISA function, or you may want to specify keywords on ISA.

Here is a neat trick - you can check to see if the variable's life extends outside the routine by using ARG_PRESENT, and if it doesn't then you can "steal" the value to save memory, like this:

PRO myroutine, MY_KEYWORD=myKeywordIn
if (Isa(myKeywordIn)) then myKeyword = Arg_Present(myKeywordIn) ? myKeywordIn : Temporary(myKeywordIn)

5. IDL's path may contain an outdated version of a file

Here's a frustrating experience I've had before: I ran some code that called a subroutine located in a different file. Something went wrong. Wait, I fixed the error in that file, so why was it still failing? I put a breakpoint in the file to debug, and IDL didn't stop at the breakpoint. Then I realized that IDL was running a different version of the file, which was located somewhere else in the path. 

To make it even more confusing, the routine could be running from a save file. 

There are a few things that can help with this confusion. First, IDL has a preference that will check for duplicate routines. In IDL's preferences dialog, select the "IDL" category and then select "Paths." Next, check the box that says, "Warn when a routine is on IDL's path more than once." Now the IDL console's "Problems" tab will show when a routine is duplicated and where the different versions of the routine exist. If you are having trouble with one routine in particular, you can call ROUTINE_INFO at the command line once the routine has been compiled, providing the routine name and the /SOURCE keyword (plus /FUNCTIONS if it is a function), and IDL will tell you which file it is running the routine from.
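For instance, the check might look like the sketch below. The name my_routine is a placeholder for whichever procedure you are debugging, and the FILE_WHICH call is an additional check I am suggesting rather than something from the steps above.

```idl
; 'my_routine' is a placeholder; substitute the routine you are debugging.
; The routine must already be compiled for ROUTINE_INFO to know its source.
info = Routine_Info('my_routine', /SOURCE)   ; add /FUNCTIONS if it is a function
print, info.path

; FILE_WHICH reports the first my_routine.pro it finds, looking in the
; current directory and then along !PATH.
print, File_Which('my_routine.pro', /INCLUDE_CURRENT_DIR)
```

If the two paths differ, or either one points somewhere unexpected, that is a strong hint that an outdated copy is being picked up.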

6. IDL is sometimes case-sensitive

Another of IDL's conveniences is that most of the time it is not case-sensitive. For instance, FOR and for are compiled into the same thing. ThisVariable and thisVariable would be treated the same. The same goes for N_ELEMENTS or N_Elements. I personally like this because I can spend less head space focusing on the language and more effort focusing on the logic and the algorithm that I'm writing. 

There are, however, the occasional times that IDL is case-sensitive, and problems will arise if the case gets mixed up. Hash keys, for example, are case-sensitive (unless /FOLD_CASE was set when creating the hash). 

h = hash('a', 1, 'A', 'Hello')
help, h['a']

<Expression>    INT       =        1

help, h['A']

<Expression>    STRING    = 'Hello'
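By contrast, here is a brief sketch of a hash created with /FOLD_CASE, assuming a version of IDL recent enough to support that keyword. The variable name hFold is arbitrary.

```idl
; With /FOLD_CASE, key lookups ignore case.
hFold = hash('a', 1, /FOLD_CASE)
print, hFold.HasKey('A')   ; true (1): 'A' matches the key 'a'
help, hFold['A']           ; retrieves the value stored under 'a'
```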

File I/O is another time when it's good to consider case-sensitivity. For instance, file names are not case-sensitive on Windows, but on Unix they are.