Robust detection of defined variables

Anonym

My recent work on the upcoming ENVI 5.2 API got me thinking about how to validate input parameters correctly. One of my goals in writing a robust ENVI API is to identify any and all invalid inputs and then throw helpful error messages that guide users in fixing them. The most common approach is to call N_Elements on the variable and compare the result against 0. The explicit comparison matters: without compile_opt logical_predicate, IDL treats even integers as logically false, so using the count itself as the condition would fail for any array with an even number of elements:

if (N_Elements(bands) gt 0) then ...

While writing my unit test suites, I found a way to confound this type of test. Any class that inherits from IDL_Object can override the _overloadSize() method, which changes the value returned by N_Elements and Size(/DIMENSIONS). The new Hash and List classes do just this, as you can see when we compare the results from a null object and an empty Hash object:

IDL> o = Obj_New()

IDL> N_Elements(o)

           1

IDL> Size(o)

           0          11           1

IDL> Size(o, /DIMENSIONS)

           0

IDL> h = Hash()

IDL> N_Elements(h)

           0

IDL> Size(h)

           0          11           0

IDL> Size(h, /DIMENSIONS)

           0

If a user passes in an empty Hash or List object, N_Elements returns 0, so the test above treats a valid (if empty) object as though nothing had been passed at all.

My next approach was to use the ISA function, to identify variables that weren't undefined:

if (ISA(bands)) then ...

But there are at least two exceptions to this rule as well. If you call ISA() on a null object or null pointer, it returns 0, not 1 as I had hoped, though it does detect empty Hash and List objects:

IDL> print, ISA(Obj_New())

   0

IDL> print, ISA(Ptr_New())

   0

IDL> print, ISA(Hash())

   1

IDL> print, ISA(List())

   1

My next approach was to use the Typename() function, which does fit the bill:

if (Typename(bands) ne 'UNDEFINED') then ...

This approach relies on the fact that Typename returns the string 'UNDEFINED' for any undefined variable and some other string for everything else:

IDL> print, Typename(Obj_New())

OBJREF

IDL> print, Typename(Ptr_New())

POINTER

IDL> print, Typename(Hash())

HASH

IDL> print, Typename(List())

LIST

IDL> print, Typename(foo)

UNDEFINED

IDL> print, Typename(!null)

UNDEFINED

IDL> print, Typename([])

UNDEFINED

IDL> print, Typename({})

UNDEFINED

The downside to this approach is that it relies on string comparisons, which aren't as efficient as comparing a single integer. The better alternative is to use Size(/TYPE), which returns one of IDL's integer type codes:

if (Size(bands, /TYPE) ne 0) then ...

This is a simple integer comparison, and it behaves as we want it to:

IDL> print, Size(Obj_New(), /TYPE)

          11

IDL> print, Size(Ptr_New(), /TYPE)

          10

IDL> print, Size(Hash(), /TYPE)

          11

IDL> print, Size(List(), /TYPE)

          11

IDL> print, Size(foo, /TYPE)

           0

IDL> print, Size(!null, /TYPE)

           0

IDL> print, Size([], /TYPE)

           0

IDL> print, Size({}, /TYPE)

           0
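As a rough sketch of how the Size(/TYPE) test might be used to validate a keyword and raise a helpful error, consider the routine below. The name process_bands and the wording of the error message are hypothetical and not part of IDL or ENVI.

```idl
; Hypothetical routine: validates the BANDS keyword with Size(/TYPE).
pro process_bands, BANDS=bands
  compile_opt idl2

  ; Type code 0 means the keyword was omitted, never set, or is !null.
  if (Size(bands, /TYPE) eq 0) then begin
    Message, 'BANDS is required; pass an array, List, or Hash of bands.'
  endif

  ; Anything else, including an empty Hash or List, gets this far.
  print, 'Processing ' + StrTrim(N_Elements(bands), 2) + ' band(s)'
end
```

Calling process_bands with no keyword stops with the error message, while process_bands, BANDS=Hash() is accepted and is left for later checks to reject or handle as an empty container.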

As you can see, there are a few different ways to express an undefined variable, which can be a problem. When a particular keyword is not used in a function call, it is truly undefined; the same is true if the variable passed in for that keyword has never been set. But the last three cases I showed are different ways to express a null variable: the explicit !null literal, the empty array syntax [], and the empty struct syntax {}. These last two may look funky, but they have their places in IDL 8.x. Back in the days of IDL 7.1 and earlier, it was common to see code like this:

outData = [0]

for i = 0, N_Elements(inData)-1 do begin

  outData = [ outData, inData[i] ]

endfor

outData = outData[1: *]

 

We had to create a dummy first element to concatenate to, then strip it off afterwards. Since IDL 8.0 this is a little simpler:

outData = []

for i = 0, N_Elements(inData)-1 do begin

  outData = [ outData, inData[i] ]

endfor

 

The empty struct construct can be used for repeated calls to Create_Struct() to append new tags. In either case if the for loop doesn't get executed then we have an empty array or struct.
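The struct case can be sketched the same way as the array loop. This is only an illustration: the function name build_struct and its arguments are hypothetical, and it assumes, as described above, that Create_Struct() accepts the empty struct as its first argument.

```idl
; Hypothetical helper: builds a struct one tag at a time, starting from {}.
function build_struct, tagNames, tagValues
  compile_opt idl2

  result = {}
  for i = 0, N_Elements(tagNames)-1 do begin
    ; Append the next tag to whatever result holds so far.
    result = Create_Struct(result, tagNames[i], tagValues[i])
  endfor

  ; If the loop never ran, result is still the empty struct.
  return, result
end
```

With two names and two values the result should carry two tags. With no inputs the loop body never runs and the caller gets the empty struct back.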

With the advent of List and Hash, we can also get an empty array or struct from their ToArray() and ToStruct() methods respectively when the container object has no elements. With ENVI Services Engine accepting REST requests containing JSON inputs, we could end up with empty Hash and List objects.

You may ask who cares: a !null or empty array is just as ignorable as an undefined keyword, and in many cases that is a reasonable assumption. But there are times when we want to discriminate between a !null and an undefined, and none of the above tests can do this. In fact, we do this in enviTask, where setting oTask.ParamName=!null unsets that parameter's value, so that its default value is returned instead, if one is set. The way to tell the two apart is with the /NULL keyword to ISA:

IDL> ISA(Obj_New(), /NULL)

   0

IDL> ISA(Ptr_New(), /NULL)

   0

IDL> ISA(Hash(), /NULL)

   0

IDL> ISA(List(), /NULL)

   0

IDL> ISA(foo, /NULL)

   0

IDL> ISA(!null, /NULL)

   1

IDL> ISA([], /NULL)

   1

IDL> ISA({}, /NULL)

   1

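So putting it all together, the best way to validate undefined vs null vs non-null is to test for null first, since Size(/TYPE) returns 0 for both !null and undefined variables:

if (ISA(val, /NULL)) then begin

  ; val is !null, [] or {}

  ...

endif else if (Size(val, /TYPE) eq 0) then begin

  ; val is undefined

  ...

endif else begin

  ; val holds a value

  ...

endelse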
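For reuse, the three-way test could be wrapped in a small function. This is a minimal sketch: the name classify_input and its return strings are hypothetical. It checks ISA(/NULL) before Size(/TYPE) because, per the results shown earlier, Size(/TYPE) is 0 for both !null and undefined variables.

```idl
; Hypothetical helper: returns 'NULL', 'UNDEFINED', or 'VALUE'.
function classify_input, val
  compile_opt idl2

  ; !null, [] and {} must be caught first; they also have type code 0.
  if (ISA(val, /NULL)) then return, 'NULL'

  ; Type code 0 that is not !null means the variable was never set.
  if (Size(val, /TYPE) eq 0) then return, 'UNDEFINED'

  ; Everything else, including null objects and empty containers.
  return, 'VALUE'
end
```

Going by the ISA and Size results above, a variable set to !null should come back as 'NULL', a never-assigned variable as 'UNDEFINED', and Obj_New() or an empty Hash as 'VALUE'.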