Internal: Floating Point Errors - Using Traps

Anonym Thursday, May 21, 1998

Update 01/21/13: This seems like technical information from a long time ago. However, it might be good to reference if some weird floating point error is occurring on a customer's strange old system.

Topic:

There is a good deal of confusion over handling floating point errors with and without floating point traps. The discussion below is from Dave Stern on this topic.

Note: Much of this is also explained in the Programming chapter, Chapter 11 of the User's Guide.

Discussion:

If floating point traps are enabled, the computer raises an "exception" (also called a "trap" or "fault") when the error is detected. The program aborts the offending instruction before it has completed and jumps to a "signal" or "trap" handler routine in IDL, which logs the error. Unfortunately, the trap handler doesn't know what the result of the illegal operation should be, or into which register or memory location the result should go. Indeed, on most modern RISC machines with an instruction "pipeline", a number of instructions AFTER the offending operation have already been executed, or are partway through the execution process.

So in this case you are notified of the error at the time it occurs, but the result contains garbage.
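
The trade-off can be sketched with a loose analogy in Python (not IDL, and not a real hardware trap): Python raises ZeroDivisionError for a float division by zero, so the failure is reported exactly where it happens, but the division never yields a value to store. The function name divide_all and the use of None as a stand-in for "garbage" are assumptions made for this illustration only.

```python
# Analogy only: a Python exception stands in for a hardware trap.
# divide_all is a hypothetical helper, not part of IDL or any library.
def divide_all(numerators, denominator):
    results = []
    for i, x in enumerate(numerators):
        try:
            results.append(x / denominator)
        except ZeroDivisionError as exc:
            # We know exactly which element failed...
            print(f"error at index {i}: {exc}")
            # ...but the aborted division produced no value, so there is
            # nothing meaningful to store. None marks the hole here.
            results.append(None)
    return results


trapped = divide_all([1.0, 2.0], 0.0)
```

Every element is flagged at the point of failure, yet the result list holds no usable numbers for those elements.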

Another complicating factor is that no two machines implement exceptions the same way. For example, VAX machines don't use the IEEE standard and have no way of representing NaN or Infinity.

When traps are disabled, the IEEE standard specifies the result of each possible illegal operation, usually either NaN (not a number) or Inf (infinity). When an error is detected, an error status bit is usually set in a register and execution continues along the normal path. When the statement or program is finished, IDL checks this register and prints the warning message. The problem here is that you only know an error occurred, not WHERE it occurred.
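
A minimal sketch of this behavior, written in Python rather than IDL: Python float multiplication and subtraction follow the IEEE rules without raising, so an overflow silently becomes Inf and Inf minus Inf becomes NaN. The standard library does not expose the hardware status bit, so the single check at the end inspects the results instead; that substitution, and the function name scale_and_center, are assumptions for illustration.

```python
import math


# Hypothetical helper: no per-operation checks, so bad values simply
# propagate through the calculation as Inf or NaN.
def scale_and_center(values, scale):
    scaled = [v * scale for v in values]
    mean = sum(scaled) / len(scaled)
    return [s - mean for s in scaled]


result = scale_and_center([1.0, 2.0, 1e308], 10.0)
# 1e308 * 10.0 overflows to Inf, the mean becomes Inf, and the final
# subtraction Inf - Inf yields NaN.

# One check after everything has run, standing in for the status-bit
# check made when the statement finishes.
error_seen = not all(math.isfinite(r) for r in result)
if error_seen:
    print("floating point error detected somewhere in the calculation")
```

The final check reports that something went wrong, but nothing in it says whether the multiply, the mean, or the subtraction was the first operation to fail.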

It seems logical to ask, "Why can't we have it both ways?" You can, but the cost is that you must explicitly check the result of each floating point operation for NaN or Inf, using the finite() function, and issue an error message. I estimate that this would make floating point operations slower by a factor of two to five.
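
The explicit-check approach might look roughly like the following Python sketch, with math.isfinite standing in for IDL's finite() function. The helper names checked and scale_and_sum are hypothetical, and the sketch makes no attempt to measure the slowdown estimated above.

```python
import math


# Hypothetical helper: test one result and record where it went bad.
def checked(value, label, errors):
    if not math.isfinite(value):
        errors.append(label)
    return value


# Every multiply and every add is followed by its own finiteness test,
# so each failure is tied to a specific operation and element.
def scale_and_sum(values, scale):
    errors = []
    total = 0.0
    for i, v in enumerate(values):
        scaled = checked(v * scale, f"multiply, element {i}", errors)
        total = checked(total + scaled, f"add, element {i}", errors)
    return total, errors


total, errors = scale_and_sum([1.0, 1e308, 2.0], 10.0)
for label in errors:
    print(f"floating point error in {label}")
```

The first recorded label pinpoints the operation that introduced the Inf, and the result still holds the IEEE value. The price is an extra function call and test wrapped around every arithmetic operation.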