10878 Rate this article:

Minimizing Rounding Errors with IDL's TOTAL function

The idea for my blog topic this week came from a question from Ron Kneusel about the loss of precision using IDL’s TOTAL function. IDL functions are often implemented in a way that makes them run fast, since processing speed is important. This is certainly true for the TOTAL function. If you are not using any of the keywords and additional arguments, then it is basically adding to the sum in the input array order. The exception is if you have enough elements in the array to trigger the multi-threaded code path, which I will ignore for now.
The issue with loss of precision is therefore best seen if you start with the largest term and keep adding smaller and smaller terms. At some point, the small term gets so small compared to the large sum that the additional terms get lost in round off.
Here is an example of a sum that we know should approach 128 (with a deviation in the 52 decimal place):
IDL> data = (1-1d/128) ^ lindgen(15000)
IDL> 128 - total(data)
So, we are getting an error in the 13th decimal place in this case. This array starts with the largest term and ends with the smallest term, so that is the worst case for the TOTAL implementation. If we reverse the order of the terms, we get:
IDL> 128 - total(reverse(data))
The error gets 10 times smaller in this case. I decided to compare the algorithm that Ron suggested, which is called Kahan sum, with a divide-and-conquer scheme that I have previously used for execution on a massively parallel GPU (which runs fast with 10000’s of independent threads). The Kahan sum algorithm can be found on Wikipedia and the IDL code is pretty short, (but very slow):
; Kahan algorithm, from wikipedia
function KahanSum, data
  sum = 0d
  c = 0d
  for i=0, data.length-1 do begin
    y = data[i] - c
    t = sum + y
    c = (t - sum) - y
    sum = t
  return, sum
The massively parallel algorithm will run much faster than the Kahan Sum, and the IDL code is listed here:
; Divide and concure total,
; algorithm lends itself to massively parallel execution
function total_mp, data
  compile_opt idl2,logical_predicate
  n = ishft(1ull,total(ishft(1ull,lindgen(63)) lt n_elements(data),/integer))
  pad = make_array(n, type=size(data,/type))
  pad[0] = data[*]
  while n gt 1 do begin
    pad[0:n/2-1] += pad[n/2:n-1]
    n /= 2
  return, pad[0]
Here are the result with the decreasing magnitude terms:
IDL> 128 - KahanSum(data)
IDL> 128 - total_mp(data)
So, in this case the error is similar to the best case sorted input to TOTAL. As an independent test, I also tried randomly ordering the terms, and both KahanSum and TOTAL_MP, are still consistent on the order of 5e-14:
IDL> 128 - KahanSum(data[order])
IDL> 128 - total_mp(data[order])
IDL> 128 - total(data[order])
My conclusion is that the TOTAL_MP example is just as accurate as the Kahan sum, and has the added advantage of allowing for massively parallel execution, as opposed to the highly sequential execution needed for the Kahan algorithm.