Threaded Processing In IDL

Anonym

The IDL_IDLBridge is a useful feature that lets a single IDL session hand work to additional IDL processes.  IDL does have routines that use the system's thread pool, but most of them are math routines.  The IDL_IDLBridge lets you put otherwise unused threads on your system to work on your own code.  To start out, let's look at some code that reads every file in IDL's installation directory and computes the average number of characters per line.

compile_opt idl2

; Find every regular file under IDL's installation directory
filepath = filepath('')
filelist = file_search(filepath, '*', /TEST_REGULAR)
cpl = 0.0

tic
; Loop through the files and calculate the characters per line
foreach file, filelist do begin
  lines = file_lines(file)
  if lines gt 0 then begin
    data = strarr(lines)
    openr, lun, file, /get_lun
    readf, lun, data
    free_lun, lun

    ; Add this file's average to the running total
    cpl += total(strlen(data)) / lines
  endif
endforeach
; Average of the per-file averages
print, cpl/n_elements(filelist)

toc


To convert this code to use the IDL_IDLBridge, we need to convert this logic to a master-worker paradigm.  The first thing we need to do is isolate the work that will be done on the worker threads.  For this example, we have a loop which performs the same operation over and over with minimal dependence on variables outside of the loop.  Let's start by taking that functionality and putting it in its own procedure.

 

pro bridgeFunction, file, data
  compile_opt idl2

  lines = file_lines(file)
  if lines gt 0 then begin
    data = strarr(lines)
    openr, lun, file, /get_lun
    readf, lun, data
    free_lun, lun

    data = (total(strlen(data)) / lines)
  endif else begin
    data = 0
  endelse
end

 

Note: If you are using the IDL_IDLBridge, make sure your routines are on your IDL_PATH.  A worker will only look for a routine on its own path.  If it can't find it, the command will fail.
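Each bridge runs as its own IDL process, so a directory that was added to the path only inside the parent session may not be visible to the worker. One way to rule this out, sketched below, is to hand the parent's !PATH to the worker right after creating it. The names oWorker, parentPath, and found are illustrative and are not part of the example files in this post.

```idl
; Sketch: copy the parent's search path into a newly created worker.
; 'oWorker', 'parentPath' and 'found' are illustrative names only.
oWorker = obj_new('IDL_IDLBridge')
oWorker->setVar, 'parentPath', !PATH
oWorker->execute, '!PATH = parentPath'

; Optional check: ask the worker where it would find bridgeFunction.
; The lowercase file name is what IDL searches for when resolving a routine.
; An empty string means the file is not on the worker's path.
oWorker->execute, "found = file_which('bridgefunction.pro')"
print, 'Worker sees: ', oWorker->getVar('found')

obj_destroy, oWorker
```

If the check prints an empty string, the worker cannot resolve the routine, and any later GETVAR on a variable that routine was supposed to create will fail.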

 

Next, let's set up the master.  The master is responsible for determining how many workers are needed, how to split up the work, and giving workers work when they are free.  The first thing the master needs to do is figure out how many IDL_IDLBridge objects are needed and create them.  This is system specific, but a good place to start is typically half the total number of threads available on the system.

 

; Create a bridge for half the total threads on the system (at least one)
nBridges = (!cpu.TPOOL_NTHREADS/2) > 1
oBridge = objarr(nBridges)
for i=0, n_elements(oBridge)-1 do begin
  oBridge[i] = obj_new('IDL_IDLBridge', $
    Callback='bridgeFunctionCallback')
  oBridge[i].setProperty, userData=0
endfor

 

The USERDATA and CALLBACK are used to determine which processes have completed execution and will be explained later.  The next step is setting up our iteration.  For each file in our directory, we want to tell a worker to count the characters per line.

 

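while filesProcessed lt nFiles do begin
  for i=0, n_elements(oBridge)-1 do begin
    ; Stop handing out work once every file has been assigned
    if nextIndex ge nFiles then break
    oBridge[i].execute, "bridgeFunction,'" + $
      filelist[nextIndex] + "', data"
    cpl += oBridge[i]->getVar('data')
    nextIndex++
    filesProcessed++
  endfor
endwhile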

 

Notice there is a problem with this logic.  Our code is still not threaded!  While each file will be processed in a different thread, EXECUTE waits for that thread to complete before the next one starts, thus losing the benefit of a threaded design.  This problem is easily solved by adding the /NOWAIT keyword to our call to execute.  One consequence of the /NOWAIT keyword is that we are responsible for checking that each bridge has completed its execution.  Luckily for us, the CALLBACK on the IDL_IDLBridge object can help us accomplish this.

 

pro bridgeFunctionCallback, status, error, node, userdata
  compile_opt idl2

  node->setProperty, userData=2
end
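The callback above marks every finished command as successful. IDL also invokes the callback when a command halts on an error (status 3) or is aborted (status 4), and in those cases the worker may never have created the data variable. A minimal sketch of a status-aware variant follows; the procedure name bridgeFunctionCallbackChecked and the extra "failed" state 3 are assumptions made for illustration, not part of the original example.

```idl
; Sketch of a status-aware callback. The routine name and the use of
; userData=3 as a "failed" state are illustrative conventions only.
pro bridgeFunctionCallbackChecked, status, error, node, userdata
  compile_opt idl2

  if status eq 2 then begin
    ; The command completed normally, so its output variable should exist
    node->setProperty, userData=2
  endif else begin
    ; status 3 = halted on an error, status 4 = aborted
    print, 'Worker did not finish: ', error
    node->setProperty, userData=3
  endelse
end
```

A master loop using this variant would need a branch for state 3 that counts the file as processed without calling GETVAR, then returns the worker to state 0.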

 

IDL will call the callback when the thread has ended execution.  We can use this to signal the master that the worker has finished working on its file and is ready for another.  An important thing to keep in mind when programming in the master-worker paradigm is the state of a worker.  In this example we have three states: a ready-for-work state (0), a running state (1), and a finished-execution state (2).  We can represent these states in the USERDATA field.  Accounting for these states, our iteration becomes:

 

; Process each file
while filesProcessed lt nFiles do begin
  for i=0, n_elements(oBridge)-1 do begin
    oBridge[i].getProperty, userdata=status
    ; Check the status of our thread
    switch (status) of
      0: begin
        ; Assign it work if there is work to be had
        if nextIndex lt nFiles then begin
          oBridge[i].setProperty, userData=1
          oBridge[i].execute, "bridgeFunction,'" + $
            filelist[nextIndex] + "', data",/nowait
          nextIndex++
        endif
        break
      end
      2: begin
        ; Capture the results
        filesProcessed++
        cpl += oBridge[i]->getVar('data')
        oBridge[i].setProperty, userData=0
        break
      end
      else: begin
        ; State 1: still running, nothing to do
      end
    endswitch
  endfor
endwhile

 

While we still have files to process, we check each of our threads to see if they need to be assigned a file to work on.  If a thread is done with a file, we fetch the output with GETVAR and set the thread to a ready state which will be picked up on the next iteration of the loop. 
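As an alternative to tracking state through USERDATA, a bridge can be asked directly whether it is still busy by calling its Status method. The sketch below waits for a single asynchronous command this way. The command being run, the variable names, and the 0.01 second pause are arbitrary choices for illustration.

```idl
; Sketch: wait for one asynchronous command by polling the bridge.
; 'oWorker', 'n' and 'result' are illustrative names only.
oWorker = obj_new('IDL_IDLBridge')
oWorker->setVar, 'n', 1000000L
oWorker->execute, 'result = total(findgen(n))', /nowait

; Status() returns 1 while the command is still executing.
; The short pause keeps this loop from spinning at full speed.
while oWorker->status() eq 1 do wait, 0.01

; 3 = halted on an error, 4 = aborted; in both cases 'result' may not exist
state = oWorker->status(error=errMsg)
if (state eq 3) || (state eq 4) then begin
  print, 'Worker failed: ', errMsg
endif else begin
  print, 'Result: ', oWorker->getVar('result')
endelse

obj_destroy, oWorker
```

Note that the value is passed to the worker with SETVAR rather than being pasted into the command string. The same approach could be used for file names, which avoids trouble with paths that contain an apostrophe.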

 

The last thing to note is the overhead of creating an IDL_IDLBridge.  If you have a short-running task, it will often be faster to execute in a single thread instead of using an IDL_IDLBridge.  However, being creative with your IDL_IDLBridge can lead to a marked decrease in processing time.  On my machine, calculating the average characters per line for the IDL install directory (over 20,000 files) took 72.182 seconds.  Using the threaded code, it took only 46.931 seconds, about 35% less.  How cool is that?
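To get a feel for that overhead on a particular machine, the startup cost of a single worker can be timed with the same TIC and TOC routines used in this post. This is only a rough sketch; the clock name and variable names are illustrative.

```idl
; Sketch: time how long it takes to start one worker process.
clock = tic('bridge startup')
oWorker = obj_new('IDL_IDLBridge')
toc, clock

; Shut the worker down again
obj_destroy, oWorker
```

If the time printed here is comparable to the time the whole job takes in a single thread, the bridge is unlikely to pay off for that job.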

 

Sneak Peek:  I've seen code like this used to process large data files.  Every night a cron job would kick off an IDL process which would find all of the new files and process them.  In the next release of IDL we are introducing the WATCHFOLDER routine, which will watch for changes inside of a specified folder and issue a CALLBACK when a change is noticed.  With WATCHFOLDER and IDL_IDLBridge, you could create a threaded system which would process new files when they arrive.

 

Below are the files used.  Copy each section into its own named file and make sure to save them somewhere on IDL's path.

 

; bridgeFunction.pro
;-------------------
pro bridgeFunction, file, data
  compile_opt idl2

  lines = file_lines(file)
  if lines gt 0 then begin
    data = strarr(lines)
    openr, lun, file, /get_lun
    readf, lun, data
    free_lun, lun

    data = (total(strlen(data)) / lines)
  endif else begin
    data = 0
  endelse
end

 

; bridgeExample.pro
;------------------
pro bridgeFunctionCallback, status, error, node, userdata
  compile_opt idl2

  ; Mark this worker as finished so the master can collect its result
  node->setProperty, userData=2
end

;-----------------
pro bridgeexample
  compile_opt idl2

  tic

  ; Create a bridge for half the threads on the system (at least one)
  nBridges = (!cpu.TPOOL_NTHREADS/2) > 1
  print, 'Using ',strtrim(nBridges,2),' threads...'
  oBridge = objarr(nBridges)
  for i=0, n_elements(oBridge)-1 do begin
    oBridge[i] = obj_new('IDL_IDLBridge', $
      callback='bridgeFunctionCallback')
    oBridge[i].setProperty, userData=0
  endfor

  ; Set up our variables
  filepath = filepath('')
  filelist = file_search(filepath,'*',/TEST_REGULAR)
  filesProcessed = 0
  nextIndex=0
  nFiles = n_elements(filelist)
  cpl = 0.0

  ; Process each file
  while filesProcessed lt nFiles do begin
    for i=0, n_elements(oBridge)-1 do begin
      oBridge[i].getProperty, userdata=status
      ; Check the status of our thread
      switch (status) of
        0: begin
          ; Assign it work if there is work to be had
          if nextIndex lt nFiles then begin
            oBridge[i].setProperty, userData=1
            oBridge[i].execute, "bridgeFunction,'" + $
              filelist[nextIndex] + "', data",/nowait
            nextIndex++
          endif
          break
        end
        2: begin
          ; Capture the results
          filesProcessed++
          cpl += oBridge[i]->getVar('data')
          oBridge[i].setProperty, userData=0
          break
        end
        else: begin
          ; State 1: still running, nothing to do
        end
      endswitch
    endfor
  endwhile

  print,'Average characters per line:',cpl/nFiles

  ; Shut down the worker processes
  obj_destroy, oBridge

  toc

end

 

; nonbridgeExample.pro
;---------------------
pro nonBridgeExample
  compile_opt idl2

  ; Find every regular file under IDL's installation directory
  filepath = filepath('')
  filelist = file_search(filepath,'*', /TEST_REGULAR)
  cpl = 0.0

  tic
  ; Loop through the files and calculate the characters per line
  foreach file, filelist do begin
    ; Perform some processing
    bridgefunction,file,data
    cpl += data
  endforeach
  print, 'Average characters per line:', cpl/n_elements(filelist)
  toc

end


Cheers

2 comments on article "Threaded Processing In IDL"

Paul Mallas

This code does not seem to work. I have cut and pasted it from the website and it does not run.

First, oBridge.length does not return a value, but that is an easy enough work around using n_elements(oBridge).

Secondly, the oBridge[i]->getVar('data') states data is undefined:

% Compiled module: BRIDGEEXAMPLE.

Using 4 threads...

% IDL_IDLBRIDGE Error: Undefined variable: 'data'

% Execution halted at: BRIDGEEXAMPLE 87 C:\Users\.......

I am using Windows 7 64 bit IDL 8.3


Paul Mallas

I solved the oBridge[i]->getVar('data') issue but it brings up another. I explicitly set the path to my project containing these files in Preferences->IDL->Paths. However, I already had "Update IDL path when project is opened or closed" enabled in the IDL project properties. I have generally found this works to add relevant paths, but here it did not seem to. Does this option not work with IDL_IDLBridge for some reason?
