Multi-Threading FAQ

Starting with version 5.5, IDL offers a host of built-in multi-threaded algorithms for performance gains on multi-processor systems. These include many of IDL's core operators and mathematical functions, along with many image processing, array manipulation and type conversion routines. This capability allows you to harness additional CPUs to solve large numerical problems faster in IDL and ENVI.

The details of multi-threading may not be familiar to all IDL and ENVI users. Other terms such as multi-processing and distributed processing are sometimes confused with the concept of multi-threading. Further, the performance of multi-threaded programs varies from run to run and can depend on many factors. Here are answers to frequently asked questions regarding the multi-threading implementation in IDL and how it will affect your experience using IDL and ENVI.

What exactly does "multi-threading" mean?
Why was multi-threading introduced in IDL 5.5?
How do I take advantage of multi-threading in my IDL code?
What if I don't have a multi-processor system?
What routines are multi-threaded in IDL?
How does ENVI take advantage of multi-threading?
What kind of performance gains can I expect?
Under what conditions should multi-threading not be used?
What hardware should I buy?
What is planned for future enhancements to the multi-threading capability?
Why not expose threads at the IDL user level?
What about distributed processing with IDL and ENVI?
Where can I find more information on multi-threading?

What exactly does "multi-threading" mean?

Multi-threading is a way to let programs do more than one thing at a time. It is implemented within a single program running on a single system. It involves an operating system allowing programs to split tasks between multiple threads of execution. On a machine with multiple processors, these threads can execute concurrently, potentially speeding up the task significantly.

Exelis Visual Information Solutions added support for using threads internally to accelerate specific numerical computations on multi-processor systems. This is achieved through the IDL thread pool, a group of threads (excluding the main thread) that are created when IDL starts. On a system with N CPUs, the pool contains N-1 threads by default. Counting the main thread, this gives you one thread for each processor. While the thread pool sleeps, the main thread runs IDL much as it always has, as a single-threaded application, and the inactive pool threads consume few system resources. When IDL encounters a computation that can use the thread pool and can benefit from parallel execution, the main thread assigns work to the N-1 threads of the thread pool and wakes them to run in parallel with the main thread. Once the helper threads finish their tasks and go back to sleep, the main thread continues. To the user, this looks and feels like a single-threaded application that simply runs faster.
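
The pattern described above (N-1 helper threads plus the main thread, each taking a share of one computation) can be sketched outside IDL. The following Python example is a minimal illustration under stated assumptions, not IDL's implementation: the function names are hypothetical, the pool is created per call rather than once at startup, and because of Python's global interpreter lock it shows only how the work is divided, not a real speedup.

```python
import math
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, n_chunks):
    """Split data into n_chunks contiguous pieces of near-equal size."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])
        start = end
    return chunks


def sqrt_chunk(chunk):
    """The per-thread work: an element-wise square root."""
    return [math.sqrt(x) for x in chunk]


def pooled_sqrt(data, n_cpus=None):
    """Compute square roots using n_cpus - 1 helpers plus the main thread."""
    n_cpus = n_cpus or os.cpu_count() or 1
    n_helpers = n_cpus - 1
    if n_helpers == 0:
        # Single-CPU case: behave like an ordinary single-threaded program.
        return sqrt_chunk(data)

    chunks = split_into_chunks(data, n_cpus)
    with ThreadPoolExecutor(max_workers=n_helpers) as pool:
        # Helpers take all chunks except the first one.
        futures = [pool.submit(sqrt_chunk, c) for c in chunks[1:]]
        # The main thread does its own share instead of idling.
        result = sqrt_chunk(chunks[0])
        # Collect helper results in their original order.
        for future in futures:
            result.extend(future.result())
    return result
```

Calling `pooled_sqrt(values, n_cpus=4)` returns the same list as a plain loop would; only the division of labor differs, which mirrors the "looks single-threaded to the user" behavior described above.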

Why was multi-threading introduced in IDL 5.5?

Simply put, Exelis implemented multi-threading in IDL in order to allow you to harness additional CPUs to do more work in less time. Multi-processor hardware is becoming cheap and easily available, and scientific data sets continue to grow in size faster than computers can process them. Multi-processor systems offer one way to handle larger problems faster.

The multi-threading capability applies to binary and unary operators, many core mathematical functions, and a number of image processing, array manipulation and type conversion routines. These computations are threaded transparently, providing an immediate and measurable benefit without requiring any special effort on your part.

How do I take advantage of multi-threading in my IDL code?

It is not necessary to learn any new skills or rewrite your existing code to take advantage of multi-threading in IDL. When IDL encounters a computation that is able to use the thread pool, it decides whether to employ the thread pool to speed up the computation. This is based on the availability of multiple CPUs in the current system as well as on the number of data elements in the input array. Because there are certain situations in which multi-threading may not be desirable, you can control the use of the thread pool through both a global system variable and standard keywords that apply to individual function calls. See the Building IDL Applications documentation for more information on controlling the IDL thread pool.
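
The decision described above can be pictured as a small predicate. The Python sketch below is only an illustration of the idea: the function name, parameter names, and threshold values are hypothetical placeholders, not IDL's actual settings or defaults, which are documented in Building IDL Applications.

```python
def should_use_thread_pool(n_elements, n_cpus, min_elts=100_000,
                           max_elts=None, no_thread=False):
    """Decide whether a computation is worth handing to a thread pool.

    All names and default values here are illustrative assumptions.
    """
    if no_thread:
        # The caller explicitly disabled threading for this call.
        return False
    if n_cpus < 2:
        # Only one CPU: extra threads cannot run concurrently.
        return False
    if n_elements < min_elts:
        # Too few elements: splitting overhead would outweigh the gain.
        return False
    if max_elts is not None and n_elements > max_elts:
        # Too many elements: risk of paging and memory contention.
        return False
    return True
```

In this sketch the defaults play the role of the global setting, while passing `min_elts`, `max_elts`, or `no_thread` for one call plays the role of a per-call keyword override.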

If I don't have a multi-processor system, will changes made in IDL to implement multi-threading affect me?

The IDL thread pool is safe and transparent on platforms that are unable to support threading. Platforms that can benefit will use threads; those that cannot will continue to produce correct results using a single thread, with the same level of performance as previous versions of IDL.

What routines are multi-threaded in IDL?

For a complete list of the routines that are threaded in IDL, see the Building IDL Applications documentation. Note also that IDL already supported multi-threading for volume rendering before version 5.5; that support was re-implemented internally in IDL 5.5 to use the thread pool.

Note that many IDL library routines are implemented in .pro code and may internally use threaded routines. For instance, on a multi-processor system, the CONGRID routine will employ multi-threading for most cases because it calls INTERPOLATE or POLY_2D, both of which are multi-threaded.

Exelis offers a Professional Services Group (PSG) that can be hired to help optimize user-written code or to parallelize specific algorithms beyond those that use the thread pool in IDL and ENVI.

How does ENVI take advantage of multi-threading?

Versions of ENVI that are built on IDL 5.5 and up "inherit" the multi-threading capability under the hood.

What kind of performance gains can I expect?

The multi-threading capability in IDL makes it possible to significantly accelerate numerical computations on multi-processor systems. Exelis testing for IDL 5.5 showed that on a Sun Ultra 80 system running Solaris 8 with four 450 MHz UltraSPARC II processors and 4 GB of RAM, an IDL multi-threaded floating-point square root operation on a vector of 9.8 million elements ran 3.8 times faster than it did with only one processor on the same machine. The same IDL computation running on a Dell Precision 620 PC running Windows NT 4 with two 600 MHz Intel Pentium III processors and 128 MB of RAM ran 1.9 times faster. On the Solaris quad-processor system, ENVI 3.5's Maximum Likelihood Classification and MNF rotation both ran 2.7 times faster when applied to a 405 x 400 x 126 integer image cube as compared to the result for a single thread.

It is important to realize that threading performance depends on many factors, including the calculation, the size of the input data, the hardware and the underlying operating system. Multi-threading will not always speed up a computation. In fact, there are certain cases where the thread pool may actually increase execution time (see "Under what conditions should multi-threading not be used?").

The Multi-Threading in IDL white paper presents the results of testing on several different systems running various computations. While these measurements cannot be safely generalized, they do show that it is not uncommon to approach the ideal improvement of N times faster, where N is the number of available CPUs. They also show that the performance of threaded computations can vary greatly from system to system. The ideal improvement is not fully attainable because there is always some part of a computation that cannot be parallelized (including the time required to divide work between threads), and there is a natural diminishing rate of improvement in performance that comes with the addition of each CPU due to hardware constraints.
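
The limit imposed by the non-parallel part of a computation is commonly modeled with Amdahl's law, which this FAQ does not name but which matches the reasoning above. The Python sketch below applies that textbook model; the function names are hypothetical, and the model ignores the hardware effects mentioned above, so any parallel fraction it infers from a measured speedup is a rough illustration rather than a property of IDL.

```python
def amdahl_speedup(parallel_fraction, n_cpus):
    """Speedup predicted by Amdahl's law for a given parallel fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_cpus)


def implied_parallel_fraction(speedup, n_cpus):
    """Invert Amdahl's law: the parallel fraction a measured speedup implies."""
    if n_cpus < 2:
        raise ValueError("need at least 2 CPUs to infer a parallel fraction")
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / n_cpus)


def parallel_efficiency(speedup, n_cpus):
    """Fraction of the ideal N-times speedup that was actually achieved."""
    return speedup / n_cpus
```

For instance, under this model a speedup of 3.8 on four CPUs corresponds to `implied_parallel_fraction(3.8, 4)`, roughly 0.98, and `parallel_efficiency(3.8, 4)` gives 0.95 of the ideal.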

Under what conditions should multi-threading not be used?

While multi-threading can provide significant speedups, it is possible to encounter certain situations that can result in poorer performance relative to the single-threaded alternative. For instance, if a computation involves too few elements, the overhead involved in splitting a problem between threads may exceed the gain. If a computation involves too many elements for the system memory, the virtual memory system will be activated (paging), and threads may begin competing for access to memory.

When IDL encounters eligible computations, it determines whether or not to use the IDL thread pool to carry them out. This is based on the availability of multiple CPUs in the current system as well as on the number of data elements in the input array. The latter criterion is somewhat heuristic because IDL cannot know all of the information necessary to determine the effect multi-threading will have on performance. Using the number of elements in the input array as a factor in deciding whether to employ the thread pool in a given computation is a good rule of thumb. As with all rules of thumb, there are situations in which it applies less well. There are also other reasons threading may not be desired. For instance, you may want to limit threading out of courtesy to other users on a multi-user system, or because the rounding of finite-precision floating-point types may produce different (although equally correct) results in algorithms that are sensitive to the order of operations. For all of these reasons, you are provided with a simple interface to control the parameters IDL uses when deciding to employ multi-threading. See the Building IDL Applications documentation for more information on controlling the IDL thread pool.
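
The sensitivity to order of operations mentioned above is a general property of floating-point arithmetic and can be shown in a few lines. The Python sketch below is not IDL's algorithm: the function names are hypothetical and the input values are deliberately extreme. It sums the same four numbers once left to right and once as two per-chunk partial sums, the way a threaded reduction might.

```python
def sequential_sum(values):
    """Add values left to right, as a single thread would."""
    total = 0.0
    for v in values:
        total += v
    return total


def chunked_sum(values, n_chunks):
    """Add values chunk by chunk, then add the partial sums.

    This mimics how work split across threads regroups the additions.
    """
    if not values:
        return 0.0
    size = -(-len(values) // n_chunks)  # ceiling division
    partials = [sequential_sum(values[i:i + size])
                for i in range(0, len(values), size)]
    return sequential_sum(partials)


# Values chosen so that rounding depends on how the additions are grouped.
values = [1e16, 1.0, -1e16, 1.0]
single = sequential_sum(values)   # ((1e16 + 1) - 1e16) + 1
split = chunked_sum(values, 2)    # (1e16 + 1) + (-1e16 + 1)
```

Here `single` and `split` differ even though the inputs are identical, because adding 1.0 to a magnitude of 1e16 is lost to rounding in double precision. Neither grouping recovers the exact sum of 2; the difference comes only from the grouping.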

What hardware should I buy?

Any supported multi-processor system can be used effectively with the IDL thread pool. However, the best system to buy depends entirely on the specific job for which you intend to use it. There are many variables involved in thread pool performance (hardware, data size, operating system, algorithm). While it is possible to measure the performance of the thread pool on multiple systems and multiple computations, the results cannot be safely generalized. The best piece of advice is to run your own program on a potential system, if at all possible, and use that information to make your choice. Naturally, raw performance will need to be weighed against other criteria when choosing hardware, including affordability, usability and maintainability.

Why not expose threads at the IDL user level?

The Multi-Threading in IDL white paper has a good discussion of why the thread pool implementation was chosen over alternatives such as exposing threads at the user level. In summary, the reasons are:

Exposing threads at the user level in IDL would require a resource-intensive re-architecting of IDL. The result could be error-prone and have poorer performance overall, so it is not clear that doing so would be worthwhile.
IDL users tend to be scientists rather than computer scientists, and they value the relative ease of programming in IDL compared to traditional languages such as C/C++. Programming with threads would introduce a high degree of complexity, including intermittent, timing-dependent bugs in IDL code that would be hard to track down. Most IDL customers want faster numerical computations, and these can be achieved more easily and quickly by tightly focused, internal threading with the IDL thread pool.

What about distributed processing with IDL and ENVI?

The multi-threading capabilities in IDL are not the same as distributed processing. While distributed processing (sometimes called parallel processing) and multi-threading are both techniques for achieving parallelism, and can be used in combination, they are fundamentally different. Multi-threading involves splitting a problem between concurrent threads on a single system, which requires support from the underlying language and operating system. Distributed processing involves splitting a problem among separate machines connected over a fast network (e.g. a Linux cluster), which requires a support framework to manage communication but does not require direct internal support within IDL. For example, Tech-X has developed FastDL, a solution that allows IDL users to run IDL applications on networked Linux clusters. FastDL consists of two independent components that address the varying needs of parallel data analysis and visualization applications: TaskDL and mpiDL. See the Tech-X web site for more information.

Where can I find more information on multi-threading?

The Building IDL Applications documentation contains a list of all threaded routines, a description of the thread pool and how to control it, and code examples. For a technical description of the motivation behind the thread pool and performance benchmarks, consult the Multi-Threading in IDL white paper. Exelis also maintains a searchable tech tip database that includes tips on multi-threading. These tech tips will continue to be generated as questions come up, so check back from time to time.

Review on 12/31/2013 MM
