String processing performance in IDL

Anonym Thursday, February 19, 2015

IDL performs array-based operations very efficiently, but most processing tasks also require some amount of string parsing and manipulation. I selected three common string processing tasks to analyze in more depth, in order to find the best string processing strategy in each case.

The first example is to find all the strings that start with a given substring. IDL 8.4 added many new intrinsic methods for string variables, and one of them is "StartsWith". Here is the code I used to compare four different approaches to finding which strings in a string array start with the word "end".

pro StrTest_StartsWith
  compile_opt idl2,logical_predicate

  ; Read every line of amoeba.pro into a string array
  f = file_which('amoeba.pro')
  str = strarr(file_lines(f))
  openr, lun, f, /get_lun
  readf, lun, str
  free_lun, lun

  ; Reference answer used to check each method
  first = str.StartsWith('end')

  n = 50000   ; repetitions per method
  times = dblarr(4)
  methods = ['StartsWith','STRCMP','STREGEX','STRPOS']

  for method=0,3 do begin
    t0 = tic()
    case method of
      0: for i=0, n-1 do x = str.StartsWith('end')
      1: for i=0, n-1 do x = strcmp(str,'end',3)
      2: for i=0, n-1 do x = stregex(str,'^end',/boolean)
      3: for i=0, n-1 do x = strpos(str,'end') eq 0
    endcase
    times[method] = toc(t0)
    print, array_equal(x,first) ? 'Same answer' : 'Different answer'
  endfor

  ; Print the methods sorted from fastest to slowest
  print, string(methods[sort(times)] + ':', format='(a-15)') + $
    string(times[sort(times)], format='(g0)'), $
    format='(a)'
end

The first method uses the new intrinsic "StartsWith" method; the second uses STRCMP with a third argument specifying how many characters to compare. The third uses a regular expression with STREGEX, and the final method uses STRPOS and compares the result to 0, meaning the pattern was found starting at position 0. The result I get when I run this code in IDL 8.4 (elapsed times in seconds) is:

  Same answer
  Same answer
  Same answer
  Same answer
  STRCMP: 0.128
  StartsWith: 0.147
  STRPOS: 0.91
  STREGEX: 1.497

All methods return a byte array of zeros and ones indicating where the matches are. STRCMP with 3 arguments ended up being the fastest, with the new "StartsWith" method being a close second. STREGEX should be avoided unless it is really needed for a more complex expression.
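As a quick sanity check, the four expressions can be tried on a small hand-made array. The array s below is a made-up example, not taken from amoeba.pro; each expression should mark only the first two elements as matches.

```idl
; Hypothetical test strings (not from the benchmark file)
s = ['end', 'endif', 'print, x', 'the end']

; Each expression yields a byte mask: 1 where the string starts with 'end'
print, s.StartsWith('end')            ; requires IDL 8.4 or later
print, strcmp(s, 'end', 3)            ; compare only the first 3 characters
print, stregex(s, '^end', /boolean)   ; '^' anchors the match at the start
print, strpos(s, 'end') eq 0          ; first occurrence is at position 0

; Turn the mask into the indices of the matching strings
w = where(strcmp(s, 'end', 3), count)
if count gt 0 then print, s[w], format='(a)'
```

Note that 'the end' contains the pattern but does not start with it, so STRPOS returns a position greater than 0 and the comparison with 0 correctly rejects it.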

In this second example, the goal is to replace the first occurrence of an equal sign (=) with a colon (:) on every line that contains at least one equal sign. If there are additional equal signs, they should remain unchanged. This is mostly useful for converting the format of name/value pairs stored in a text file. I used four different methods to achieve the same result:

pro StrTest_Substring
  compile_opt idl2,logical_predicate

  ; Read every line of amoeba.pro into a string array
  f = file_which('amoeba.pro')
  str = strarr(file_lines(f))
  openr, lun, f, /get_lun
  readf, lun, str
  free_lun, lun

  n = 2000   ; repetitions per method

  ; Reference answer used to check each method
  index = str.IndexOf('=')
  w = where(index ne -1)
  index = index[w]
  first = str
  first[w] = str[w].Substring(0,index-1)+':'+str[w].Substring(index+1)

  methods = ['Substring','STRPUT','Split/Join','BYTARR']
  times = dblarr(4)

  for method=0,3 do begin
    t0 = tic()
    case method of
      ; IndexOf/Substring on the whole array at once
      0: for i=0, n-1 do begin
        index = str.IndexOf('=')
        w = where(index ne -1)
        index = index[w]
        y = str[w]
        x = str
        x[w] = y.SubString(0,index-1)+':'+y.SubString(index+1)
      endfor
      ; STRPOS on the whole array, then STRPUT one string at a time
      1: for i=0, n-1 do begin
        x = str
        pos = strpos(str,'=')
        foreach xx, x, j do begin
          if pos[j] ne -1 then begin
            strput, xx, ':', pos[j]
            x[j] = xx
          endif
        endforeach
      endfor
      ; Split each string on '=', then rejoin the pieces
      2: for i=0, n-1 do begin
        x = str
        foreach xx, x, j do begin
          parts = xx.Split('=')
          if parts.length gt 1 then x[j] = ([parts[0],parts[1:*].join('=')]).join(':')
        endforeach
      endfor
      ; Convert to a byte array and overwrite the first '=' (61b) with ':' (58b)
      3: for i=0, n-1 do begin
        b = byte(str)
        b[maxInd[where(max(b eq 61b, dimension=1, maxInd))]] = 58b
        x = string(b)
      endfor
    endcase
    times[method] = toc(t0)
    print, array_equal(x,first) ? 'Same answer' : 'Different answer'
  endfor

  ; Print the methods sorted from fastest to slowest
  print, string(methods[sort(times)] + ':', format='(a-15)') + $
    string(times[sort(times)], format='(g0)'), $
    format='(a)'
end

  Same answer
  Same answer
  Same answer
  Same answer
  BYTARR: 0.148
  STRPUT: 0.187
  Substring: 0.188
  Split/Join: 1.456
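The BYTARR branch packs several steps into a single line. The sketch below unpacks the same idea on a small made-up array; the variable names isEq, hasEq, and firstEq are introduced here only for illustration and do not appear in the benchmark.

```idl
; Hypothetical input strings (not from the benchmark file)
str = ['a=1', 'no equals here', 'b=2=3']

; 2-D byte array: character codes along dimension 1, one string per row,
; with shorter strings padded by zero bytes
b = byte(str)

; 1 wherever the character is '=' (ASCII 61), 0 elsewhere
isEq = b eq 61b

; Collapse the character dimension: hasEq is 1 for strings containing '=',
; and firstEq holds the 1-D subscript (into b) of the first maximum per string
hasEq = max(isEq, firstEq, dimension=1)

; Overwrite only the first '=' of each matching string with ':' (ASCII 58)
w = where(hasEq, count)
if count gt 0 then b[firstEq[w]] = 58b

; Convert back to a string array; intended result: a:1, no equals here, b:2=3
x = string(b)
print, x, format='(a)'
```

The trick relies on MAX reporting the first occurrence of the maximum, so for a string with several equal signs only the first is touched. Strings with no equal sign have a maximum of 0 and are filtered out by WHERE.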

The cryptic byte array method ended up being the fastest, even though it performs a lot of copying and doesn't contain any obvious string processing functions. This is because IDL runs operations on whole arrays very efficiently; for example, the internal array indexing gives good, predictable memory access patterns. However, I would not recommend this approach here, since the code is very hard to understand and to modify if needed. I would also avoid the Split/Join approach, as it is very inefficient.

Using "IndexOf" and "Substring" is nice here. Note in particular that the "Substring" method is similar to STRMID, but can handle an array of different positions matching the size of the string array. This is a significant improvement over the old STRMID. For example, to extract the beginning of every string up to and including the first "e", you could use:

  IDL> a=['!Hello!', 'test','this one!']
  IDL> a.Substring(0,a.IndexOf('e'))
  !He
  te
  this one

Or, to extract the characters starting at the first colon:

  IDL> x = ((orderedhash(!cpu))._overloadPrint())
  IDL> x
  HW_VECTOR: 0
  VECTOR_ENABLE: 0
  HW_NCPU: 6
  TPOOL_NTHREADS: 6
  TPOOL_MIN_ELTS: 100000
  TPOOL_MAX_ELTS: 0
  IDL> x.Substring(x.IndexOf(':'))
  : 0
  : 0
  : 6
  : 6
  : 100000
  : 0
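To drop the colon itself and keep only the value, the start position can be shifted by one. This is a minimal sketch on a made-up two-element array, and the STRTRIM call is an addition for tidiness rather than something the example above requires.

```idl
; Hypothetical name/value strings in the same 'NAME: value' layout
x = ['HW_NCPU: 6', 'TPOOL_MIN_ELTS: 100000']

; Start one character past the first colon of each string
val = x.Substring(x.IndexOf(':') + 1)

; Remove the leading blank (and any trailing blanks) from each value
print, strtrim(val, 2), format='(a)'
```

For a string with no colon, IndexOf returns -1, so the start position becomes 0 and the whole string is returned unchanged.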

The final example is replacing every occurrence of = with =>. I used two different methods for this: the new "Replace" method on string types, and STRSPLIT/STRJOIN. The results show that the new Replace method is much more efficient.

pro StrTest_Replace
  compile_opt idl2,logical_predicate

  ; Read every line of amoeba.pro into a string array
  f = file_which('amoeba.pro')
  str = strarr(file_lines(f))
  openr, lun, f, /get_lun
  readf, lun, str
  free_lun, lun

  n = 5000   ; repetitions per method

  ; Reference answer used to check each method
  first = str.Replace('=', '=>')

  methods = ['Replace','STRSPLIT']
  times = dblarr(2)

  for method=0,1 do begin
    t0 = tic()
    case method of
      ; Replace on the whole array at once
      0: for i=0, n-1 do begin
        x = str.Replace('=','=>')
      endfor
      ; Split each string on '=' and rejoin the pieces with '=>'
      1: for i=0, n-1 do begin
        x = str
        foreach xx, x, j do x[j] = strjoin(strsplit(xx,'=',/extract),'=>')
      endfor
    endcase
    times[method] = toc(t0)
    print, array_equal(x,first) ? 'Same answer' : 'Different answer'
  endfor

  ; Print the methods sorted from fastest to slowest
  print, string(methods[sort(times)] + ':', format='(a-15)') + $
    string(times[sort(times)], format='(g0)'), $
    format='(a)'
end

  Same answer
  Same answer
  Replace: 0.545
  STRSPLIT: 2.778
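One caveat is worth checking before swapping one approach for the other. By default STRSPLIT does not return zero-length substrings, so a string that begins or ends with = (or contains ==) would lose a delimiter in the STRSPLIT/STRJOIN version. The sketch below uses a made-up array and adds the PRESERVE_NULL keyword to keep the two approaches in agreement; it is an illustration only and was not part of the timed benchmark.

```idl
; Hypothetical name/value strings, including one that ends with '='
s = ['a=1', 'b = c = d', 'x=']

; Replace handles every occurrence, including the trailing '='
print, s.Replace('=', '=>'), format='(a)'

; /PRESERVE_NULL keeps the empty field after the trailing '=' in 'x=',
; so STRJOIN can put the '=>' back in that position
x = s
foreach xx, s, j do $
  x[j] = strjoin(strsplit(xx, '=', /extract, /preserve_null), '=>')

; Compare the two results in the same way as the benchmark does
print, array_equal(x, s.Replace('=', '=>')) ? 'Same answer' : 'Different answer'
```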
