09 Jun 2015 02:43 AM |
|
Dear everybody,
I want to cut out differntly sized sub-arrays from one Source_Array into.
Given the following already existing 3 lists with equal number of elements N.
- Target_ArrayL
- Target_IndexL
- Source_IndexL
And a 3D Array:
- Source_Array
I have written the following for loop:
FOR i=1,N-1 DO $
Target_ArrayL[i,Target_IndexL[i,*,0],Target_IndexL[i,*,1],Target_IndexL[i,*,2]]$ = Source_Array[Source_IndexL[i,*,0],Source_IndexL[i,*,1],Source_IndexL[i,*,2]]
It works as expected but as the loop progresses the loop gets slower and slower and slower. I tried indexing all list elements in one run but as it turns out IDL does not allow that. So the following two questions:
Is there a way I could speed it up?
Why does the performance go down within the loop? The only thing that changes over the loop is that the NaN values in the Target_ArrayL get replaced by actual values.
Any insights are much appreciated
Niklas
|
|
|
|
Zachary Norman Basic Member
Posts:173  
10 Jun 2015 10:03 AM |
|
Niklas,
Do you think that you could include an entire example program with definitions for your arrays? I think that this would help me better understand your questions if I can see everything that you are doing and test it out. Thanks in advance for the additional information!
|
|
|
|
Deleted User New Member
Posts:15  
11 Jun 2015 06:54 AM |
|
Dear Zach,
I already found a solution that works for me and want to illustrate it in the example below. However one technical question remains (cp. end of post).
;-------------------------------------------------------------------------------------------
; map function used within main script, save and compile in "mapfunc.pro"
;-------------------------------------------------------------------------------------------
FUNCTION mapFunc,a,b
a[b[*,0],b[*,1],b[*,2]]=42.0D
return,a
END
;========================== start of main script =============
;-------------------------------------------------------------------------------------------
; Generate random sample data:
;-------------------------------------------------------------------------------------------
n=20000
a=Round( RandomU(seed, n) * 9 )+2
; Target_ArrayL: List of 2-D arrays with different sizes
Target_ArrayL=LIST()
FOR i=0,n-3 DO Target_ArrayL.Add,REPLICATE(!Values.F_NAN,a[i],a[i+1],a[i+2])
; Indices of those arrays
Target_IndexL=Target_ArrayL.map(LAMBDA(x:ARRAY_INDICES(x,WHERE(x))))
;-------------------------------------------------------------------------------------------
; quick version:
;-------------------------------------------------------------------------------------------
result=(Target_ArrayL).map('mapFunc',Target_IndexL)
;-------------------------------------------------------------------------------------------
; slow version:
;-------------------------------------------------------------------------------------------
t=LIST()
FOR i=0UL,n-3 DO BEGIN &$
Target_ArrayL[i,Target_IndexL[i,*,0],Target_IndexL[i,*,1],Target_IndexL[i,*,2]] = 42.0D &$
print,i,"/",n-3 &$
t.add,systime(1) &$
ENDFOR
;--------------------------------------------------------------------------------------------------------------------------------------------------------
; Plot processing time development within loop for slow version reveals exponential dependence of time for one loop run and i (number of loop)
;--------------------------------------------------------------------------------------------------------------------------------------------------------
plot(t.toarray())
;============ End Of Code ==============================================
Now just one technical question remains: Why does the loop (slow version) get exponentially slower? In the IDL documentation it is stated that "List is implemented as a singly-linked list of pointers" ( http://www.exelisvis.com/docs/LIST.html). Doesn't that suggest that the performance of changing the value of one element in a list does not depend on the rest of the list and thus each loop run should be equally fast? And why is the list comprehension version quicker?
|
|
|
|
Zachary Norman Basic Member
Posts:173  
12 Jun 2015 08:37 AM |
|
Hi Niklas,
I made a few changes to your code, see them below.
1) For the increasing time, I expect this result with the way that you programmed it. When you keep adding a value of the system time to the list, then that value of time is inherently increasing. To get the time to do each calculation in the loop, you will need to get the difference in time between the array elements as I did. This will give you a plot where there is a small range of times for the calculations. Additionally, to plot the time, I needed to remove the indices where the value of time passed was 0. I'm pretty sure that sometimes the PC does the loop so quickly, that it cannot detect a difference in the system time. With 20000 repeats of the loop, I have about 800-1000 values of time that are actual numbers.
2) For the loop speed, I made a small test. I included a small bit of code to see how long takes to use your Target_IndexL. Here is the code I added for that:
tic
FOR i=0,n-3 DO BEGIN
temp1 = Target_IndexL[i,*,0]
temp2 = Target_IndexL[i,*,1]
temp3 = Target_IndexL[i,*,2]
ENDFOR
toc
Timing that loop and timing the loop which alters your list elements, I get that my addition takes 14-15 seconds which is just as long as the loop where you alter the array indices. I think that this points out how efficient the List is and that it actually is pointers. I would say that it is your Target_IndexL that is slowing your code down because you need to generate 3, 3 by 1 arrays every step and that is passed through a lambda function. I would also say that using a mapFunc as you do is probably highly optimized for Lists, whereas your alternate method is not and it may be hard to even come close to using map for Lists
;-------------------------------------------------------------------------------------------
; map function used within main script, save and compile in "mapfunc.pro"
;-------------------------------------------------------------------------------------------
FUNCTION mapFunc,a,b
a[b[*,0],b[*,1],b[*,2]]=42.0D
return,a
END
pro looping_speed_question
compile_opt idl2
ireset, /no_prompt ;this closes all plot windows
;========================== start of main script =============
;-------------------------------------------------------------------------------------------
; Generate random sample data:
;-------------------------------------------------------------------------------------------
n=20000.
a=Round( RandomU(seed, n) * 9 )+2
; Target_ArrayL: List of 2-D arrays with different sizes
Target_ArrayL=LIST()
FOR i=0,n-3 DO Target_ArrayL.Add,REPLICATE(!Values.F_NAN,a[i],a[i+1],a[i+2])
; Indices of those arrays
Target_IndexL=Target_ArrayL.map(LAMBDA(x:ARRAY_INDICES(x,WHERE(x))))
;-------------------------------------------------------------------------------------------
; quick version:
;-------------------------------------------------------------------------------------------
result=(Target_ArrayL).map('mapFunc',Target_IndexL)
;-------------------------------------------------------------------------------------------
; slow version:
;-------------------------------------------------------------------------------------------
;look at time to get our three target_index values
tic
FOR i=0,n-3 DO BEGIN
temp1 = Target_IndexL[i,*,0]
temp2 = Target_IndexL[i,*,1]
temp3 = Target_IndexL[i,*,2]
ENDFOR
toc
tic
;get actual times to do calculations
t=list()
time_start = systime(1)
FOR i=0,n-3 DO BEGIN
Target_ArrayL[i,Target_IndexL[i,*,0],Target_IndexL[i,*,1],Target_IndexL[i,*,2]] = 42.0D
;print,i,"/",strtrim(n-3,1)
t.add,systime(1)
ENDFOR
toc
;correct for time passed sicne start of loop
temp = t.toarray() -time_start
;find the difference in time, not the total time
for i=n_elements(temp)-1, 1, -1 do temp[i] += -temp[i-1]
;plot the times only where they have a value
times_plot = temp[where(temp gt 0)]
p = plot(times_plot)
;============ End Of Code ==============================================
end
|
|
|
|
Deleted User New Member
Posts:15  
15 Jun 2015 06:47 AM |
|
Dear Zach,
thanks a lot for the detailed response!
I am still a little bit confused about the time it takes the algorithm to go through each loop. Maybe I just didn’t understand you 100%. To illustrate my confusion I added another 3 lines of code at the end:
; create plot of moving sum of 100 values
mSum=LIST()
for i=0,n_elements(temp)-100 DO mSum.add,TOTAL(temp[i:i+99])
p2 = plot(mSum.ToArray())
The resulting plot is pretty linear so that means that the time it takes to complete each step is increasing linearly through the loop. This could be expressed as:
TimePerRun(run)=k*run ; run: number of loop, k: some constant
NetTime=Integral of TimePerRun (run) = ½*k * run ^2.
This is the quadratic plot from my first post. Thus the net time for the whole loop depends quadratically on the number of loops. This would make sense if the operations would get more time consuming as with arrays that need to be accessed as a whole and grow within the loop. The use of pointers for each element in the list I figured would make the access of one element in the list independent of the values stored in the rest of the list and thus I expected a constant time for each loop and a linear relationship between the total number of runs and the total time needed to complete the loop.
I hope I am making sense :)
Have a good day
Niklas
;-------------------------------------------------------------------------------------------
; map function used within main script, save and compile in "mapfunc.pro"
;-------------------------------------------------------------------------------------------
FUNCTION mapFunc,a,b
a[b[*,0],b[*,1],b[*,2]]=42.0D
return,a
END
pro looping_speed_question
compile_opt idl2
ireset, /no_prompt ;this closes all plot windows
;========================== start of main script =============
;-------------------------------------------------------------------------------------------
; Generate random sample data:
;-------------------------------------------------------------------------------------------
n=20000.
a=Round( RandomU(seed, n) * 9 )+2
; Target_ArrayL: List of 2-D arrays with different sizes
Target_ArrayL=LIST()
FOR i=0,n-3 DO Target_ArrayL.Add,REPLICATE(!Values.F_NAN,a[i],a[i+1],a[i+2])
; Indices of those arrays
Target_IndexL=Target_ArrayL.map(LAMBDA(x:ARRAY_INDICES(x,WHERE(x))))
;-------------------------------------------------------------------------------------------
; quick version:
;-------------------------------------------------------------------------------------------
result=(Target_ArrayL).map('mapFunc',Target_IndexL)
;-------------------------------------------------------------------------------------------
; slow version:
;-------------------------------------------------------------------------------------------
;look at time to get our three target_index values
tic
FOR i=0,n-3 DO BEGIN &$
temp1 = Target_IndexL[i,*,0] &$
temp2 = Target_IndexL[i,*,1] &$
temp3 = Target_IndexL[i,*,2] &$
ENDFOR &$
toc
tic
;get actual times to do calculations
t=list()
time_start = systime(1)
FOR i=0,n-3 DO BEGIN &$
Target_ArrayL[i,Target_IndexL[i,*,0],Target_IndexL[i,*,1],Target_IndexL[i,*,2]] = 42.0D &$
;print,i,"/",strtrim(n-3,1)
t.add,systime(1) &$
ENDFOR
toc
;correct for time passed since start of loop
temp = t.toarray() -time_start
;find the difference in time, not the total time
for i=n_elements(temp)-1, 1, -1 do temp[i] += -temp[i-1]
;plot the times only where they have a value
times_plot = temp[where(temp gt 0)]
p = plot(times_plot)
; create plot of moving sum of 100 values
mSum=LIST()
for i=0,n_elements(temp)-100 DO mSum.add,TOTAL(temp[i:i+99])
p2 = plot(mSum.ToArray())
;============ End Of Code ==============================================
end
|
|
|
|
Zachary Norman Basic Member
Posts:173  
16 Jun 2015 03:28 PM |
|
Hi Niklas,
Here is my code again real quick:
;correct for time passed sicne start of loop
temp = t.toarray() -time_start
;find the difference in time, not the total time
for i=n_elements(temp)-1, 1, -1 do temp[i] += -temp[i-1]
I understand what you are saying about how the difference in time looked like a quadratic relationship. For what it is worth,I thought about it like this. The original definition of temp is basically defined as the total time that has passed since the value of time_start is set. Although this did increase like a quadratic function, I wanted to look at the time between each element where the system time was recorded to see if this number actually increases. That is what my other for loop is for, to find the time passed during each loop iteration. In addition to that, I removed the elements where the difference in time was equal to 0, because the loop was so quick between some of the iterations that there was almost no difference between the system times, causing some elements to have the same system time. I can up with a small example that might illustrate this more. Why don't you run it and see if it makes more sense. I included an image of what the output should look like. The green plot is the original values of time recorded and the blue plot is the difference in time between the elements.
n=10
temp=make_array(n,/float)
tic
for i=0,n-1 do begin
temp[i] = toc()
wait, .1
endfor
print, temp
;uncorrected data
p = plot(temp,color='green', YRANGE=[0,1.5])
;corrected data
for i=n_elements(temp)-1, 1, -1 do temp[i] += -temp[i-1]
;removes times that are 0
temp = temp[where(temp gt 0)]
p2 = plot(temp, color='blue', /current, /overplot)
|
|
|
|
Deleted User New Member
Posts:15  
17 Jun 2015 03:31 AM |
|
Dear Zach,
thanks for the explanation. I have reread your and my code, isolated the interesting part and here is a really simple example to illustrate why I am still confused:
;======= start of code
pro funny
n=10000
L=LIST(1,length=n)
print,"individual loops"
tic
for i=0,n-1 do z=L[i]
for i=0,n-1 do z=L[i]
for i=0,n-1 do z=L[i]
for i=0,n-1 do z=L[i]
toc
print,"one loop"
tic
for i=0,n-1 do begin
z=L[i]
z=L[i]
z=L[i]
z=L[i]
endfor
toc
END
; === End of Code
When I execute the code above IDL prints:
individual loops
% Time elapsed: 0.15399981 seconds.
one loop
% Time elapsed: 6.2080002 seconds.
So the core question is (as it seems): Why is it faster to have just one operation within one loop and repeat that loop 4 times than to integrate those 4 into one loop.
Good day
Niklas
|
|
|
|
Zachary Norman Basic Member
Posts:173  
17 Jun 2015 08:36 AM |
|
Niklas,
Thanks for the additional example, I understand exactly what you are saying now. I was doing some research into this and here is a paragraph from the documentation for LIST:
IDL lists can contain elements of any data type. The elements are stored as a singly-linked list of pointers to the data. Adding or removing elements from the beginning or end of the list is fast, because the list contains a special pointer to the head and tail. Adding or removing elements from the middle of the list will be slower because the linked list must be traversed. However, even in this case, it may still be faster than using an array because no memory needs to be copied. Indexing into the middle of a list will be slower than an array.
Specifically, the last line offers an answer to the difference in time that you may be seeing, but I'm not sure that it accounts for the huge increase in time for your second loop where there are 4 consecutive calls to the same list element. It may well be a bug and something that I will need to look into. I added a little bit to your example to show how much better arrays are than lists as far as computation time goes. Lists are very handy, however, when you don't know how many elements you will have. Another interesting this to mention is that the next release of IDL, which will be IDL 8.5, should have the functionality to go from arrays to lists similar to list.toarray(). This way you can convert your lists to arrays, perform your calculations, and then go back to lists. I'll let you know if it is a bug or not after I discuss it with some colleagues.
;======= start of code
pro funny
n=10000
L1=LIST(1,length=n)
print, ''
print,"LISTS: individual loops"
tic
for i=0,n-1 do z1=L1[i]
for i=0,n-1 do z2=L1[i]
for i=0,n-1 do z3=L1[i]
for i=0,n-1 do z4=L1[i]
toc
print,"LISTS: one loop"
tic
for i=0,n-1 do begin
z1=L1[i]
z2=L1[i]
z3=L1[i]
z4=L1[i]
endfor
toc
print, ''
L1 = L1.toarray()
print,"ARRAYS: individual loops"
tic
for i=0,n-1 do z1=L1[i]
for i=0,n-1 do z2=L1[i]
for i=0,n-1 do z3=L1[i]
for i=0,n-1 do z4=L1[i]
toc
print,"ARRAYS: one loop"
tic
for i=0,n-1 do begin
z1=L1[i]
z2=L1[i]
z3=L1[i]
z4=L1[i]
endfor
toc
print, ''
END
; === End of Code
|
|
|
|
Deleted User New Member
Posts:15  
19 Jun 2015 12:42 AM |
|
Dear Zach,
thanks a lot for the example and your quick responses. If you find out I will be happy to hear if this might be a bug and can be changed in future versions. But anyway my original question has been solved since I can use the list.map(...) functionality to speed everything up.
I am really looking forward to IDL 8.5.
All the best
Niklas
|
|
|
|
Zachary Norman Basic Member
Posts:173  
19 Jun 2015 12:07 PM |
|
Hi Niklas,
I spoke to one of our developers and this is what he had to say about the difference in speed for the two loops, which is a pretty good explanation for why it happens:
The List keeps track of the last element thatwas accessed, so it can proceed to the next element quickly. It's a"singly-linked" list, which means it's very expensive to go backwards(you have to start from the beginning of the list). So in your second loop, youare forcing the list to start over from the beginning for z2, z3, and z4
|
|
|
|
Deleted User New Member
Posts:15  
22 Jun 2015 03:51 AM |
|
Hi Zach,
that is exactly the answer I was looking for! Thanks a lot for looking into that.
So to refer to my original post(s): The exponential growth of time to process each iteration within the loop is due to the facts that:
1. Each part (e. g. Target_IndexL[i,*,0] and Target_IndexL[i,*,1]) has to go through the list at hand from 0 to i.
2. i growths within the loop
So the formula for the speed would be:
AdditionalTimePerRun(i)=i*k*s
i: "number of loop" (refered to as "run" in posts above)
k: "number of times the same list is indexed within the same loop iteration"-1
s: "some system dependent constant time value to go to the next part in a singly-linked list of pointers"
And the conclusion:
Access of the same list within a loop is only recommendable if the number of iterations is kept small.
|
|
|
|