19 Jul 2016 06:03 PM |
|
Hello,
I am trying to figure out how to get a percent pixel count from an envi math_doit command. I currently have a conditional statement that results in an output raster where any pixel that meets that requirement has an assigned value of 1, and any that does not, has an assigned value of 0. What I now need is a way to calculate what percent of the pixels are 1 and what are 0. I have tried to use envi stats but that just says my min and max and obviously I already know those are just 0 and 1. Is there a way to do this and have it export the percents to an excel or text file (excel preferably)? Note that this would have to be performed on a batch of files and thus the file output names would have to be individualized much like I have done with the raster outputs.
Any input is greatly appreciated!
The current code is below and I would like to thank Peg and Mari for all their helpful input on previous questions. Some of their very valuable suggestions were used in this effort!
PRO ccpercent2
  compile_opt IDL2
  envi = ENVI(/current)
  ; Search for files.
  batchfiles = file_search('C:\temp\test\Stacked', '*.dat')
  print, batchfiles
  ; Process each file.
  foreach file, batchfiles do begin
    rasterloop = envi.OpenRaster(file)
    rasterloop_fid = ENVIRasterToFID(rasterloop)
    ; Query the raster for the values needed by the cover crop percent calculation.
    envi_file_query, rasterloop_fid, DIMS=rasterloop_dims, NB=rasterloop_nb, $
      BNAMES=rasterloop_bnames, FNAME=rasterloop_fname
    ; Total number of files in the batch (this is not a per-file counter).
    inc = strtrim(string(n_elements(batchfiles)), 2)
    print, 'Input file number is ', inc
    ; Set output location.
    outfolder = 'C:\temp\test\output\'
    cd, outfolder
    ; Name output files. The suffix must match the input extension (.dat) to be stripped.
    newname = file_basename(file, '.dat')
    out_name = newname + '_CCpercent.dat'
    print, 'out_name is ', out_name
    out_bname = 'CCpercent (' + newname + ')'
    print, 'out_bname is ', out_bname
    ;rep_name = newname + 'CC_percent.txt'
    ; Perform the band math calculation for cover crop percent.
    ; The EXP string was left empty in the original post.
    envi_doit, 'math_doit', $
      FID=[rasterloop_fid, rasterloop_fid], $
      DIMS=rasterloop_dims, $
      POS=[0,1], $
      EXP='', $
      OUT_BNAME=out_bname, R_FID=r_fid, OUT_NAME=out_name
    ; envi_doit, 'envi_stats_doit', dims=rasterloop_dims, fid=r_fid, pos=[0], $
    ;   comp_flag=1, dmin=dmin, dmax=dmax, mean=mean, stdv=stdv, $
    ;   rep_name=rep_name, report_flag=1
  endforeach
end
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 07:02 AM |
|
I believe that if you output the histogram (COMP_FLAG=2), you will get the count of points for each value, which could be used to calculate the percentage.
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 09:09 AM |
|
Awesome, that works, and I am able to get a report .txt file with the percents which is great.
The trick is that I will be processing over a thousand images and will need this percent for each one. Having to go through over 1000 .txt files to pull it out is really not feasible. Do you know if there is a way to get one report (or file) that would include this statistics summary for each image? For instance, a .txt or .csv file that had a column for the image name and then additional columns for the stats?
In an earlier incarnation of this code when I was just looking at a single band image I was actually using a conditional statement followed by some lines that told the program to count where the data equaled 1 or 0 and then calculate the percent. Using a print statement I then output those results to the IDL console. This worked well in that it would list each image name and then the percent for 1 and for 0 so that I had all the information for each image together. Unfortunately when I had to change my methodology to include two bands and the band_math function I could no longer get the count to work and had to shift to using the statistics. If I was able to somehow get the count statement to work again would I be able to print the results to a .txt file instead of the console? This might solve the question above.
Sorry to be so long winded.
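For the counting approach described above, a minimal sketch follows. It assumes the two bands are already in memory as IDL arrays; the arrays b1 and b2, their values, and the thresholds are hypothetical stand-ins, since the actual conditional expression is not shown in this thread.

```idl
; Hypothetical two-band data; real values would come from the raster.
b1 = [[10, 200], [30, 180]]
b2 = [[40,  20], [90,  10]]

; Comparison operators return 1 where true and 0 where false,
; so the combined condition is already a 0/1 mask.
mask = (b1 gt 100) and (b2 lt 50)

; Fraction of pixels equal to 1, and the complement for 0.
pct_ones  = total(mask) / n_elements(mask)
pct_zeros = 1.0 - pct_ones
print, 'Percent 1: ', pct_ones, '  Percent 0: ', pct_zeros
```

TOTAL returns a floating-point sum, so the division is not truncated to an integer.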
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 10:20 AM |
|
Assuming you are only getting the histogram of the mask (0's and 1's are the only values in your image), then you could take this information and manually calculate the percentage:
crop_percent=hist[1]/total(hist)
Then use PRINTF to print to a new file all of the results.
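As a rough sketch of that suggestion, assuming hist holds only the counts of 0s and 1s: the counts, the image name, and the output file name below are made up for illustration and are not taken from the thread.

```idl
; Hypothetical histogram for a 0/1 mask: [count of 0s, count of 1s].
hist = [750L, 250L]

; TOTAL returns a float, so this is not integer division.
crop_percent = hist[1] / total(hist)

; Open a text file, write the image name and its percent on one line, then close it.
openw, lun, 'crop_percent_report.txt', /get_lun
printf, lun, 'example_image', crop_percent, format='(A, ",", F8.4)'
free_lun, lun
```

In a batch run, the OPENW call would go before the loop and FREE_LUN after it, with one PRINTF per file in between, so that every image ends up in the same report.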
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 11:35 AM |
|
Thank you, this statement does calculate the percent value that I need but I am wondering how to write it out so that each percent is associated with the file it was calculated from. I have been trying to use the write_csv function but it is not working for me. If I use a printf statement I am only successful in getting it to write out the percent but no corresponding image name.
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 12:28 PM |
|
You can get the file name from ENVI_FILE_QUERY (using the SNAME or FNAME keyword) or from your original file_search result. Then, if you want to use WRITE_CSV:
fname = 'c:\temp\crop_stats.txt'
write_csv, fname, sname, crop_pct
where 'crop_pct' is the percentage you calculated from the HIST array.
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 12:53 PM |
|
I've got the write_csv command working in parts as of now. The headers are giving me a problem, but more importantly, it is only writing one file, not all the files. Do I somehow have to tie it to the rasterloop?
Data1 = newname
Data2 = cover_crop_percent
write_csv, 'CCpercent_output_test.csv', Data1, Data2 ; Header='Image Name', 'Cover Crop Percent'
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 01:00 PM |
|
I was thinking you wanted a single text file with all of the file names and stats. In this case, you need to 'collect' the output into a variable and then write it to the text file. Here is an example of just collecting the min/max values of a set of files and writing the file name, min, max to a csv:
--------------
pro compile_stats_test
  compile_opt idl2
  ; Get the running ENVI session (needed for envi.OpenRaster below).
  envi = ENVI(/current)
  ; Search for files.
  batchfiles = file_search('C:\test', '*.tif')
  ; Create the text file name and the arrays that hold the stats (one element per file).
  report_name = 'c:\temp\crop_stats.txt'
  all_min = fltarr(n_elements(batchfiles))
  all_max = all_min
  ; Process each file. The index variable tracks the position in batchfiles.
  foreach file, batchfiles, index do begin
    rasterloop = envi.OpenRaster(file)
    rasterloop_fid = ENVIRasterToFID(rasterloop)
    envi_file_query, rasterloop_fid, DIMS=rasterloop_dims, NB=rasterloop_nb
    envi_doit, 'envi_stats_doit', dims=rasterloop_dims, fid=rasterloop_fid, pos=[0], $
      comp_flag=2, dmin=dmin, dmax=dmax, mean=mean, stdv=stdv, hist=hist, $
      report_flag=2
    ; Store this file's results at its own index.
    all_min[index, *] = dmin
    all_max[index, *] = dmax
  endforeach
  ; Write everything once, after the loop has finished.
  write_csv, report_name, batchfiles, all_min, all_max
end
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 01:26 PM |
|
You were right with your original thought. I do want just one csv file with all my image names and the corresponding percent covers listed. See an abbreviated example below:
Image name            Crop percent
2012-10-12_39390A     0.2568
2012-10-12_40392D     0.3895
2012-10-12_89893B     0.9856
2012-10-12_65490A     0.5648
As I have it written now it works, but it only outputs one image name and one percent cover even though I am running a test set of 12 images. It looks like this (with no headers because I have not got that working yet):
2012-10-12_39390A 0.2568
I apologize for the confusion
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 01:41 PM |
|
Did you add in the arrays that collect the stats as in my example? Can you post your changes?
The resulting text file from my example looks like:
File Name,Band Min,Band Max
"C:\test\0037.tif",29.1751,52.9314
"C:\test\0038.tif",29.5698,52.2835
"C:\test\0039.tif",29.3997,54.3388
I added in the headers using strings:
write_csv, report_name, batchfiles, all_min, all_max, header = ['File Name', 'Band Min', 'Band Max']
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 01:58 PM |
|
Here is the code I have now. I have not yet added in the array you suggested because I am not sure where it would go and I am slightly unclear as to what it is doing. Do I have to establish a .txt file for it to work? I already have a stats_doit, which is required for the histogram, but I do not actually need the min and max; I need the output from the crop_percent line.
PRO ccpercent2
compile_opt IDL2
envi=ENVI(/current)
; Search for files.
batchfiles=file_search('C, '*.dat')
print, batchfiles
; Process each file.
foreach file, batchfiles do begin
rasterloop = envi.OpenRaster(file)
rasterloop_fid = ENVIRasterToFID(rasterLoop)
; Set the keywords and perform the Cover crop percent calculation.
envi_file_query, rasterloop_fid, DIMS=rasterloop_dims, NB=rasterloop_nb, BNAMES=rasterloop_bnames, FNAME=rasterloop_fname
; Increments the output file names to be unique. DO NOT ALTER
inc = strtrim(string(n_elements(batchfiles)), 2)
print, 'Input file number is ', inc
; Set output location. As above, you will cut and paste the location where you want your output files saved. Be sure to leave the final "\" after the folder name.
outfolder = 'C:\\output\'
cd, outfolder
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 02:14 PM |
|
To clarify, when I run the code posted above it works and outputs the .csv file in the exact format I want, but it only writes out the results from one file, not all the files I processed. If I print, it shows all the cover_crop_percent outputs, so I know it is working for all files. For some reason I am unable to attach a file or I would include an example of the output.
|
|
|
|
MariM Veteran Member
Posts:2396  
20 Jul 2016 02:18 PM |
|
The arrays I set up before the loop are for storing the stats values as you loop through each file. For example:
;create arrays to hold the stats I want to store and eventually write to csv
;the number of elements in the arrays should match the number of files
all_min = fltarr(n_elements(batchfiles))
all_max = all_min
hist_total = all_min
;using this to gather the sname (short name) from the ENVI_FILE_QUERY
all_sname = strarr(n_elements(batchfiles))
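Then start the loop, keeping the index variable in the FOREACH statement (foreach file, batchfiles, index do begin) so that each file writes to its own array element. After you calculate the stats, you are going to use these arrays to 'collect' the stats for each file that comes through the loop: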
;collect the results in the arrays
all_min[index, *] = dmin
all_max[index, *] = dmax
hist_total[index,*] = total(hist)
all_sname[index,*] = sname
Then, after you have all the stats, *outside* of the loop, write the csv. If you write the csv in the loop, it will just overwrite itself.
endforeach
write_csv, report_name, all_sname, all_min, all_max, hist_total, header = ['File Name', 'Band Min', 'Band Max', 'Hist Total']
It doesn't matter what stats you want - use your calculation instead of 'max' or 'min' or whatever. Also, before you write the csv, do a print statement on the arrays to make sure they are collecting the right stuff.
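Putting those pieces together, here is a minimal sketch of the pattern without the ENVI calls. The file names, the histogram counts, the procedure name, and the output file name are all hypothetical; in the real code, hist would come from envi_stats_doit inside the loop.

```idl
pro collect_percent_sketch
  compile_opt idl2
  ; Hypothetical stand-ins: three file names and one 0/1 histogram per file.
  files = ['a.dat', 'b.dat', 'c.dat']
  hists = [[80L, 20L], [50L, 50L], [10L, 90L]]   ; [count of 0s, count of 1s]

  ; Allocate the result arrays once, before the loop.
  all_name = strarr(n_elements(files))
  all_pct  = fltarr(n_elements(files))

  foreach file, files, index do begin
    hist = hists[*, index]
    ; Store this file's results at its own index.
    all_name[index] = file_basename(file, '.dat')
    all_pct[index]  = hist[1] / total(hist)
  endforeach

  ; Write once, after the loop, so earlier rows are not overwritten.
  write_csv, 'CCpercent_stats_sketch.csv', all_name, all_pct, $
    header=['Image Name', 'Crop Percent']
end
```

If the arrays were created inside the loop instead, they would be reset on every pass and only the last file's values would survive to the write_csv call.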
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 02:58 PM |
|
EDITED
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 03:17 PM |
|
EDITED
|
|
|
|
Deleted User New Member
Posts:  
20 Jul 2016 04:31 PM |
|
UPDATE: The code is running now and outputting a .csv, but there is still a problem with the information written to it.
I am testing 3 images. It writes the correct information for the third image, but gives nothing for the name and 0 for the hist total for the first two images. I was unable to add a screenshot, but here is what the .csv looks like:
Image Name Hist Total
0
0
2012-10-12_40392D 0.257271
If I print the array it looks correct (see below):
Input file number is 3
out_name is 2012-10-12_39390A - Copy_CCpercent
out_bname is CCpercent (2012-10-12_39390A - Copy)
0.000000 0.000000 0.000000
1.00000 0.000000 0.000000
0.225659 0.000000 0.000000
2012-10-12_39390A - Copy.dat
Input file number is 3
out_name is 2012-10-12_39390A_CCpercent
out_bname is CCpercent (2012-10-12_39390A)
0.000000 0.000000 0.000000
0.000000 1.00000 0.000000
0.000000 0.225659 0.000000
2012-10-12_39390A.dat
Input file number is 3
out_name is 2012-10-12_40392D_CCpercent
out_bname is CCpercent (2012-10-12_40392D)
0.000000 0.000000 0.000000
0.000000 0.000000 1.00000
0.000000 0.000000 0.257271
2012-10-12_40392D.dat
% Compiled module: WRITE_CSV.
|
|
|
|
MariM Veteran Member
Posts:2396  
21 Jul 2016 05:34 AM |
|
The report and arrays need to be initialized before you start the FOREACH loop. Move this section to the top of your code after you use FILE_SEARCH:
;create arrays to hold the stats I want to store and eventually write to csv
;the number of elements in the arrays should match the number of files
report_name = 'C:\Users\ahudson-dunn\Desktop\CLASSIFICATION\Input Files\Current Test\IDL_test\Stacked\output\CCpercent_stats.csv'
all_min = fltarr(n_elements(batchfiles))
all_max = all_min
hist_total = all_min
stats_name = strarr(n_elements(batchfiles))
You can remove the min/max arrays if you don't need that information.
|
|
|
|
Deleted User New Member
Posts:  
21 Jul 2016 06:43 AM |
|
Oh DUH, of course.....
It works perfect now! Thank you so much for your help!
|
|
|
|
Deleted User New Member
Posts:  
22 Jul 2016 09:12 AM |
|
Hi Mari,
Now that my code is performing perfectly I have another concern/question I was hoping you might be able to help me with. I know I already marked this issue as "resolved" and apologize if this is not the correct etiquette for asking a new question. Please feel free to correct me if I am in the wrong. If it is ok, my question is below:
The input for this .pro is a two band raster. This two band raster was generated by using ENVI 'save as' to stack two non-georeferenced files. As I believe I previously mentioned, I am working with a very large dataset and will need to create hundreds and hundreds of these stacked images to input in the program above. I have been trying to write another .pro calling on the BuildBandStack task but I am beginning to think that getting this to work in batch mode involves IDL programming capabilities WELL beyond any that I have (which would not be hard). I would still prefer figuring out a way to perform the above task in batch mode, but if that is not really an option I was wondering if in lieu of having a two band stacked raster input, this crop_percent code could be modified to pull each band in from a different location. Of course one immediate issue with that technique is how to make sure that the correct input file1 associates with the correct input file2 seeing that each directory houses hundreds of separate inputs. Maybe I am thinking about this all wrong, but I know my conditional statement requires the input of both of the rasters and the code needs to perform in batch mode so I am not sure what other options there are.
Anyways, I thought I would throw this out there as you really seem to know your stuff and are so gracious in providing us novice IDL'ers with help. I understand if it is just too complicated to get in to though.
Either way, thank you for all your help with getting this crop percent code running!
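On the pairing question above, one possible approach is to match files by base name across the two folders. The sketch below rests on assumptions that are not confirmed in this thread: that each folder holds one single-band file per image, that matching files share the same name, and that both rasters have the same dimensions. The folder paths and the procedure name are hypothetical.

```idl
pro pair_files_sketch
  compile_opt idl2
  ; Hypothetical folders, one per band.
  dir1 = 'C:\temp\test\band1'
  dir2 = 'C:\temp\test\band2'

  ; List the first folder; stop early if nothing was found.
  files1 = file_search(dir1, '*.dat', count=n1)
  if n1 eq 0 then return

  foreach file1, files1 do begin
    ; Build the expected partner path from the same base name.
    file2 = filepath(file_basename(file1), root_dir=dir2)
    if ~file_test(file2) then begin
      print, 'No match for ', file1
      continue
    endif
    print, 'Pair: ', file1, '  ', file2
    ; Each file could then be opened and converted to a FID as in the code
    ; earlier in this thread, and the two FIDs passed to math_doit as
    ; FID=[fid1, fid2] with POS=[0, 0] (assumed, not tested here).
  endforeach
end
```

Driving the loop from one folder and looking up the partner by name avoids relying on both file_search results coming back in the same order.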
|
|
|
|