INTERNAL: (Review) Information regarding Receiver Operating Characteristic (ROC) curves from Jim Young
Receiver Operating Characteristic (ROC) curves are used to visualize a classification's performance and to select a suitable operating point. ROC curves plot the probability of detection versus the probability of false alarm for a series of classifications. The selected operating point then depends on the desired probability of detection (PD) and probability of false alarm (PFA). In addition, a ROC curve may show that a selected classification method cannot achieve the desired PD and PFA.
The following steps illustrate the process for calculating a ROC curve in ENVI:
1. Compute a threshold and classify the rule image based on this threshold.
2. Compare the classified image to the ground truth and count the number of pixels classified properly and the number of pixels classified improperly.
3. Compute one point of the ROC curve, where:
   - PD = number of points classified properly / number of points in the class
   - PFA = number of points classified improperly / number of points not in the class
4. Repeat steps 1-3 for a number of different threshold values. Each repetition yields one point on the ROC curve (see the sketch after this list).
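As a rough illustration of these steps, here is a minimal sketch in Python (not ENVI/IDL code); the rule image, ground-truth mask, threshold values, and function name are all hypothetical inputs assumed for the example.

```python
import numpy as np

def roc_points(rule_image, truth_mask, thresholds):
    """Compute (PFA, PD) pairs for a rule image against a ground-truth mask.

    rule_image : 2-D array of rule scores, higher = more likely in the class
    truth_mask : 2-D boolean array, True where the pixel truly belongs to the class
    thresholds : iterable of threshold values to sweep
    """
    truth_mask = truth_mask.astype(bool)
    n_in_class = truth_mask.sum()          # pixels truly in the class
    n_not_in_class = (~truth_mask).sum()   # pixels truly not in the class

    points = []
    for t in thresholds:
        classified = rule_image >= t                               # step 1: threshold the rule image
        proper = np.logical_and(classified, truth_mask).sum()      # step 2: correctly classified pixels
        improper = np.logical_and(classified, ~truth_mask).sum()   # step 2: incorrectly classified pixels
        pd = proper / n_in_class                                   # step 3: probability of detection
        pfa = improper / n_not_in_class                            # step 3: probability of false alarm
        points.append((pfa, pd))                                   # step 4: one ROC point per threshold
    return points
```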
To fully understand the performance of a classification technique, a ROC curve should be derived over a threshold range that runs from the point where the probability of detection and probability of false alarm are both zero to the point where they are both one. The optional probability of detection versus threshold plot is useful for determining a suitable operating (threshold) value for the classification. Increasing the number of points used to generate the ROC curve can help detail areas of interest, and the threshold range can also be narrowed to detail areas of interest.
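A PD-versus-threshold plot can be produced from the same counts. The sketch below is an assumption-laden example, not ENVI output: it uses synthetic data, assumes matplotlib is available, and simply sweeps thresholds over the full range of the rule values.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical inputs: a synthetic rule image and ground-truth mask for illustration only.
rng = np.random.default_rng(0)
truth_mask = np.zeros((100, 100), dtype=bool)
truth_mask[40:60, 40:60] = True
rule_image = rng.normal(0.0, 1.0, (100, 100)) + truth_mask * 1.5

# Sweep thresholds across the full range of rule values so PD runs from 1 down to 0.
thresholds = np.linspace(rule_image.min(), rule_image.max(), 50)
pds = [np.logical_and(rule_image >= t, truth_mask).sum() / truth_mask.sum()
       for t in thresholds]

plt.plot(thresholds, pds)
plt.xlabel("Threshold")
plt.ylabel("Probability of detection (PD)")
plt.title("PD versus threshold")
plt.show()
```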
Additional Questions and Answers regarding ROC:
- How can the PD vs. threshold curve reach 1.00, since classification is never 100% accurate? For example, with a threshold of 0, the PD is 0.71.
- The accuracy can reach 100% (i.e., PD = 1), but it depends on the data and the ground truth. If you cannot reach PD = 1, then most likely there is something wrong with the ground truth, or there are pixels that simply cannot be distinguished with the methods used.
- Where are the unclassified pixels taken into account? The crux of the confusion seems to be that when you change the threshold, more pixels become unclassified instead of being classified into the alternative category, as in Bradley's paper and the usual application of ROC curves (where there are only two categories).
- The unclassified pixels are just unclassified. There is no requirement for a pixel to jump into another class as you change the threshold.
Often people get confused with ROC curves because they don't really have good ground truth. For example, a good application of a ROC curve would be the case where you have a military vehicle in an image scene. Because of the constraints of image acquisition, you know which pixels and partial pixels contain the target, so you have an accurate ROI with the target pixels. Running different classification methods and using the ground-truth ROI, you determine each classification method's performance. The military then looks at the problem from a statistical point of view and says, "We will blow up this type of vehicle if we can detect it with a PD of 0.95 and a PFA of 0.00001." Looking at the ROC curve, they can achieve this performance using method X and a threshold of
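As a sketch of how such an operating point might be read off programmatically, the snippet below builds on the hypothetical roc_points helper from the earlier example; the requirement values are taken from the scenario above, and the function name is an assumption for illustration.

```python
def pick_operating_threshold(rule_image, truth_mask, thresholds,
                             required_pd=0.95, max_pfa=0.00001):
    """Return the first sampled threshold whose ROC point meets the PD/PFA
    requirement, or None if no sampled threshold satisfies it."""
    for t, (pfa, pd) in zip(thresholds, roc_points(rule_image, truth_mask, thresholds)):
        if pd >= required_pd and pfa <= max_pfa:
            return t
    return None
```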