How does ENVI's Pixel Purity Index Work?

ENVI includes a tool called the Pixel Purity Index (PPI), which can be used to locate the purest pixels in a multispectral image. This tool was designed by one of the ENVI developers and is unique to ENVI. The PPI image shows you the locations of the purest pixels in your image. By definition, a pure pixel is one that contains only one spectrally unique material. In contrast, most pixels contain mixtures of materials. The PPI calculation identifies only the pixels that are the least mixed. If you are particularly interested in certain areas of your image, the PPI analysis will not necessarily find pixels from those areas unless they are spectrally pure relative to the rest of the pixels in the image.

The most common reason for wanting to find the spectrally purest pixels in an image is that the spectra for these pixels are the best candidates for endmember spectra, which are needed for ENVI’s subpixel analyses, including Linear Spectral Unmixing and Matched Filtering. Endmembers are spectrally pure, unique materials that occur in the scene. Unmixing and Matched Filtering analyses can give information about the amount of each endmember that occurs in every pixel in the image.

The PPI calculation selects a random vector through the n-dimensional data cloud (each random vector is constrained to pass through the center - the mean value - of the data cloud) and then projects each pixel in the image onto the random vector. A histogram is calculated that shows how the pixels are distributed along the random vector. Pixels are considered pure if they fall into the tails of the histogram distribution. The histogram tails are defined by the PPI threshold value entered by the user. Throughout the PPI process, ENVI keeps track of which pixels are identified as pure for each random vector. Each time the same pixel is identified as pure using a new random vector, its value in the output image is incremented by one.
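A single pass of the procedure described above might be sketched in Python with NumPy as follows. This is an illustrative sketch, not ENVI's implementation: the function name `ppi_iteration`, the toy data, and in particular the tail rule (a pixel is flagged when its projection lies within `threshold` data units of either extreme projected value) are assumptions made for this example.

```python
import numpy as np

def ppi_iteration(pixels, counts, threshold, rng):
    """Run one PPI-style pass and increment `counts` in place.

    pixels: (n_pixels, n_bands) float array, one spectrum per row.
    counts: (n_pixels,) integer array holding the running PPI values.
    Returns the boolean mask of pixels flagged in this pass.
    """
    # Centering makes the random vector pass through the mean of the data cloud.
    centered = pixels - pixels.mean(axis=0)
    direction = rng.standard_normal(pixels.shape[1])
    direction /= np.linalg.norm(direction)        # random unit vector
    proj = centered @ direction                   # position of each pixel along the vector
    # ASSUMED tail rule: within `threshold` data units of either extreme.
    in_tail = (proj >= proj.max() - threshold) | (proj <= proj.min() + threshold)
    counts[in_tail] += 1
    return in_tail

# Hypothetical toy cloud: three "pure" corner spectra plus 500 interior mixtures.
rng = np.random.default_rng(0)
corners = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
weights = 0.8 * rng.dirichlet(np.ones(3), size=500) + 0.2 / 3   # rows sum to 1
pixels = np.vstack([corners, weights @ corners])
counts = np.zeros(len(pixels), dtype=np.int64)

# threshold=0.0 suits this noise-free toy; real data would use a value
# slightly above the noise level.
for _ in range(100):
    ppi_iteration(pixels, counts, threshold=0.0, rng=rng)
```

In this noise-free toy cloud only the three corner pixels (rows 0 to 2) can accumulate counts, because every mixture lies strictly inside the triangle they span.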
Then a new random vector is chosen, and the process is repeated. The result of the PPI routine is an image where the value of each pixel corresponds to the number of times it was identified as a pure pixel during all of the PPI iterations.

It is sometimes helpful to visualize the n-dimensional data cloud as an irregularly shaped volume. Some pixels will be inside the volume, while others will be on its surface. A small number of pixels will form corners of the volume. The pixels that define the corners of the volume are those that will be most frequently projected onto the tails of the random vectors.

After the first iteration, there will only be a few pixels in the output image which have values of one, and all the rest of the pixels will have values of zero. After 100 iterations, a few pixels may have values between one and 100, but most pixels will still have values of zero. After thousands of iterations, the pixels which end up with the highest PPI values are those which are closest to being corners of the data cloud, and represent the spectrally purest pixels in the image. (Depending on the threshold selected and the number of iterations, some high-valued PPI pixels may be pixels which define an edge of the n-dimensional data simplex - still relatively pure when compared to the rest of the image pixels.)

The threshold value (which defines the histogram tails for each random vector) should be specified using the same units as the pixel values of the input image. If the image on which you are running the PPI analysis is the result of an MNF transform (this is a good way to proceed - see the tech tip on MNF), then the image units are noise standard deviation units. The smaller the threshold value you choose, the fewer pixels will be identified as being pure (i.e., pixels will have to be more pure in order to project onto the tail). The ENVI developers recommend that the threshold value be set to a number slightly higher than the noise level of the data.
Therefore, a good threshold value to use when calculating the PPI for an MNF image is two or three.

As the number of dimensions increases, more iterations are necessary to identify all of the pure pixels. Unfortunately, there is no way to know in advance how many iterations will be required to identify all of the pure pixels for a given threshold level. The number of necessary iterations is typically on the order of thousands to tens of thousands. As the PPI is being calculated, ENVI displays and continuously updates a plot of the number of pixels that have been found to be pure in at least one iteration. When the cumulative number of pixels identified in this plot begins to level off, each subsequent iteration is failing to find new pixels (i.e., the same pure pixels are being found over and over again). At this point it is relatively safe to assume that all of the pure pixels have been found.

Because the speed of the PPI process is dependent on the image dimensions, it is often a good idea to apply an MNF transformation to the input image before running the PPI. This transformation segregates the noise into the high-numbered MNF bands, and the information into the low-numbered MNF bands. Thus, when you calculate PPI for an MNF-transformed dataset, you can select a spectral subset of the MNF image which excludes those bands which contain mostly noise. See the tech tip about the MNF transformation for more details. Even for MNF-transformed, spectrally subset hyperspectral data (derived from 224-band AVIRIS data), it is not uncommon to run as many as 30,000 PPI iterations before the cumulative number of pure pixels identified begins to level off.

It is difficult to judge how fast the PPI should run on various machines. The most important issue is whether you can run the [FAST] version of the algorithm, which requires that the input scene fit into the machine's RAM.
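Whether a scene is likely to fit in RAM can be estimated from its dimensions and data type, taking a 400 x 350 x 15 floating point image as the example. The sketch below is a back-of-the-envelope calculation only: the helper `scene_bytes` is hypothetical, 4 bytes per value assumes 32-bit floating point data, and it counts only the raw image array, not any working memory the PPI routine itself may need.

```python
import numpy as np

def scene_bytes(n_samples, n_lines, n_bands, dtype=np.float32):
    """Bytes needed to hold the raw image cube in memory (array data only)."""
    return n_samples * n_lines * n_bands * np.dtype(dtype).itemsize

# 400 x 350 pixels, 15 bands, 32-bit floats (assumed data type).
size = scene_bytes(400, 350, 15)
print(f"{size} bytes = {size / 1e6:.1f} MB")   # 8400000 bytes = 8.4 MB
```

Comparing this figure with the memory actually available gives a rough first check before starting a long run.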
On a Windows NT PC with a 266 MHz Pentium II processor, 30,000 PPI iterations on a 400 x 350 x 15 floating point image will run overnight (roughly 10-12 hours) using the FAST version of PPI.

A typical PPI image has a large number of pixels with values that are zero or near zero, and the number of pixels decreases quickly at higher pixel values. Therefore, if you display the PPI image and then bring up the Interactive Histogram tool from the display Functions menu (by choosing [Display Enhancements -> Interactive Stretching]), you will see only a large spike at zero. This is because most of the pixels in the image have the same value (zero), and the size of the histogram spike overwhelms the signal from other pixels. You can zoom in on the histogram by clicking the middle mouse button in the input histogram plot window. After you click the middle mouse button 5 or 6 times, you should start seeing the details in the PPI histogram.
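Outside the interactive tool, the same zero-dominated distribution can be inspected numerically. The sketch below assumes the PPI result has already been loaded into a NumPy integer array (how it is exported from ENVI is not covered here); the helper `ppi_histogram` and the toy image are hypothetical.

```python
import numpy as np

def ppi_histogram(ppi_image):
    """Return (hist, zero_fraction) for a non-negative integer PPI image.

    hist[v] is the number of pixels whose PPI value equals v.
    """
    values = np.asarray(ppi_image).ravel()
    hist = np.bincount(values)
    zero_fraction = hist[0] / values.size
    return hist, zero_fraction

# Hypothetical toy PPI image: almost all zeros, a handful of flagged pixels.
ppi = np.zeros((350, 400), dtype=np.int64)
ppi[10, 20] = 950
ppi[200, 5] = 400
ppi[300, 399] = 3

hist, zero_fraction = ppi_histogram(ppi)
# PPI values that actually occur once the zero spike is set aside.
nonzero_values = np.flatnonzero(hist[1:]) + 1
print(hist[0], nonzero_values)
```

Dropping the zero bin before plotting, or plotting the counts on a logarithmic axis, plays the same role as zooming in on the interactive histogram.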